References: Weights, biases, and the squish

Source material

Source curriculum (structural mirror, cited as further study):
• 3Blue1Brown, Neural Networks, Chapter 1: "But what is a Neural Network?"
  Creator: Grant Sanderson (text adaptation by Josh Pullen)
  Lesson page: https://www.3blue1brown.com/lessons/neural-networks
  Series index: https://www.3blue1brown.com/?topic=neural-networks
  License: copyright Grant Sanderson; videos published on his site and YouTube
This lesson mirrors the weights-and-biases portion of Chapter 1. Clawdemy's
lessons are original prose that follows the pedagogical arc of this series. We
do not reproduce or transcribe the videos; we cite them as the recommended
companion. All rights to the original videos remain with the creator.

Watch this next

But what is a Neural Network? (3Blue1Brown) by Grant Sanderson. The chapter this lesson mirrors. The later part walks through weights and biases visually, including the “weights as a pixel template” picture, and animates the weighted sum and the sigmoid squish. Watching the weighted sum light up across a real input layer is the fastest way to make this lesson’s arithmetic feel intuitive.

Going deeper

A short, durable list. Each link is a specific next step, not a generic pile.

TensorFlow Playground. The browser network from the last lesson, now with more to see: hover over a connection to read its weight, then change weights and watch how each neuron's response changes. A hands-on way to feel what weights do before any math.
Neural Networks and Deep Learning, Chapter 1 by Michael Nielsen. Works through the same weighted-sum-plus-bias-plus-activation computation in careful prose, and explains why the sigmoid’s smooth shape is convenient. The natural deeper read on exactly this lesson’s mechanism.

Adjacent topics

Where this leads inside this track.

The whole network as one function (lesson 4). This lesson gave the formula for a single neuron. Lesson 4 zooms back out: stack that formula across every neuron and every layer, and the entire network becomes one big function from 784 inputs to 10 outputs, with all ~13,000 parameters as its adjustable knobs.
What “learning” really means (lesson 5). We said the right parameter values are “found from examples” and stopped there. Lesson 5 opens that up: learning is the process of adjusting all those weights and biases to make the network’s guesses on the labeled examples less wrong.
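
For readers who want to check the "~13,000 parameters" figure before lesson 4, here is a minimal sketch in plain Python. It assumes the layer sizes used in the companion 3Blue1Brown chapter (784 inputs, two hidden layers of 16 neurons, 10 outputs); those sizes are not stated on this page. The helper names (count_parameters, random_parameters, forward) are hypothetical, and the weights are random placeholders, not a trained network.

```python
import math
import random

# Assumed layer sizes (taken from the companion chapter, not from this page):
# 784 inputs, two hidden layers of 16 neurons, 10 outputs.
LAYER_SIZES = [784, 16, 16, 10]


def sigmoid(z):
    """Squish any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def count_parameters(sizes):
    """One weight per connection plus one bias per non-input neuron."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def random_parameters(sizes, seed=0):
    """Untrained placeholder values: one (weights, biases) pair per layer."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-1, 1) for _ in range(n_out)]
        params.append((weights, biases))
    return params


def forward(params, activations):
    """Apply the single-neuron formula to every neuron, layer by layer."""
    for weights, biases in params:
        activations = [
            sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


params = random_parameters(LAYER_SIZES)
pixels = [0.0] * 784  # a blank 28x28 image, flattened to 784 inputs
outputs = forward(params, pixels)

print(count_parameters(LAYER_SIZES))  # 13002
print(len(outputs))                   # 10
```

The count breaks down as 784×16 + 16 for the first hidden layer, 16×16 + 16 for the second, and 16×10 + 10 for the output layer, which is where the "~13,000" comes from under these assumed sizes. The forward function is the "one big function" view in miniature: the same weighted sum, bias, and squish repeated for each neuron in each layer.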