References: What learning really means
Source material
Source curriculum (structural mirror, cited as further study):

• 3Blue1Brown, Neural Networks, Chapter 2: "Gradient descent, how neural networks learn"
  Creator: Grant Sanderson (text adaptation by Josh Pullen)
  Lesson page: https://www.3blue1brown.com/lessons/gradient-descent
  Series index: https://www.3blue1brown.com/?topic=neural-networks
  License: copyright Grant Sanderson; videos published on his site and YouTube

This lesson mirrors the opening of Chapter 2, where the cost function is introduced and learning is framed as minimizing it. Clawdemy's lessons are original prose that follows the pedagogical arc of this series. We do not reproduce or transcribe the videos; we cite them as the recommended companion. All rights to the original videos remain with the creator.

Watch this next
- Gradient descent, how neural networks learn (3Blue1Brown) by Grant Sanderson. Chapter 2 of the series, and the source for this lesson and the next two. The opening minutes introduce the cost idea visually: you watch a confident-and-correct output score low and a confused output score high. Watch up to where the “cost landscape” picture appears, then come back for lesson 6, which is exactly that picture.
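To make "scores low" and "scores high" concrete, here is a minimal sketch of a squared-difference cost for a single training example. It assumes a ten-output digit classifier; the function name `squared_difference_cost` and both output vectors are invented for illustration and are not taken from the video.

```python
def squared_difference_cost(output, target):
    """Sum of squared differences between actual and desired outputs."""
    return sum((o - t) ** 2 for o, t in zip(output, target))


# Desired output for an image of the digit 3 (assumed encoding):
# 1.0 at index 3 and 0.0 everywhere else.
target = [0.0] * 10
target[3] = 1.0

# Hypothetical network outputs, made up for this illustration.
confident_correct = [0.02, 0.01, 0.03, 0.97, 0.02, 0.01, 0.00, 0.02, 0.01, 0.01]
confused = [0.4, 0.6, 0.3, 0.5, 0.7, 0.2, 0.6, 0.4, 0.5, 0.3]

cost_confident = squared_difference_cost(confident_correct, target)
cost_confused = squared_difference_cost(confused, target)

print("confident and correct:", cost_confident)
print("confused:", cost_confused)
```

The first cost comes out close to zero and the second is far larger, which is the contrast the opening minutes of the video show visually.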
Going deeper
A short, durable list. Each link is a specific next step, not a generic pile.

- Neural Networks and Deep Learning, Chapter 1 (the “cost function” section) by Michael Nielsen. Introduces the same squared-difference cost (Nielsen calls it the quadratic cost) and explains carefully why a smooth cost is what makes learning tractable. The natural deeper read on this exact idea.
- TensorFlow Playground. Run a network and watch the “Training loss” number in the corner fall as it learns. That falling number is a cost exactly like the one in this lesson. Seeing it drop in real time makes “learning is minimizing a number” concrete.
Adjacent topics
Where this leads inside this track.

- The cost landscape (lesson 6). This lesson said the cost is a function of about 13,000 parameters. Lesson 6 turns that into a picture: a landscape over the space of all possible parameter settings, where height is cost, and the goal is to find a low valley.
- Gradient descent (lesson 7). Once you can picture the landscape, you need a way to actually walk downhill in it without being able to see the whole thing at once. Lesson 7 is that method, the algorithm that gives this whole chapter its name.