References: Backpropagation and the chain rule
Source material
Source curriculum (structural mirror, cited as further study):
• 3Blue1Brown, Neural Networks, Chapter 5: "Backpropagation calculus"
  Creator: Grant Sanderson (text adaptation by Josh Pullen)
  Lesson page: https://www.3blue1brown.com/lessons/backpropagation-calculus
  Series index: https://www.3blue1brown.com/?topic=neural-networks
  License: copyright Grant Sanderson; videos published on his site and YouTube
This lesson mirrors the backpropagation-calculus chapter (the companion to the intuition chapter mirrored in the previous lesson). Note: live 3B1B Chapter 3 ("Analyzing our neural network", at /lessons/neural-network-analysis) sits between the gradient-descent chapter (Ch2, mirrored in T11 lesson 7) and the backpropagation chapter (Ch4, mirrored in lesson 8). T11 deliberately does not mirror Ch3 as a standalone lesson; its central insight, that a trained network's hidden layers do not cleanly detect "edges then loops" as the hopeful story suggests, is folded into lesson 2 as the "hold the edges-to-loops story loosely" caveat. Live Ch3-5 numbering verified 2026-05-25.
Clawdemy's lessons are original prose that follows the pedagogical arc of this series. We do not reproduce or transcribe the videos; we cite them as the recommended companion. All rights to the original videos remain with the creator.
Watch this next
- Backpropagation calculus (3Blue1Brown) by Grant Sanderson. The chapter this lesson mirrors. It works the chain rule through a small chain of neurons with the activation functions included, and shows the notation laid out cleanly. If the worked chain here clicked, the video is the next level of the same example with every factor, including the squish, in place.
Going deeper
A short, durable list. Each link is a specific next step, not a generic pile.
- The chain rule itself: Clawdemy Track 8 (Visual Math: Calculus) and 3Blue1Brown’s Essence of Calculus by Grant Sanderson. This lesson used the chain rule as a given. If “rates multiply along a chain” felt like a leap, this is where the chain rule gets built from the ground up, with the geometric intuition behind it.
- Neural Networks and Deep Learning, Chapter 2 (the backprop equations) by Michael Nielsen. Derives the four backpropagation equations in full, including the activation-function factor we only mentioned here. The natural deeper read for anyone who wants the complete, general form rather than the one-neuron-per-layer chain.
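Before opening either of those, it can help to see "rates multiply along a chain" with the activation-function factor written out. The sketch below is a minimal illustration, not the lesson's own code: it assumes a two-layer, one-neuron-per-layer chain, a sigmoid as the squish, and a squared-error cost, and every name in it (`forward`, `chain_rule_slopes`, `nudge_slopes`, `PARAMS`) is hypothetical. It builds each weight's slope by multiplying the links of the chain, then estimates the same slopes by nudging each weight directly.

```python
import math


def sigmoid(z):
    """The squish: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(w1, b1, w2, b2, a0, y):
    """Run input a0 through two one-neuron layers and score it against target y."""
    a1 = sigmoid(w1 * a0 + b1)
    a2 = sigmoid(w2 * a1 + b2)
    cost = (a2 - y) ** 2  # assumed squared-error cost
    return a1, a2, cost


def chain_rule_slopes(w1, b1, w2, b2, a0, y):
    """Slopes of the cost with respect to w1 and w2, built link by link."""
    a1, a2, _ = forward(w1, b1, w2, b2, a0, y)
    dcost_da2 = 2.0 * (a2 - y)          # cost link
    da2_dz2 = a2 * (1.0 - a2)           # sigmoid slope at layer 2
    da1_dz1 = a1 * (1.0 - a1)           # sigmoid slope at layer 1
    dcost_dz2 = dcost_da2 * da2_dz2
    dcost_dw2 = dcost_dz2 * a1                  # dz2/dw2 = a1
    dcost_dw1 = dcost_dz2 * w2 * da1_dz1 * a0   # dz2/da1 = w2, dz1/dw1 = a0
    return dcost_dw1, dcost_dw2


def nudge_slopes(w1, b1, w2, b2, a0, y, eps=1e-6):
    """Estimate the same slopes by nudging each weight (central difference)."""
    up1 = forward(w1 + eps, b1, w2, b2, a0, y)[2]
    down1 = forward(w1 - eps, b1, w2, b2, a0, y)[2]
    up2 = forward(w1, b1, w2 + eps, b2, a0, y)[2]
    down2 = forward(w1, b1, w2 - eps, b2, a0, y)[2]
    return (up1 - down1) / (2 * eps), (up2 - down2) / (2 * eps)


# Arbitrary illustrative values, not taken from the lesson.
PARAMS = dict(w1=0.5, b1=-0.2, w2=-1.3, b2=0.4, a0=0.8, y=1.0)

if __name__ == "__main__":
    print("chain rule:", chain_rule_slopes(**PARAMS))
    print("nudging:   ", nudge_slopes(**PARAMS))
```

The two printed pairs should agree to several decimal places. Nielsen's chapter generalizes the same multiplication to layers with many neurons.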
Adjacent topics
Where this leads inside this track.
- What backpropagation is really doing (lesson 8). The previous lesson told the story (desires propagating backward); this one supplied the arithmetic (the chain rule). Read them as one pair: intuition, then the math that makes it precise.
- Gradient descent (lesson 7). The chain rule produces each knob’s slope, which is one component of the gradient. Lesson 7 is what consumes those slopes to take a step. Together, lessons 5 through 9 are the complete training loop.
- Seeing it whole (lesson 10). The final lesson assembles every piece, from a messy handwritten 3 to a trained network, and points you toward building one yourself.