References: Building an autograd engine: micrograd
Source material
Source curriculum (structural mirror, cited as further study):

• Andrej Karpathy, "Neural Networks: Zero to Hero", Lecture 1: "The spelled-out intro to neural networks and backpropagation: building micrograd"
  Creator: Andrej Karpathy
  Video: https://www.youtube.com/watch?v=VMj-3S1tku0
  Code repo: https://github.com/karpathy/micrograd (MIT License)
  Series page: https://karpathy.ai/zero-to-hero.html
  License: micrograd code is MIT-licensed; the video lecture is under the standard YouTube license.

Clawdemy's lessons are original prose that follows the pedagogical arc of this series. We do not reproduce or transcribe the videos or the code; we cite them as the recommended companion. The worked expression in this lesson follows the example Karpathy uses in the lecture. All rights to the original video and code remain with the creator.

Watch this next
- The spelled-out intro to neural networks and backpropagation: building micrograd by Andrej Karpathy. The lecture this lesson mirrors. Karpathy builds the whole engine live in a Jupyter notebook, typing every line and explaining each one, then trains a small neural net with it. It is long (about 2.5 hours) but worth it. This lesson covers the autograd-engine half, roughly the first part of the video; watching Karpathy type the backward pass and seeing the gradients populate is the clearest way to make it concrete.
Going deeper
- micrograd on GitHub (MIT License). The complete engine, around 150 lines, plus a small demo. Reading the backward() method after this lesson is the fastest way to confirm that the procedure really is just “local derivative times incoming gradient, walked in reverse.” The code is short enough to read in one sitting.
- Neural Networks: Zero to Hero (full series) by Andrej Karpathy. The series this track follows, from micrograd through building a GPT. The next lecture moves from the autograd engine to language modeling.
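As a warm-up before reading the repository, the "local derivative times incoming gradient, walked in reverse" procedure can be sketched in a few lines. This is an illustrative sketch only, not micrograd's code: the `Node` class, its `parents` field, the standalone `backward` function, and the expression `z = x * y + x` are all hypothetical names and values chosen for this example, and only addition and multiplication are supported.

```python
class Node:
    """A scalar value plus the edges needed to push gradients backward.

    Hypothetical illustration; not the class used in micrograd.
    """

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (parent_node, local derivative of this node w.r.t. that parent).
        self.parents = parents

    def __add__(self, other):
        # d(a + b)/da = 1 and d(a + b)/db = 1
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        # d(a * b)/da = b and d(a * b)/db = a
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))


def backward(output):
    """Fill in .grad for every node that feeds into `output`."""
    order, seen = [], set()

    def visit(node):
        # Depth-first topological sort: parents are appended before children.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0  # d(output)/d(output)
    for node in reversed(order):
        for parent, local in node.parents:
            # Local derivative times incoming gradient, accumulated with +=
            # because a node may be used more than once.
            parent.grad += local * node.grad


x = Node(2.0)
y = Node(3.0)
z = x * y + x
backward(z)
print(x.grad, y.grad)  # 4.0 2.0
```

Here `x` is used twice, so its gradient is the sum of two contributions: `y` from the product and `1` from the addition. Storing the local derivative on each edge is one design choice among several; it keeps the backward walk a single generic loop.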
Adjacent topics
Where this sits in the curriculum.
- The chain rule (calculus track). Backpropagation is the chain rule applied node by node: “rates multiply through a composition” is exactly what the backward pass does as it multiplies each incoming gradient by a local derivative. If the backward walk felt fast, a reread of the chain-rule lesson grounds it.
- What a neural network is (neural-network-intuition track). That track showed the shape of a network and that it learns by adjusting weights. This lesson is the missing mechanism: how the gradients that drive that adjustment are actually computed. The next lesson assembles the engine into a network and trains it.
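To make "rates multiply through a composition" concrete, here is a small worked chain-rule expansion. The symbols a, b, c, d, e, and the outer function f are illustrative assumptions for this sketch, not notation taken from the lesson.

```latex
\[
L = f(d), \qquad d = e + c, \qquad e = a \cdot b
\]
\[
\frac{\partial L}{\partial a}
  = \frac{\partial L}{\partial d}\cdot\frac{\partial d}{\partial e}\cdot\frac{\partial e}{\partial a}
  = f'(d)\cdot 1 \cdot b
\]
```

Reading the product from left to right matches the backward walk: the gradient arriving at d is multiplied by the local derivative of the addition (1), and the result is multiplied by the local derivative of the product with respect to a (which is b).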