References: Neural networks and backpropagation

This lesson follows Stanford CS231n’s treatment of neural networks and backpropagation, the closing pair of Phase 1 in the course.

  • Course: Stanford CS231n, “Deep Learning for Computer Vision”
  • Instructors: Fei-Fei Li, Ehsan Adeli, and Justin Johnson (Stanford University)
  • Course site: cs231n.stanford.edu
  • Course notes (backpropagation): cs231n.github.io/optimization-2 (the chain rule through computational graphs, the local-vs-upstream gradient pattern at every gate, the (x+y)·z worked circuit).
  • Course notes (neural networks): cs231n.github.io/neural-networks-1 (the two-layer NN architecture, the role of the non-linearity, hidden-layer representations).
  • This lesson maps to: Lecture 4 (Neural Networks and Backpropagation).

Attribution (Clawdemy-authored): Stanford CS231n: Deep Learning for Computer Vision, Fei-Fei Li, Ehsan Adeli, and Justin Johnson, Stanford University (cs231n.stanford.edu). CS231n does not publish a required citation string; this is the attribution Clawdemy uses.

The current term’s lecture recordings are posted on Canvas for enrolled Stanford students. Recordings from previous years are publicly available on YouTube under YouTube’s standard license; Clawdemy links out rather than embedding or rehosting. The course notes (cs231n.github.io) and site are Stanford’s. No Creative Commons license is published for the lectures, so we treat them as link-only references.

  • CS231n backpropagation notes. cs231n.github.io/optimization-2 walks the circuit interpretation in detail, including the (x+y)·z example with x=-2, y=5, z=-4 cited in this lesson, plus sigmoid and other gate derivations.
  • CS231n NN-1 notes. cs231n.github.io/neural-networks-1 covers the two-layer architecture, activation functions, and what hidden-layer features look like in practice.
  • AlexNet paper. Krizhevsky, Sutskever, and Hinton, “ImageNet Classification with Deep Convolutional Neural Networks” (NeurIPS 2012). The historical inflection point widely credited with ending the feature-engineering era in computer vision; easy to find by searching for “AlexNet 2012 ImageNet.”
  • Neural Network Intuition (Track 11, Clawdemy). Lessons 8 (“What backpropagation is really doing”) and 9 (“Backpropagation and the chain rule”) cover the same backprop story in a generic neural-network setting with extra step-by-step intuition; T16 readers who want a deeper walk through the chain-rule mechanics will find it there.
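
For orientation, the (x+y)·z circuit cited above can be sketched in a few lines of Python. This is a minimal illustration written for this page, not code taken from CS231n; the function name `forward_backward` and the intermediate name `q` are assumptions chosen for readability. It runs the forward pass, then applies the chain rule gate by gate, multiplying each local gradient by the upstream gradient.

```python
def forward_backward(x, y, z):
    """Forward and backward pass through f(x, y, z) = (x + y) * z.

    Hypothetical helper for illustration only.
    """
    # Forward pass: the add gate feeds the multiply gate.
    q = x + y          # add gate
    f = q * z          # multiply gate

    # Backward pass, starting from df/df = 1 at the output.
    # Multiply gate: the local gradient for each input is the other input.
    df_dq = z
    df_dz = q

    # Add gate: local gradient is 1 for both inputs, so the upstream
    # gradient df_dq passes through unchanged (chain rule).
    df_dx = df_dq * 1.0
    df_dy = df_dq * 1.0

    return f, (df_dx, df_dy, df_dz)


# The values used in the CS231n notes: x = -2, y = 5, z = -4.
f, grads = forward_backward(-2.0, 5.0, -4.0)
print(f, grads)  # -12.0 (-4.0, -4.0, 3.0)
```

With these inputs the forward pass gives f = -12, and the gradients with respect to x, y, and z are -4, -4, and 3. The add gate simply distributes its upstream gradient to both inputs, while the multiply gate swaps its inputs.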

Clawdemy follows CS231n’s pedagogical ordering (motivate non-linearity, introduce the two-layer network, then backprop as the chain rule through a graph), and cites CS231n’s exact worked circuit f(x,y,z) = (x+y)·z with the original x=-2, y=5, z=-4 example. The Part B practice exercise (the same circuit with fresh numbers x=3, y=-1, z=2) and the Part A “stacking-linears-collapses” 2-by-2 worked example are Clawdemy-authored against the CS231n framing. We do not reproduce CS231n’s slides, figures, problem sets, or lecture text. Full attribution policy: see Doc/attribution-policy.md.