
Cheatsheet: Seeing it whole, and where next

| Lesson | The piece it added |
| --- | --- |
| 1 | A network is a function: 784 numbers in, 10 out. Stop writing rules, show examples. |
| 2 | Built from layers of neurons; a neuron holds one number 0-1 (its activation). |
| 3 | Each neuron: weighted sum of inputs, plus a bias, through a squish. |
| 4 | The whole network is one function with ~13,000 knobs (weights + biases). |
| 5 | "Learning" = make the cost (a wrongness score) small. |
| 6 | Picture the cost as a landscape; the negative gradient points downhill. |
| 7 | Gradient descent: step downhill, repeat. |
| 8 | Backprop: desires propagate backward; one sweep gives the whole gradient. |
| 9 | That sweep is the chain rule, run backward through the layers. |
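
Lessons 3 and 4 can be made concrete in a few lines. The sketch below is illustrative only: it assumes the squish is a sigmoid and that the network has two hidden layers of 16 neurons each, a layout chosen here because it is consistent with the ~13,000 figure. The names `neuron`, `count_knobs`, and `LAYERS` are hypothetical.

```python
import math


def sigmoid(z):
    """Squish any real number into the range 0 to 1 (assumed squish)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Weighted sum of inputs, plus a bias, through the squish."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def count_knobs(layer_sizes):
    """Count the weights and biases of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection
        total += n_out         # one bias per neuron in the next layer
    return total


# Assumed layout: 784 inputs, two hidden layers of 16, 10 outputs.
LAYERS = [784, 16, 16, 10]

activation = neuron([1.0, 0.0, 0.5], [0.2, -0.4, 0.6], bias=-0.1)
knobs = count_knobs(LAYERS)
print(f"activation = {activation:.3f}, knobs = {knobs}")
```

Under that assumed layout the count comes to 13,002, which matches the "~13,000 knobs" in lesson 4.
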
forward pass → cost → backward pass (backprop) → update (gradient descent) → repeat

One pass over the whole training set = one epoch. Training runs many epochs.
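
The loop above can be sketched for the smallest possible case: a single neuron with two knobs. This is a toy under stated assumptions, not the course's implementation. The data set `DATA`, the squared-error cost, the sigmoid squish, and the learning rate are all chosen here for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy data: (input, desired output) pairs.
DATA = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0), (-1.0, 0.0)]


def train(data, epochs=200, learning_rate=0.5):
    """Train one neuron; return its knobs and the mean cost per epoch."""
    w, b = 0.0, 0.0  # the two knobs: one weight, one bias
    history = []
    for _ in range(epochs):          # one epoch = one pass over the data
        total_cost = 0.0
        for x, y in data:
            # Forward pass: compute the activation.
            a = sigmoid(w * x + b)
            # Cost: squared difference from the desired output (assumed).
            total_cost += (a - y) ** 2
            # Backward pass: chain rule, from the cost back to each knob.
            dcost_da = 2.0 * (a - y)
            da_dz = a * (1.0 - a)    # derivative of the sigmoid
            grad_w = dcost_da * da_dz * x
            grad_b = dcost_da * da_dz
            # Update: step each knob a little way downhill.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
        history.append(total_cost / len(data))
    return w, b, history


w, b, history = train(DATA)
print(f"first epoch cost {history[0]:.3f}, last epoch cost {history[-1]:.3f}")
```

This version updates the knobs after every example; averaging the gradient over a batch before stepping is another common choice.
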

messy "3" image (784 numbers)
→ forward pass → output [0.1, 0.05, 0.0, 0.2, 0.5, ...] (tallest is "4": wrong)
→ cost ≈ 0.90 (high; desired was 1 at the "3" slot)
→ backprop → every knob's downhill nudge
→ update → all ~13,000 knobs step a hair downhill
→ network is slightly less wrong on this image
→ repeat across thousands of images, many epochs → it learns
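
The cost in this trace can be checked by hand, assuming the cost is the sum of squared differences between the output and the desired output (the trace does not name the formula). The sketch uses only the five output values shown; the elided values are left out, so the result is a partial sum.

```python
# The five output values shown in the trace (slots "0" through "4").
shown_output = [0.1, 0.05, 0.0, 0.2, 0.5]
# Desired output for those slots: 1 at the "3" slot, 0 elsewhere.
desired = [0.0, 0.0, 0.0, 1.0, 0.0]

# The network's guess is the slot with the tallest output.
predicted_digit = max(range(len(shown_output)), key=lambda i: shown_output[i])

# Assumed cost: sum of squared differences, over the shown slots only.
terms = [(a - y) ** 2 for a, y in zip(shown_output, desired)]
partial_cost = sum(terms)

print(predicted_digit)         # slot index of the tallest output
print(round(partial_cost, 4))  # partial cost from the five shown slots
```

The shown slots contribute about 0.9025, in line with the trace's ≈ 0.90. Most of that comes from the "3" slot, where 0.2 falls short of the desired 1, and from the "4" slot, where 0.5 should be 0.
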

A row of dials, and a landscape behind them. Where the dials sit = where you stand; your height = how wrong you are. Training = feel downhill, turn every dial a hair that way, repeat until you settle in a low valley. Forward pass reads your height; backprop feels the slope; gradient descent takes the step.
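
The picture can be played out with two dials and a made-up landscape. In this sketch the bowl-shaped `height` function is hypothetical, and the slope is felt by nudging each dial in turn (finite differences) rather than by backprop. Nudging is easy to read but needs a pair of cost evaluations per dial, which is why a network with ~13,000 knobs relies on one backprop sweep instead.

```python
def height(dials):
    """Hypothetical bowl-shaped cost; its lowest point is at dials (3, -1)."""
    x, y = dials
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


def feel_slope(f, dials, eps=1e-6):
    """Estimate the gradient by nudging each dial up and down."""
    slope = []
    for i in range(len(dials)):
        up = list(dials)
        up[i] += eps
        down = list(dials)
        down[i] -= eps
        slope.append((f(up) - f(down)) / (2 * eps))
    return slope


def walk_downhill(f, dials, step=0.1, n_steps=100):
    """Gradient descent: turn every dial a hair downhill, then repeat."""
    dials = list(dials)
    for _ in range(n_steps):
        slope = feel_slope(f, dials)
        dials = [d - step * s for d, s in zip(dials, slope)]
    return dials


start = [0.0, 0.0]
final = walk_downhill(height, start)
print(f"start height {height(start):.3f}, final height {height(final):.6f}")
```
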

| Topic | Where |
| --- | --- |
| Convolutional nets, transformers | Track 5 (AI Foundations) covers transformers; future CV track |
| Smarter optimizers (momentum, Adam) | further study |
| Regularization, dropout, batch norm | further study |
| Fine-tuning, transfer learning | further study |
| Building it in real code | Track 13 |

  • Build it yourself → Track 13 (Build Neural Networks from Scratch). This track’s gradient descent and backprop, written as Python you run.
  • Understand modern LLMs → Track 5 (AI Foundations). It covers transformers; a transformer is a neural network, so this foundation carries straight over.
  • Use AI to build things → Track 20 (AI Agents and Tool Use). Agents built on top of trained networks.

A neural network is a row of dials and a landscape behind them; training is a patient walk downhill. You can now picture it, reason about it, and know where to look next.