Cheatsheet: The handwritten-digit problem
The one idea that matters
A handwritten 3: effortless for you to read, brutally hard to write as rules.
The fix is not a better rule. It is a different idea: stop writing rules → start showing labeled examples.
You do not describe the answer. You demonstrate it, then let the system find the pattern.
What a digit looks like to a computer
| To you | To a computer |
|---|---|
| A shape with curves and loops | A grid of pixels, each a brightness number |
| One thing, instantly recognized | A list of numbers (often 784, for a 28x28 image) |
| The same whether shifted or slanted | A different list of numbers every time it moves |
There are no curves or loops in the numbers. Only the numbers.
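To make the last table row concrete, here is a minimal sketch using a hypothetical 5x5 toy grid in place of a real 28x28 digit. The grid values and the helper names `flatten` and `shift_right` are illustrative assumptions, not part of the lesson. It shows that moving the same shape one pixel to the right changes many entries of the flattened list.

```python
# A tiny 5x5 "image" stands in for a 28x28 digit; 0 = dark, 1 = bright.
# (Hypothetical toy data, not a real handwritten sample.)
image = [
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
]

def flatten(grid):
    """Read the grid row by row into one flat list of brightness numbers."""
    return [pixel for row in grid for pixel in row]

def shift_right(grid):
    """Move every row one pixel to the right, padding with a dark pixel."""
    return [[0] + row[:-1] for row in grid]

original = flatten(image)
shifted = flatten(shift_right(image))

# Count the positions where the two lists disagree.
changed = sum(1 for a, b in zip(original, shifted) if a != b)

print(len(original))  # 25 numbers for a 5x5 grid; a 28x28 grid gives 784
print(changed)        # 10 of the 25 numbers differ after a one-pixel shift
```

To a person the two grids show the same shape. To the computer they are two different lists.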
Why rule-writing fails
- Any rule you write (for example, “a 3 has two stacked bumps on the right”) fits tidy digits and misses real ones: slanted, flat-topped, dashed-off.
- Every patch you add creates a new edge case you did not anticipate.
- Real handwriting has endless variation; a finite list of rules never closes the gap.
- The hard part is not seeing the digit. It is specifying, in exact words, what makes it that digit.
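The brittleness described above can be sketched with a deliberately naive rule. Everything here is an assumption made for illustration: the 5x5 toy grid, the function name `looks_like_three`, and the rule itself, which checks fixed pixel positions.

```python
# Hypothetical hand-written rule on a 5x5 toy grid (0 = dark, 1 = bright):
# "a 3 has three horizontal strokes in columns 1-2 (rows 0, 2, 4),
#  joined by single bright pixels in column 2 (rows 1, 3)".
def looks_like_three(grid):
    strokes = all(grid[r][1] == 1 and grid[r][2] == 1 for r in (0, 2, 4))
    joins = all(grid[r][1] == 0 and grid[r][2] == 1 for r in (1, 3))
    return strokes and joins

tidy = [
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
]

# The same shape, written one pixel further to the right.
shifted = [[0] + row[:-1] for row in tidy]

print(looks_like_three(tidy))     # True
print(looks_like_three(shifted))  # False: same digit, but the rule misses it
```

Patching the rule to accept the shifted version is possible, but the next slanted or flat-topped 3 would need yet another patch.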
Why handwritten digits are the right first problem
| Property | Why it helps |
|---|---|
| Small fixed input | Every image is 784 numbers; small enough to reason about |
| Small output | Only 10 possible answers (digits 0 through 9) |
| Hard but solvable | Defeats rule-writing, yet humans do it effortlessly |
| Generalizes | Same shape as recognizing faces, reading scans, or sorting photos: numbers in, label out |
The shift, side by side
| Rule-based programming | Learning from examples |
|---|---|
| Human writes the logic for every case | Human provides labeled examples |
| Breaks on the first unanticipated case | Improves as it sees more examples |
| You describe the answer | You demonstrate the answer |
What we are after: a function that takes 784 brightness numbers in and gives 10 scores out (one per digit, highest score wins), built from examples rather than written by hand. What goes inside that function is the subject of lesson 2 onward.
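The shape of that function, and the "highest score wins" step, can be sketched as follows. This shows only the interface. The names `ScoreFunction`, `placeholder_scores`, and `predict_digit` are hypothetical, and the placeholder returns fixed scores instead of computing anything from the pixels, since the inside of the function is left to later lessons.

```python
from typing import Callable, List

# The shape we are after: 784 brightness numbers in, 10 scores out.
ScoreFunction = Callable[[List[float]], List[float]]

def placeholder_scores(pixels: List[float]) -> List[float]:
    """Stand-in only: ignores the pixels and returns fixed scores.

    A learned function would compute these scores from the input.
    """
    assert len(pixels) == 784
    return [0.1, 0.0, 0.2, 0.9, 0.1, 0.3, 0.0, 0.1, 0.4, 0.2]

def predict_digit(score_fn: ScoreFunction, pixels: List[float]) -> int:
    """Return the digit (0-9) whose score is highest."""
    scores = score_fn(pixels)
    assert len(scores) == 10
    return max(range(10), key=lambda digit: scores[digit])

blank_image = [0.0] * 784  # a 28x28 image, flattened
print(predict_digit(placeholder_scores, blank_image))  # 3
```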
Pitfalls to dodge
- “Modern AI is a huge list of human-written rules.” Opposite. Nobody wrote the rules; the system learned the pattern from examples.
- “The hard part is the seeing.” The seeing is easy. The specifying is hard.
- “Just write more rules and you will get there.” You will not. Handwriting variation is endless.
- “Examples alone cannot be enough.” The surprising lesson of the field is how far examples alone go.
Words to use precisely
- Pixel grid: an image stored as brightness numbers, one per cell (for example, 28x28 = 784 numbers).
- Label: the known correct answer attached to a training example (“this image is a 3”).
- Learning from examples: providing labeled data and letting the system find the pattern, instead of hand-writing rules.
- The function we want: a mapping from 784 input numbers to 10 output scores.
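To see the terms above working together, here is a minimal sketch of labeled examples in code. It uses a nearest-example lookup purely as a stand-in. That technique is an assumption chosen for brevity, not necessarily what later lessons build, and the 4-number "images" and two labels are hypothetical toy data standing in for 784 numbers and 10 digits.

```python
# Labeled examples: (pixels, label) pairs.
# Toy 4-number "images" stand in for 784-number ones.
examples = [
    ([1, 1, 0, 0], 0),
    ([1, 0, 0, 0], 0),
    ([0, 0, 1, 1], 1),
    ([0, 1, 1, 1], 1),
]

def distance(a, b):
    """Squared difference between two pixel lists of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(pixels):
    """Return the label of the stored example closest to the input."""
    _, nearest_label = min(examples, key=lambda ex: distance(ex[0], pixels))
    return nearest_label

# Neither input below appears in the examples, and no rule was written
# for either of them.
print(predict([1, 1, 1, 0]))  # 0
print(predict([0, 0, 1, 0]))  # 1
```

The only way to change the behavior of `predict` is to change the list of examples. There is no rule to edit.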
The one-line version
Modern AI exists because we stopped writing rules and started showing examples.