
Summary: The handwritten-digit problem

You can read a messy handwritten 3 in an instant, but the moment you try to write down the method (not the answer, but the rule that produces it), the easy thing turns impossible. That gap between recognizing a digit and explaining the recognition is the reason neural networks exist. This lesson uses the handwritten-digit problem to show why writing rules fails, why this particular problem is the classic place to start, and the one shift that powers almost all of modern AI: stop writing rules, start showing labeled examples. This is the scan-it-in-five-minutes version.

  • To a computer, a digit is not a shape but a pixel grid: a grid of cells where each cell holds a brightness number (0 for black, 1 for white, with values in between for shades of gray). A common setup is 28 by 28, which is 784 numbers and nothing more. There are no curves or loops anywhere in the data, only the list of numbers.
  • The same digit produces wildly different numbers each time it is written. Shift it, slant it, or resize it, and the bright pixels land in different cells, so the list of numbers changes from end to end even though your eye sees the same digit. What is “the same” to you is, in raw numbers, different every time.
  • Rule-writing falls apart fast. Any rule (for example, “a 3 has two stacked bumps on its right side”) fits a tidy textbook digit and misses real ones: slanted, flat-topped, dashed off. Every patch you add invites a new variation you did not anticipate, and the rules pile up without ever closing the gap.
  • The hard part is not the seeing, it is the specifying. Your eyes do the recognizing instantly. What defeats you is putting into exact words what makes a 3 a 3 and not an 8 or a hurried 2.
  • Handwritten digits sit in a useful sweet spot: the input is small and fixed (784 numbers), the output is small (only ten possible answers, 0 through 9), the task is genuinely hard yet clearly solvable (rule-writing fails but a child reads digits effortlessly), and the approach travels (a face, a tumor on a scan, or a photo sorted by its contents are the same shape of problem, numbers in and a label out).
  • The central move is learning from examples: instead of telling the computer what a 3 is, you give it thousands of images, each carrying a label (the known correct answer, “this one is a 3”), and let it find the pattern itself. You stop being the author of the answer and become the curator of the examples.
  • What we are ultimately after is the function we want: a mapping that takes the 784 brightness numbers in and returns ten scores out, one per possible digit, with the highest score being the answer. The twist is that we do not write this function by hand. We let the system build it from the examples, and what is inside it is the work of the lessons that follow.
  • This is why modern AI feels the way it does. Almost every AI tool you have used (the chat assistants, photo search, voice transcription, the spam filter) was built this way: shown enormous numbers of examples, never handed a pile of human-written rules. That single fact explains both its strengths (it is uncannily good at fuzzy, human things we could never have captured in clean rules) and its blind spots (it can be confidently wrong on an input unlike anything it was shown).
  • Pitfalls worth naming: thinking modern AI is a giant list of hand-written rules (it is the opposite); thinking the hard part is the seeing (it is the specifying); thinking “just write more rules” would eventually work (handwriting variation is endless); and underestimating how far a pile of labeled examples can actually take you (the surprising lesson of the whole field).
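
The points about the pixel grid and about shifting a digit can be made concrete with a small sketch. It is only an illustration under stated assumptions: a plain vertical stroke stands in for a handwritten digit, and the helper names (blank_grid, draw_stroke, flatten) are made up for this example. It builds a 28 by 28 grid, flattens it into 784 brightness numbers, and then compares the same stroke drawn two cells to the right.

```python
SIZE = 28  # 28 x 28 grid, the common setup mentioned above


def blank_grid():
    """Return a SIZE x SIZE grid of brightness numbers, all 0.0 (black)."""
    return [[0.0] * SIZE for _ in range(SIZE)]


def draw_stroke(grid, col, top=4, bottom=24):
    """Draw a one-cell-wide white vertical stroke.

    Hypothetical stand-in for a handwritten digit, used only for illustration.
    """
    for row in range(top, bottom):
        grid[row][col] = 1.0
    return grid


def flatten(grid):
    """Turn the grid into the flat list of numbers the computer actually sees."""
    return [value for row in grid for value in row]


original = flatten(draw_stroke(blank_grid(), col=13))
shifted = flatten(draw_stroke(blank_grid(), col=15))  # same stroke, 2 cells right

# Positions where the two lists disagree.
changed = sum(1 for a, b in zip(original, shifted) if a != b)
# Positions that are bright in both lists.
overlap = sum(1 for a, b in zip(original, shifted) if a == 1.0 and b == 1.0)

print(len(original))  # 784
print(changed)        # 40
print(overlap)        # 0
```

To the eye, the two images show the same stroke. In the numbers, the two lists do not share a single bright cell, which is why a rule written in terms of specific cells breaks as soon as the writing moves.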

Before this lesson, “AI learns from data” was a phrase that had stopped meaning anything. Now it points at something specific: a deliberate decision to stop describing answers and start demonstrating them with examples. When you next watch an AI system do something impressive or something baffling, you have a sharper question to ask, not “how was it programmed?” but “what was it shown?” And as the next lessons in this track crack open that function from 784 numbers to 10, you will already know what it is for and why it has to be learned rather than written. The mental model you just built is the one every later lesson stands on.
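
For reference, the outer shape of that function (784 numbers in, ten scores out, highest score wins) can be sketched as follows. This shows the interface only: the weighted sum and the random weights are placeholder assumptions chosen to make the sketch run, not what a trained network contains, and the names score_digits and predict are made up for this example.

```python
import random

NUM_PIXELS = 784  # 28 x 28 brightness numbers in
NUM_DIGITS = 10   # ten scores out, one per digit 0 through 9

# Placeholder parameters: random values standing in for numbers that a real
# system would learn from labeled examples. They are NOT trained.
rng = random.Random(0)
weights = [
    [rng.uniform(-1.0, 1.0) for _ in range(NUM_PIXELS)]
    for _ in range(NUM_DIGITS)
]


def score_digits(pixels):
    """Return ten scores, one per digit, for a list of 784 brightness numbers."""
    if len(pixels) != NUM_PIXELS:
        raise ValueError(f"expected {NUM_PIXELS} numbers, got {len(pixels)}")
    # Assumption for illustration: each score is a simple weighted sum.
    return [sum(w * p for w, p in zip(row, pixels)) for row in weights]


def predict(pixels):
    """Return the digit whose score is highest."""
    scores = score_digits(pixels)
    return max(range(NUM_DIGITS), key=lambda digit: scores[digit])


# A made-up image: 784 random brightness numbers between 0 and 1.
image = [rng.random() for _ in range(NUM_PIXELS)]
scores = score_digits(image)
answer = predict(image)

print(len(scores))  # 10
print(answer)       # some digit from 0 to 9; meaningless until the weights are learned
```

With random weights the answer carries no meaning. The point of learning from examples is to replace those placeholder numbers with ones that make the highest score land on the correct label.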