Practice: Seeing the field whole

Self-check

Six short questions. Try to answer each one in your head (or on paper) before opening the collapsible. Active retrieval is where the learning sticks; rereading feels productive but does much less.

1. What three things arrived together to unlock the modern era of deep learning?

Show answer

Depth, data, and compute. The ideas behind neural networks are decades old; what changed was that deep architectures, large datasets, and parallel compute arrived together and made deep networks finally work. Scaling that trio up is most of what “progress” has meant since.

2. State the core of the whole map in one line: what is the single engine, and what are the four problem shapes it gets wired for?

Show answer

One engine: the neural network (layers of neurons and weights, tuned by gradient descent and backpropagation). Four shapes: sequences (recurrence, then attention), images (convolution), generation (VAE/GAN/diffusion), and decisions (reinforcement learning). Same parts, different arrangements.

3. Match each problem shape to how the engine is wired for it: sequences, images, generation, decisions.

Show answer

Sequences → recurrence, then attention (carry or weigh information across positions). Images → convolution (slide small shared filters over local patches). Generation → VAE, GAN, or diffusion (learn the data’s shape, then produce new examples). Decisions → reinforcement learning (act, get rewarded, improve a policy).
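
To make "weigh information across positions" concrete, here is a minimal sketch in plain Python of one attention step for a single query. The function names (`softmax`, `attend`) and the toy vectors are illustrative assumptions, not code from the track; a real transformer adds learned projections, multiple heads, and batching on top of this.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """One attention step: score every position, then blend the values.

    query: list of floats; keys and values: one list of floats per position.
    Returns (weights, blended_value).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    blended = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, blended

# Toy sequence of three positions (hypothetical numbers).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, blended = attend([2.0, 0.0], keys, values)
```

Every position gets a weight at once, and the positions whose keys match the query best contribute most to the blended result. That is the "weigh" half of the answer; recurrence is the "carry" half.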

4. Name one of the “deep unities” that makes this a field rather than a grab-bag.

Show answer

Any one of: (a) reusing one set of weights across many positions powers both recurrence (the same cell at every time step) and convolution (the same filter at every patch); (b) learning the shape of the data unifies all three generative models; (c) underneath everything sits the same training loop: define what “wrong” means, then use gradient descent and backpropagation to make it less wrong.
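
The weight-sharing unity can be sketched in a few lines of plain Python. The names `conv1d` and `run_recurrence`, the linear cell, and the toy numbers are assumptions made for illustration; real layers use many filters, nonlinearities, and learned weights.

```python
def conv1d(signal, kernel):
    """Slide one shared kernel over every local patch of the signal.

    Like most deep-learning libraries, this does not flip the kernel
    (strictly speaking it is cross-correlation).
    """
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, signal[i:i + k]))
        for i in range(len(signal) - k + 1)
    ]

def run_recurrence(inputs, w_in, w_state, state=0.0):
    """Apply the same cell (the same two weights) at every time step."""
    states = []
    for x in inputs:
        # Linear cell for simplicity; real cells add a nonlinearity.
        state = w_in * x + w_state * state
        states.append(state)
    return states

edges = conv1d([0, 0, 1, 1, 0], kernel=[-1, 1])             # [0, 1, 0, -1]
running = run_recurrence([1, 1, 1], w_in=1.0, w_state=0.5)  # [1.0, 1.5, 1.75]
```

In both functions the number of weights stays fixed no matter how long the input is: two for the kernel, two for the cell. That is what "share weights when the problem has repeated structure" buys you.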

5. What are the four questions you can ask to place almost any AI system?

Show answer

What shape of problem does it solve? How is the engine wired for it? What was it trained on? And where will it break? Hold those four and almost any deep-learning system becomes legible.

6. Why does the map you learned outlast the specific models in the headlines?

Show answer

Specific models turn over in months: new names, bigger numbers. But a new model is still one of the four shapes (or a blend), still the same engine trained the same way, and still bounded by the same four limits. Names and records age in months; the structure ages in decades.

Try it yourself: place three systems in the four-questions frame

This is the capstone exercise, and it practices the load-bearing skill of the whole track. For each system, answer the four questions: (1) what shape of problem it solves, (2) how the engine is wired for it, (3) what it was trained on, (4) where it tends to break. Allow about 15 minutes.

Side effects: none. This is a thinking-and-writing exercise. No tools, no API calls, no costs.

The systems:

A chat assistant that answers questions and drafts text.
A tool that produces an image from a written description.
A system that learned to play a board game at a superhuman level.

Show a model answer

1. Chat assistant

Shape: sequences (language in, language out).
Wiring: a transformer, using attention to weigh all positions at once.
Trained on: enormous amounts of text.
Where it breaks: confident fabrication (hallucination), plus the other limits: brittleness on unusual inputs, the slant of its data, and no guarantees.

2. Image-from-text tool

Shape: generation.
Wiring: a diffusion model, producing an image by removing noise step by step, steered by the text prompt.
Trained on: large collections of images (with associated text).
Where it breaks: it is slow (many denoising steps), its output varies from run to run, and it can confidently render something subtly wrong because it matches the look of its data rather than understanding the world.

3. Superhuman board-game system

Shape: decisions.
Wiring: reinforcement learning, an agent improving a policy from rewards.
Trained on: not a fixed labeled dataset but many played games, learning from outcomes.
Where it breaks: superhuman inside its arena, but typically sample-inefficient and brittle, and hard to transplant outside the clean rules it trained in.

If you produced four crisp answers for each, you can place almost any AI system you meet. That is the whole point of the survey.
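
If you prefer to keep your answers in a structured form, the frame maps naturally onto a small record type. This is only a sketch: the class name `Placement`, its field names, and the `is_complete` helper are hypothetical choices, not part of the track.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Placement:
    """One system placed in the four-questions frame."""
    system: str
    shape: str        # what shape of problem it solves
    wiring: str       # how the engine is wired for it
    trained_on: str   # what it was trained on
    breaks_when: str  # where it tends to break

    def is_complete(self) -> bool:
        """True when all four questions have a non-empty answer."""
        answers = (self.shape, self.wiring, self.trained_on, self.breaks_when)
        return all(a.strip() for a in answers)

# The third model answer above, written as a record.
board_game = Placement(
    system="Superhuman board-game system",
    shape="decisions",
    wiring="reinforcement learning",
    trained_on="many played games, learning from outcomes",
    breaks_when="outside the clean rules it trained in",
)
```

A blank field is a prompt: if you cannot fill in `breaks_when` for a system, that is the question to go and investigate.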

Flashcards

Ten cards.

Q. What unlocked the modern era of deep learning?

Depth, data, and compute arriving together. The neural-network ideas were decades old; the trio is what made deep networks finally work.

Q. What is the single “engine” underneath all of deep learning?

The neural network: layers of neurons and weights, tuned by gradient descent and backpropagation. Every capability in the track is that one engine, wired differently.

Q. What are the four problem shapes, and how is the engine wired for each?

Sequences (recurrence, then attention), images (convolution), generation (VAE/GAN/diffusion), decisions (reinforcement learning). Same parts, different arrangements.

Q. What single idea unified recurrence and convolution?

Reuse one set of weights across many positions: the same cell at every time step (recurrence), the same filter at every patch (convolution). When a problem has repeated structure, share weights.

Q. What single idea unified VAEs, GANs, and diffusion?

Learn the shape of the data, then produce new examples from it. However different they look on the surface, all three are doing that.

Q. What training loop sits under every architecture in the track?

Define what “wrong” means, then use gradient descent and backpropagation to nudge the weights until it is less wrong. One trainer, one engine, a few recurring moves.
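
The loop on this card can be sketched for the smallest possible model, a single weight `w` predicting `y` as `w * x`. Everything here is an illustrative assumption: the function names, the squared-error definition of "wrong", the learning rate, and the toy data. With one weight the gradient is written out by hand; backpropagation is what computes the same quantity automatically across many layers.

```python
def loss(w, data):
    """Define "wrong": mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grad(w, data):
    """Derivative of the loss with respect to w, written out by hand."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

def train(data, w=0.0, lr=0.05, steps=200):
    """Gradient descent: repeatedly nudge w against the gradient."""
    for _ in range(steps):
        w -= lr * grad(w, data)
    return w

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # hypothetical points on y = 2x
w_fit = train(data)  # converges toward 2.0
```

Swap in a different `loss` and a bigger model and the loop itself does not change, which is the sense in which one trainer sits under every architecture.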

Q. What are the four questions for placing any AI system?

What shape of problem does it solve? How is the engine wired for it? What was it trained on? Where will it break?

Q. Why does the map outlast specific models?

A new model is still one of the four shapes (or a blend), still the same engine trained the same way, still bounded by the same four limits. Names age in months; the structure ages in decades.

Q. In one sentence, what is deep learning?

A single, learnable, pattern-matching engine, scaled with depth, data, and compute and arranged to fit the shape of the problem: astonishingly capable and bounded in the same breath.

Q. Where do you go next after this survey?

Track 5 (Transformers and LLMs) for language-model depth, Track 13 (Build Neural Networks from Scratch) to build it yourself, the Neural Network Intuition track for the engine itself, and a deeper road for each problem shape.