Four-paradigm landscape, in brief

What you’ll learn

This is the capstone of Track 19. Lesson 1 opened with a map: four paradigms of generative modeling, a one-line description of each, and a promise that every modern system you have heard of would fit into one of them. Thirteen lessons later, you have built each paradigm from its foundations. This lesson returns to the map and fills in everything that was promised: each paradigm’s training objective, sampling procedure, and trade-off profile, drawn from the intervening derivations; a walked-through placement of widely discussed modern systems (autoregressive language models, Stable-Diffusion-style latent diffusion, GAN-based face generators, video diffusion models, multimodal hybrids); and a procedure for reading any new generative-AI release with paradigm fluency. By the end you will be able to read a paper, model card, or blog post about a new model, identify which paradigm it sits in, predict its trade-offs, and place it on the four-paradigm map without needing to read between the lines. That capability is paradigm fluency, and it is the central deliverable of the track. The primary source is the synthesis material across Stanford CS236; Berkeley CS294-158 Sp24 provides secondary framing.

Where this fits

This is lesson 15 of 15 and the closing lesson of the track. After this lesson, every model you have read about across the track has a precise mathematical home in one of the four paradigms (or, for a hybrid system, one paradigm per component). The lesson does not introduce new mathematical content; it organizes everything you have built so that it stays organized after the course ends. The track returns explicitly to its starting point (the four-paradigm map from lesson 1) and closes on the same image, with the math underneath now filled in.

Before you start

Prerequisites: all of lessons 1 through 14, ideally read recently enough that the paradigm-specific math is fresh. This lesson is heavy on synthesis and light on new derivation, so it will read well if you have a working memory of the four-paradigm vocabulary and the math each paradigm uses. If any of the previous lessons feels stale, a quick re-read of its summary and cheatsheet before working through this lesson is worth the time.

About the math

This is the only lesson in the track with no new math. The work is synthesis: take the thirteen lessons that built each paradigm from its foundations and assemble them into a coherent map on which any modern system you read about can be placed. Worked examples in this lesson are placements of specific modern systems on the map, not new equations. The capstone capability is the procedure for reading any new release fluently, not a new technical result.

By the end, you’ll be able to

Recall the four paradigms of generative modeling (autoregressive, latent-variable, adversarial, score-based / diffusion) and state each one’s training objective and sampling procedure
Place a modern system (autoregressive language model, Stable-Diffusion-style latent diffusion, GAN-based face generator, video diffusion model, multimodal hybrid) on the four-paradigm map by reading its training objective and sampling procedure
Predict a system’s primary trade-offs (sampling speed, likelihood evaluation, sample quality, controllability) from its paradigm
Connect the modeling components in a hybrid system (for example, latent diffusion uses a VAE for compression and a diffusion model in the latent space) by recognizing each component’s paradigm
Apply paradigm fluency to read any new generative-AI system release by identifying training objective, sampling procedure, and trade-offs, then placing the system on the four-paradigm map
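
As a rough illustration of the objectives above, the four-paradigm map can be sketched as a small lookup table, with a hybrid system placed by naming the paradigm of each of its components. This is a minimal sketch and not the lesson’s own notation: the names `Paradigm`, `PARADIGMS`, and `place`, the field names, and the component roles are all hypothetical, and the entries are deliberately coarse one-line summaries of each paradigm’s standard objective and sampler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paradigm:
    """One row of the four-paradigm map (field names are illustrative)."""
    name: str
    training_objective: str
    sampling: str
    likelihood: str        # what kind of likelihood evaluation is available
    sampling_passes: str   # rough cost of drawing one sample


# Coarse summaries only; real systems add many refinements on top.
PARADIGMS = {
    "autoregressive": Paradigm(
        name="autoregressive",
        training_objective="maximize exact log-likelihood via the chain rule",
        sampling="sequential, one token or dimension at a time",
        likelihood="exact",
        sampling_passes="one per generated element",
    ),
    "latent-variable": Paradigm(
        name="latent-variable",
        training_objective="maximize the evidence lower bound (ELBO)",
        sampling="draw z from the prior, then decode once",
        likelihood="lower bound",
        sampling_passes="one",
    ),
    "adversarial": Paradigm(
        name="adversarial",
        training_objective="minimax game between generator and discriminator",
        sampling="draw noise, then run the generator once",
        likelihood="not available",
        sampling_passes="one",
    ),
    "score-based / diffusion": Paradigm(
        name="score-based / diffusion",
        training_objective="denoising score matching (noise-prediction loss)",
        sampling="iterative denoising starting from pure noise",
        likelihood="variational lower bound",
        sampling_passes="many (one per denoising step)",
    ),
}


def place(components):
    """Map each component role of a system to its paradigm record.

    `components` is a dict of role -> paradigm name; an unknown paradigm
    name raises KeyError, which is the signal to look harder at the system.
    """
    return {role: PARADIGMS[name] for role, name in components.items()}


# Latent diffusion as described in this lesson: a VAE compresses the data
# and a diffusion model generates in the latent space. Role names are
# hypothetical labels chosen for this sketch.
latent_diffusion = place({
    "compression": "latent-variable",
    "generation": "score-based / diffusion",
})

for role, p in latent_diffusion.items():
    print(f"{role}: {p.name} | trains by: {p.training_objective} "
          f"| samples by: {p.sampling}")
```

Keeping the table keyed by paradigm rather than by system mirrors the reading procedure: identify the training objective and sampling procedure first, and the trade-off columns follow from the paradigm.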

Time and difficulty

Read time: about 15 minutes
Practice time: about 16 minutes (a six-question self-check on the four paradigms and their trade-offs, a hands-on placement of five real model releases on the four-paradigm map, an identify-the-components exercise on three hybrid systems, and capstone flashcards)
Difficulty: standard (the capstone of a math-heavy track; no new derivations, but the synthesis depth assumes all prior lesson content is fresh). The reward for the synthesis is paradigm fluency, the capability the track exists to build.