References: What a generative model is, and the four-paradigm map
Source material
Source curricula (multi-source structural mirror; cited as further study):
PRIMARY (this lesson follows its framing most directly)
• Stanford CS236, "Deep Generative Models", Lecture 1: Introduction
  Instructor: Stefano Ermon
  Course URL: https://deepgenerativemodels.github.io/
  Syllabus: https://deepgenerativemodels.github.io/syllabus.html
  License: standard course-page link-out; cited as further study
SECONDARY (also contributed to this lesson's framing)
• Berkeley CS294-158, "Deep Unsupervised Learning" (Spring 2024), Lecture 1: Introduction
  Instructors: Pieter Abbeel, Wilson Yan, Kevin Frans, Philipp Wu
  Course URL: https://sites.google.com/view/berkeley-cs294-158-sp24/
  License: standard course-page link-out; cited as further study
Clawdemy's lessons are original prose that follows the pedagogical arc of these two courses, anchored on CS236's lecture order with CS294-158 framing pulled in where its slide deck and recording are stronger. We do not reproduce or transcribe the lectures; we cite them as the recommended companions. All rights to the original course materials remain with the respective instructors and institutions.
Watch this next
- Stanford CS236 (Stefano Ermon), course homepage. The primary source for this track. The course homepage links the syllabus, lecture videos, and the course notes; Lecture 1 (Introduction) covers the four-paradigm framing this lesson mirrors. The course notes at deepgenerativemodels.github.io/notes are a written companion that is especially useful when you want a paragraph-level treatment of a concept that flew past in the lecture.
- Berkeley CS294-158 Sp24 (Pieter Abbeel et al.), course homepage. The secondary source. Lecture 1’s intro slide deck and recording give a complementary placement of the same paradigms, and the rest of the lecture list (autoregressive in Lecture 2, flows in Lecture 3, latent variable models / VAEs in Lecture 4, GANs in Lecture 5, diffusion in Lecture 6) is one of the cleanest one-lecture-per-paradigm sequences available.
Going deeper
A short, durable list. Each link is a specific next step, not a generic pile.
- “What are Diffusion Models?” by Lilian Weng (OpenAI). A long, careful blog post that walks the full math of diffusion models from the noising process through the reverse-time SDE. Best read after lesson 12 of this track, but a quick skim now is a useful preview of the diffusion paradigm.
- Canonical paper per paradigm. Each paradigm has a single paper that almost every later work cites: PixelRNN by van den Oord et al. 2016 (autoregressive), Kingma and Welling 2013 (VAE), Goodfellow et al. 2014 (GAN), and Ho, Jain, and Abbeel 2020 (DDPM, modern diffusion). The CS236 syllabus and Lecture 1 slide deck list each with the arXiv link; reading the abstracts is a fast way to feel the original framing of each paradigm before the textbooks tidied them up.
Adjacent topics
Where this leads inside this track.
- Autoregressive models, factoring by the chain rule (lesson 2). This lesson named “predict the next piece, one at a time” as paradigm 1 and stopped there. Lesson 2 opens it up: the chain rule of probability, the next-token prediction objective, and the architecture moves that make modern language models tractable at long context.
- The four-paradigm landscape, and where modern systems sit (lesson 15). This lesson’s map is the spine of the whole track. Lesson 15 returns to it at the end, with the full math of every paradigm filled in, and places Stable Diffusion, modern image generators, and autoregressive LLMs on the map explicitly.