References: a WaveNet-style hierarchical model
Source material
Source curriculum (structural mirror, cited as further study):

- Andrej Karpathy, "Neural Networks: Zero to Hero", Lecture 6: "Building makemore Part 5: Building a WaveNet"
  Creator: Andrej Karpathy
  Video: https://www.youtube.com/watch?v=t3YJ5hKiMQ0
  Code repo (makemore): https://github.com/karpathy/makemore (MIT License)
  Series repo: https://github.com/karpathy/nn-zero-to-hero (MIT License)
  Series page: https://karpathy.ai/zero-to-hero.html
  License: makemore and the series code are MIT-licensed; the video is under the standard YouTube license.

This lesson covers Lecture 6, where Karpathy restructures the flat MLP into a hierarchical, WaveNet-style model and reorganizes the code into reusable layer modules. Clawdemy's lessons are original prose following the pedagogical arc of this series; we do not reproduce or transcribe the video or code. The receptive-field table and the brianna staging example here are ours. All rights to the original video and code remain with the creator.

Watch this next
- Building makemore Part 5: Building a WaveNet (Andrej Karpathy). The lecture this lesson mirrors. Karpathy reshapes the flat model into the tree, wrestles with getting the tensor shapes right at each level (a practical, instructive struggle), and rebuilds the network out of clean, reusable layer modules. Watching the receptive field grow level by level, and the code turn into a tidy stack of layers, makes both the architectural and the software lessons concrete.
Going deeper
- WaveNet: A Generative Model for Raw Audio (van den Oord et al., 2016) (arXiv). The original DeepMind paper. It introduced the dilated-causal-convolution hierarchy this lesson is based on, and produced the most natural synthetic speech of its time. Worth a skim to see the idea in its first, audio-focused form.
- makemore on GitHub (MIT License) and the Zero to Hero series. The WaveNet model is the last makemore stage; the next lecture leaves makemore behind and builds a GPT.
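The receptive-field growth behind the dilated-causal-convolution hierarchy can be checked with a little arithmetic. The sketch below is ours, not code from the paper or the lecture: it assumes a kernel size of 2 and dilations that double at each level (1, 2, 4), and the helper name `receptive_field` is hypothetical. Each causal layer with kernel size k and dilation d widens the field by (k - 1) * d positions.

```python
def receptive_field(kernel_size, dilations):
    """Input positions visible to one output of a stack of causal convolutions.

    Assumes stride 1 throughout; each layer adds (kernel_size - 1) * dilation.
    """
    field = 1  # with no layers, an output sees only its own position
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


# Assumed setup: kernel size 2, dilation doubling per level.
dilations = [1, 2, 4]
for depth in range(1, len(dilations) + 1):
    print(depth, receptive_field(2, dilations[:depth]))
# prints:
# 1 2
# 2 4
# 3 8
```

Under these assumptions, three levels are enough to cover a context of 8 positions, with the field doubling at each level rather than growing by one.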
Adjacent topics
Where this sits in the curriculum.

- The MLP language model (lesson 4). This lesson directly restructures that flat model. The embeddings, the tanh hidden layer, and the softmax output all carry over; what changes is how the context characters are combined: gradually, up a tree, instead of all at once. If the “flat fusion” critique felt fast, that lesson is the grounding.
- The transformer (next phase, and the AI Foundations track). The “stack identical refining layers” structure here is the same overall shape as a transformer, which the next phase builds from scratch. The AI Foundations track describes transformers from the user’s side; this track is about to build one. WaveNet’s fixed tree gives way there to attention, a more flexible way for each position to choose what to combine.
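The contrast between flat fusion and combining context "gradually up a tree" can be sketched without any learned weights. The snippet below is an illustration only: it assumes a context of 8 characters, uses placeholder letters rather than a real name, and the helper `fuse_pairs` is a hypothetical stand-in for a layer that merges neighbouring positions. In a real model each merge would be a learned transformation; here it is just a tuple, so the grouping is visible.

```python
def fuse_pairs(items):
    """One tree level: group consecutive items into pairs."""
    if len(items) % 2 != 0:
        raise ValueError("expected an even number of items")
    return [(items[i], items[i + 1]) for i in range(0, len(items), 2)]


context = list("abcdefgh")  # placeholder 8-character context (assumption)

# Flat model: every character is combined in a single step.
flat = [tuple(context)]

# Hierarchical model: neighbours are combined two at a time, level by level.
levels = [context]
while len(levels[-1]) > 1:
    levels.append(fuse_pairs(levels[-1]))

for depth, level in enumerate(levels):
    print(depth, len(level), level)
```

The number of groups per level is 8, 4, 2, 1. The flat version reaches a single group in one step, while the tree takes three levels to do it, each one only ever joining two neighbours.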