
References: a WaveNet-style hierarchical model

Source curriculum (structural mirror, cited as further study):
• Andrej Karpathy, "Neural Networks: Zero to Hero", Lecture 6:
"Building makemore Part 5: Building a WaveNet"
Creator: Andrej Karpathy
Video: https://www.youtube.com/watch?v=t3YJ5hKiMQ0
Code repo (makemore): https://github.com/karpathy/makemore (MIT License)
Series repo: https://github.com/karpathy/nn-zero-to-hero (MIT License)
Series page: https://karpathy.ai/zero-to-hero.html
License: makemore and the series code are MIT-licensed; the video is under the standard YouTube license.
This lesson covers Lecture 6, where Karpathy restructures the flat MLP into a
hierarchical, WaveNet-style model and reorganizes the code into reusable layer
modules. Clawdemy's lessons are original prose following the pedagogical arc of
this series; we do not reproduce or transcribe the video or code. The
receptive-field table and the brianna staging example here are ours. All rights
to the original video and code remain with the creator.
  • Building makemore Part 5: Building a WaveNet, by Andrej Karpathy. The lecture this lesson mirrors. Karpathy reshapes the flat model into a tree, wrestles with getting the tensor shapes right at each level (a practical, instructive struggle), and rebuilds the network out of clean, reusable layer modules. Watching the receptive field grow level by level, and the code turn into a tidy stack of layers, makes both the architectural and the software lessons concrete.
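
The level-by-level growth of the receptive field mentioned above is simple arithmetic, and a minimal sketch may help fix it in mind. The function name `receptive_fields`, the context length of 8, and the pairwise fusion (`fan_in=2`) are illustrative assumptions for this sketch, not code taken from the lecture or the repo.

```python
def receptive_fields(context_len, fan_in=2):
    """Return how many input characters one position sees at each level.

    Level 0 is the raw embedding (1 character). Each later level fuses
    `fan_in` neighbouring positions from the level below, so the count
    multiplies by `fan_in` per level. (Hypothetical helper, for illustration.)
    """
    fields = [1]
    while fields[-1] < context_len:
        fields.append(fields[-1] * fan_in)
    if fields[-1] != context_len:
        # The tree only closes cleanly when the context is a power of fan_in.
        raise ValueError("context_len must be a power of fan_in")
    return fields


# Assumed setting: 8 context characters, fused two at a time.
for level, seen in enumerate(receptive_fields(8)):
    print(f"level {level}: each position sees {seen} character(s)")
```

With these assumed values the tree needs three fusing levels before a single position covers the whole context.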

Where this sits in the curriculum.

  • The MLP language model (lesson 4). This lesson directly restructures that flat model. The embeddings, the tanh hidden layer, and the softmax output all carry over; what changes is how the context characters are combined: gradually, up a tree, instead of all at once. If the “flat fusion” critique felt rushed, that lesson provides the grounding.

  • The transformer (next phase, and the AI Foundations track). The “stack identical refining layers” structure here has the same overall shape as a transformer, which the next phase builds from scratch. The AI Foundations track describes transformers from the user’s side; this track is about to build one. There, WaveNet’s fixed tree gives way to attention, a more flexible way for each position to choose what to combine.
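
The contrast drawn in the lesson 4 item above, combining the context all at once versus gradually up a tree, shows up most clearly in the tensor shapes. The following is a rough sketch that only tracks shapes, with no real tensors involved. The helper names (`flatten_consecutive`, `linear`, `flat_shapes`, `tree_shapes`) and the sizes used (batch 4, context 8, embedding width 10, hidden width 64) are assumptions chosen for illustration.

```python
def flatten_consecutive(shape, n):
    """Shape after merging every n neighbouring positions into one.

    (batch, time, channels) -> (batch, time // n, channels * n)
    """
    batch, time, channels = shape
    if time % n != 0:
        raise ValueError("time must be divisible by n")
    return (batch, time // n, channels * n)


def linear(shape, out_features):
    """Shape after a linear layer acting on the last dimension only."""
    return shape[:-1] + (out_features,)


def flat_shapes(batch, context_len, emb_dim, hidden):
    """Flat MLP: every context position is fused in a single step."""
    shapes = [(batch, context_len, emb_dim)]
    shapes.append(flatten_consecutive(shapes[-1], context_len))
    shapes.append(linear(shapes[-1], hidden))
    return shapes


def tree_shapes(batch, context_len, emb_dim, hidden, fan_in=2):
    """Hierarchical model: fuse fan_in neighbours per level until one remains.

    Elementwise steps such as tanh are omitted because they leave
    the shape unchanged.
    """
    shapes = [(batch, context_len, emb_dim)]
    while shapes[-1][1] > 1:
        shapes.append(flatten_consecutive(shapes[-1], fan_in))
        shapes.append(linear(shapes[-1], hidden))
    return shapes


print("flat:", flat_shapes(4, 8, 10, 64))
print("tree:", tree_shapes(4, 8, 10, 64))
```

Under these assumptions the flat model sees all 80 input features in one linear layer, while the tree never hands a layer more than two neighbouring groups at a time.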