References: LLM foundations for production

Source material

Source curriculum (structural mirror, cited as further study):
• Full Stack Deep Learning, "LLM Bootcamp" (Spring 2023):
    LLM Foundations
  Instructors: Charles Frye, Sergey Karayev, and Josh Tobin
  Course page: https://fullstackdeeplearning.com/llm-bootcamp/
  Lecture videos: publicly available on the Full Stack Deep Learning
    YouTube channel
  License: bootcamp materials are published free to view but no explicit
    license (Creative Commons or otherwise) is published; lecture videos
    are on YouTube under standard terms.
  Required attribution: "Based on the structure of the Full Stack Deep
    Learning LLM Bootcamp (Spring 2023), by Charles Frye, Sergey Karayev,
    and Josh Tobin (fullstackdeeplearning.com/llm-bootcamp). This is an
    independent structural mirror in original prose; it reproduces no
    course materials, and Full Stack Deep Learning does not endorse it."
This lesson mirrors the structure of the corresponding bootcamp session (foundations for builders).
Clawdemy's lessons are original prose that follows the pedagogical arc of
the bootcamp. Because the source publishes no explicit license, we cite
it as a recommended companion and reproduce none of its materials.

Watch this next

Full Stack Deep Learning, LLM Bootcamp: LLM Foundations by Charles Frye, Sergey Karayev, and Josh Tobin. The session this lesson mirrors. The recorded version walks the same builder-level picture with the bootcamp’s own examples and goes a little further on architecture intuition.

Going deeper

A short, durable list. Each link is a specific next step, not a generic pile.

The Anthropic Claude API documentation, “Messages” reference. The reference for the API call shape this lesson assumes (system + messages + max_tokens), with the input/output token accounting that the cost section here references. The fastest way to verify the limits and pricing of whichever Claude model you call.
The Anthropic models and pricing page. The canonical reference for current models, their context lengths, and per-token pricing. The numbers in the lesson’s back-of-envelope assume something like this; substitute your provider’s page when you do the math for a real app.
Track 14, “Run a model in a few lines” (lesson 2) in this repository. The library-side companion: same generation idea, approached through Hugging Face pipeline() and the Auto classes (tokenizer, model, logits, softmax) rather than through a hosted API.

Adjacent topics

Where this connects inside the track and the wider curriculum.

Launch an LLM app in one hour (lesson 1). This lesson is the foundations under the minimum app you shipped. The five components carry forward; this lesson adds the constraints they live under.
Prompt engineering, “Learn to Spell” (lesson 3). The next lesson is the highest-leverage way to spend tokens better, the prompt-engineering toolkit, which is the first deliberate move against the constraints named here.
Track 14 lesson 6 (Tokenizers up close) and Track 15 lesson 2 (Counting the cost). The build-side companions: the tokenizer mechanics that explain why tokens are the unit of cost, and the FLOP/memory accounting that explains why latency is what it is at inference time.