References: Training your own LLM

Source material

Source curriculum (structural mirror, cited as further study):
• Full Stack Deep Learning, "LLM Bootcamp" (Spring 2023):
    How to train your own LLM
  Guest instructor: Reza Shabani (Replit), with Charles Frye, Sergey
    Karayev, and Josh Tobin
  Course page: https://fullstackdeeplearning.com/llm-bootcamp/
  Lecture videos: publicly available on the Full Stack Deep Learning
    YouTube channel
  License: the bootcamp materials are free to view, but no explicit
    license (Creative Commons or otherwise) is stated; the lecture videos
    are on YouTube under standard terms.
  Required attribution: "Based on the structure of the Full Stack Deep
    Learning LLM Bootcamp (Spring 2023), by Charles Frye, Sergey Karayev,
    and Josh Tobin (fullstackdeeplearning.com/llm-bootcamp). This is an
    independent structural mirror in original prose; it reproduces no
    course materials, and Full Stack Deep Learning does not endorse it."
This lesson mirrors the structure of the corresponding bootcamp session
(training your own LLM as a production decision). Clawdemy's lessons are
original prose taught at a strictly technical-primer level; training-data
policy, alignment debates, and similar contested topics are out of scope
here.

Watch this next

Full Stack Deep Learning, LLM Bootcamp: How to train your own LLM by Reza Shabani (Replit). The session this lesson mirrors. The recorded version brings the production-team perspective with the bootcamp’s own examples.

Going deeper

A short, durable list. Each link is a specific next step, not a generic pile.

The TRL library documentation. The reference for SFTTrainer and DPOTrainer; the canonical Python implementations of the fine-tuning recipes named in this lesson.
Axolotl. A config-driven fine-tuning tool built on the Hugging Face stack (Transformers, PEFT, TRL) and widely used for LoRA fine-tunes. Worth reading the README and a few example configs before your first fine-tune; it saves significant boilerplate.
The Hugging Face PEFT library. Where LoRA and the other parameter-efficient methods live, with practical LoraConfig examples. The bridge between “I want to fine-tune” and “I have a LoRA-trained adapter to deploy.”

Adjacent topics

Where this connects inside the track and the wider curriculum.

What’s next (lesson 8). This lesson is the deep dive on the “fine-tune an open model” point on the build-vs-buy spectrum named there.
Augmented language models (lesson 4). Covers the retrieval/tool-use boundary behind the second of the three fine-tuning criteria (“retrieval/tools don’t fix it”).
LLMOps (lesson 7). Held-out evaluation, A/B testing, and a regression suite are exactly what make adopting a fine-tune safe rather than silently risky.
Track 14 lesson 10 (Fine-tuning LLMs: supervised and instruction tuning). The using-side companion for the SFT mechanics this lesson assumes.
Track 15 lesson 13 (Post-training: SFT and RLHF). The from-scratch / lab-POV companion for the same pipeline.
Track 15 lesson 12 (Data: filtering, dedup, mixing, synthetic). The data-engineering companion; the synthetic-data and filtering discipline this lesson recommends is built there.