References: Training your own LLM
Source material
Source curriculum (structural mirror, cited as further study):

- Full Stack Deep Learning, "LLM Bootcamp" (Spring 2023): How to train your own LLM
  - Guest instructor: Reza Shabani (Replit), with Charles Frye, Sergey Karayev, and Josh Tobin
  - Course page: https://fullstackdeeplearning.com/llm-bootcamp/
  - Lecture videos: publicly available on the Full Stack Deep Learning YouTube channel
  - License: bootcamp materials are published free to view, but no explicit license (Creative Commons or otherwise) is published; lecture videos are on YouTube under standard terms.
  - Required attribution: "Based on the structure of the Full Stack Deep Learning LLM Bootcamp (Spring 2023), by Charles Frye, Sergey Karayev, and Josh Tobin (fullstackdeeplearning.com/llm-bootcamp). This is an independent structural mirror in original prose; it reproduces no course materials, and Full Stack Deep Learning does not endorse it."

This lesson mirrors the structure of the corresponding bootcamp session (training your own LLM as a production decision). Clawdemy's lessons are original prose taught at a strictly technical-primer level; training-data policy, alignment debates, and similar contested topics are out of scope here.

Watch this next
- Full Stack Deep Learning, LLM Bootcamp: How to train your own LLM by Reza Shabani (Replit). The session this lesson mirrors. The recorded version brings the production-team perspective with the bootcamp’s own examples.
Going deeper
A short, durable list. Each link is a specific next step, not a generic pile.
- The TRL library documentation. The reference for `SFTTrainer` and `DPOTrainer`, the canonical Python implementations of the fine-tuning recipes named in this lesson.
- Axolotl. The config-driven wrapper over TRL that many production LoRA fine-tunes use. Worth reading the README and a few example configs before your first fine-tune; it saves significant boilerplate.
- The Hugging Face PEFT library. Where LoRA and the other parameter-efficient methods live, with practical `LoraConfig` examples. The bridge between “I want to fine-tune” and “I have a LoRA-trained adapter to deploy.”
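To make "parameter-efficient" concrete before opening the PEFT docs, here is a minimal arithmetic sketch of how many trainable parameters LoRA adds to a single weight matrix compared with fine-tuning that matrix in full. The function names, the 4096 x 4096 layer size, and the rank of 8 are illustrative assumptions, not values from the lesson or from any library; in PEFT the rank corresponds to the `r` argument of `LoraConfig`.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters updated when one d_out x d_in weight matrix is fully fine-tuned."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters LoRA adds for the same matrix.

    LoRA keeps the original weight W frozen and learns a low-rank update
    B @ A, where B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


# Hypothetical sizes chosen only for illustration.
d_out, d_in, rank = 4096, 4096, 8

full = full_param_count(d_out, d_in)
lora = lora_param_count(d_out, d_in, rank)

print(f"full fine-tune: {full:,} parameters")
print(f"LoRA (rank {rank}): {lora:,} parameters")
print(f"LoRA / full: {lora / full:.4%}")
```

Under these assumed sizes the adapter is a small fraction of the full matrix, which is why a LoRA-trained adapter is cheap to store and ship separately from the base model. Real totals depend on which layers are adapted and on the rank you choose.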
Adjacent topics
Where this connects inside the track and the wider curriculum.
- What’s next (lesson 8). This lesson is the deep dive on the “fine-tune an open model” point on the build-vs-buy spectrum named there.
- Augmented language models (lesson 4). The retrieval/tool-use boundary, the second of the three fine-tuning criteria (“retrieval/tools don’t fix it”).
- LLMOps (lesson 7). The held-out evaluation discipline + A/B testing + regression suite are exactly what makes a fine-tune adoption safe rather than silently risky.
- Track 14 lesson 10 (Fine-tuning LLMs: supervised and instruction tuning). The using-side companion for the SFT mechanics this lesson assumes.
- Track 15 lesson 13 (Post-training: SFT and RLHF). The from-scratch / lab-POV companion for the same pipeline.
- Track 15 lesson 12 (Data: filtering, dedup, mixing, synthetic). The data-engineering companion; the synthetic-data + filtering discipline this lesson recommends is built there.