References: Fine-tuning LLMs
Source material
Section titled “Source material”Source curriculum (structural mirror, cited as further study):• Hugging Face, "LLM Course", Chapter 11: "Supervised Fine-Tuning" Authors: the Hugging Face team (Lewis Tunstall, Leandro von Werra, Lysandre Debut, Sylvain Gugger, Merve Noyan, and others) Course page: https://huggingface.co/learn/llm-course/chapter11 Code and notebooks: https://github.com/huggingface/course License: Apache 2.0 (prose and code) Required attribution: "Based on the Hugging Face LLM Course (huggingface.co/learn/llm-course), © Hugging Face, used under the Apache 2.0 license. This is an independent structural mirror; Hugging Face does not endorse it."This lesson mirrors the structure of the course's Supervised Fine-Tuningchapter (chat templates, SFT with the TRL SFTTrainer, and LoRA). Clawdemy'slessons are original prose that follows the pedagogical arc of the course.We do not reproduce or transcribe the course; we cite it as the recommendedcompanion. Course materials are used under the Apache 2.0 license with theattribution above, which requires a link to the license and an indicationof changes, and does not permit implying endorsement.Read this next
Section titled “Read this next”- Hugging Face LLM Course, Supervised Fine-Tuning chapter. The chapter this lesson mirrors. It walks the full SFT run with TRL, the chat-templates section, and the LoRA section in detail, the place to go when you have a concrete model and dataset to fine-tune.
Going deeper
Section titled “Going deeper”A short, durable list. Each link is a specific next step, not a generic pile.
-
The TRL library documentation. The library behind
SFTTrainer. Its SFT guide covers data formats, packing, and configuration in full, the canonical reference once you start a real run. -
The PEFT library documentation. Where LoRA and the other parameter-efficient methods live, with
LoraConfigexamples. Read it to understand the memory savings concretely before fine-tuning a large model. -
Chat templating in Transformers. The docs on
apply_chat_templateand how role-tagged messages become the exact string a model expects. The reference for getting the format right.
Adjacent topics
Section titled “Adjacent topics”Where this connects inside the track.
-
Fine-tune a pretrained model (lesson 3). The contrast lesson: that was task fine-tuning (a classifier head); this is supervised fine-tuning (the generative model learning to follow instructions). The
Trainerloop carries over toSFTTrainer. -
Run a model in a few lines (lesson 2). Before SFT, the first move is always to try prompting an existing instruction-tuned model, which is exactly the
pipelineusage from lesson 2. -
Curating high-quality datasets (lesson 11). SFT is only as good as its data; the next lesson is about building and curating the high-quality instruction data that makes fine-tuning work.