Skip to content

References: Fine-tuning LLMs

Source curriculum (structural mirror, cited as further study):
• Hugging Face, "LLM Course", Chapter 11: "Supervised Fine-Tuning"
Authors: the Hugging Face team (Lewis Tunstall, Leandro von Werra,
Lysandre Debut, Sylvain Gugger, Merve Noyan, and others)
Course page: https://huggingface.co/learn/llm-course/chapter11
Code and notebooks: https://github.com/huggingface/course
License: Apache 2.0 (prose and code)
Required attribution: "Based on the Hugging Face LLM Course
(huggingface.co/learn/llm-course), © Hugging Face, used under the
Apache 2.0 license. This is an independent structural mirror;
Hugging Face does not endorse it."
This lesson mirrors the structure of the course's Supervised Fine-Tuning
chapter (chat templates, SFT with the TRL SFTTrainer, and LoRA). Clawdemy's
lessons are original prose that follows the pedagogical arc of the course.
We do not reproduce or transcribe the course; we cite it as the recommended
companion. Course materials are used under the Apache 2.0 license with the
attribution above, which requires a link to the license and an indication
of changes, and does not permit implying endorsement.

A short, durable list. Each link is a specific next step, not a generic pile.

  • The TRL library documentation. The library behind SFTTrainer. Its SFT guide covers data formats, packing, and configuration in full, the canonical reference once you start a real run.

  • The PEFT library documentation. Where LoRA and the other parameter-efficient methods live, with LoraConfig examples. Read it to understand the memory savings concretely before fine-tuning a large model.

  • Chat templating in Transformers. The docs on apply_chat_template and how role-tagged messages become the exact string a model expects. The reference for getting the format right.

Where this connects inside the track.

  • Fine-tune a pretrained model (lesson 3). The contrast lesson: that was task fine-tuning (a classifier head); this is supervised fine-tuning (the generative model learning to follow instructions). The Trainer loop carries over to SFTTrainer.

  • Run a model in a few lines (lesson 2). Before SFT, the first move is always to try prompting an existing instruction-tuned model, which is exactly the pipeline usage from lesson 2.

  • Curating high-quality datasets (lesson 11). SFT is only as good as its data; the next lesson is about building and curating the high-quality instruction data that makes fine-tuning work.