References: Fine-tuning LLMs

Source material

Source curriculum (structural mirror, cited as further study):
• Hugging Face, "LLM Course", Chapter 11: "Supervised Fine-Tuning"
  Authors: the Hugging Face team (Lewis Tunstall, Leandro von Werra,
    Lysandre Debut, Sylvain Gugger, Merve Noyan, and others)
  Course page: https://huggingface.co/learn/llm-course/chapter11
  Code and notebooks: https://github.com/huggingface/course
  License: Apache 2.0 (prose and code)
  Required attribution: "Based on the Hugging Face LLM Course
    (huggingface.co/learn/llm-course), © Hugging Face, used under the
    Apache 2.0 license. This is an independent structural mirror;
    Hugging Face does not endorse it."
This lesson mirrors the structure of the course's Supervised Fine-Tuning
chapter (chat templates, SFT with the TRL SFTTrainer, and LoRA). Clawdemy's
lessons are original prose that follows the pedagogical arc of the course.
We do not reproduce or transcribe the course; we cite it as the recommended
companion. Course materials are used under the Apache 2.0 license with the
attribution above, which requires a link to the license and an indication
of changes, and does not permit implying endorsement.

Going deeper

A short, durable list. Each link is a specific next step, not a generic pile.

The TRL library documentation. The library behind SFTTrainer. Its SFT guide covers data formats, packing, and configuration in full, the canonical reference once you start a real run.
The PEFT library documentation. Where LoRA and the other parameter-efficient methods live, with LoraConfig examples. Read it to understand the memory savings concretely before fine-tuning a large model.
Chat templating in Transformers. The docs on apply_chat_template and how role-tagged messages become the exact string a model expects. The reference for getting the format right.

Adjacent topics

Where this connects inside the track.

Fine-tune a pretrained model (lesson 3). The contrast lesson: that was task fine-tuning (a classifier head); this is supervised fine-tuning (the generative model learning to follow instructions). The Trainer loop carries over to SFTTrainer.
Run a model in a few lines (lesson 2). Before SFT, the first move is always to try prompting an existing instruction-tuned model, which is exactly the pipeline usage from lesson 2.
Curating high-quality datasets (lesson 11). SFT is only as good as its data; the next lesson is about building and curating the high-quality instruction data that makes fine-tuning work.

References: Fine-tuning LLMs

Source material

Read this next

Going deeper

Adjacent topics