References: Debug your training and get unstuck

Source material

Source curriculum (structural mirror, cited as further study):
• Hugging Face, "LLM Course", Chapter 8: "How to ask for help"
  Authors: the Hugging Face team (Lewis Tunstall, Leandro von Werra,
    Lysandre Debut, Sylvain Gugger, Merve Noyan, and others)
  Course page: https://huggingface.co/learn/llm-course/chapter8
  Code and notebooks: https://github.com/huggingface/course
  License: Apache 2.0 (prose and code)
  Required attribution: "Based on the Hugging Face LLM Course
    (huggingface.co/learn/llm-course), © Hugging Face, used under the
    Apache 2.0 license. This is an independent structural mirror;
    Hugging Face does not endorse it."
This lesson mirrors the structure of Chapter 8 (what to do when you get an
error, debugging the training pipeline, asking on the forums, and writing a
good issue). Clawdemy's lessons are original prose that follows the
pedagogical arc of the course. We do not reproduce or transcribe the
course; we cite it as the recommended companion. Course materials are used
under the Apache 2.0 license with the attribution above. That license
requires a link to the license and an indication of changes, and it does not
permit implying endorsement.

Read this next

Hugging Face LLM Course, Chapter 8: How to ask for help. The chapter this lesson mirrors. It walks through two debugging stories in full (a pipeline that will not load and a forward pass that errors), then has dedicated sections on debugging the training pipeline, asking good forum questions, and writing a good issue. It is worth reading in full because the worked examples make the method stick.

Going deeper

A short, durable list. Each link is a specific next step, not a generic pile.

The Hugging Face forums. Where to ask “how do I” and “why does this happen” questions. Reading a few well-answered threads is the fastest way to learn what a good question looks like.
How to create a Minimal, Reproducible Example (Stack Overflow). The canonical guide to the single most valuable debugging-and-asking skill. It is short and applies to every language and library.
The transformers GitHub issues. Where real bugs get reported and fixed. Skimming open and closed issues shows both how to write one and whether your problem is already known.

Adjacent topics

Where this connects inside the track.

Fine-tune a pretrained model (lesson 3). The training pipeline you debug here is the Trainer loop from that lesson; the common failure points map onto its data, collator, and model steps.
The main NLP tasks (lesson 7). Token-level tasks often fail at the alignment step, which is exactly the kind of shape and label error this lesson teaches you to diagnose.
Build and share a demo (lesson 9). Phase 3 opens by shipping a model in a Gradio app; the debugging mindset here carries straight into getting a demo to actually run.