The main NLP tasks: cheatsheet

The shared loop (every task)

Only the head, the label shape, and the metric change between tasks.

Task	Head (Auto class)	Metric	Architecture
Sequence classification	`AutoModelForSequenceClassification`	accuracy, F1	encoder
Token classification (NER)	`AutoModelForTokenClassification`	seqeval (entity F1)	encoder
Extractive QA	`AutoModelForQuestionAnswering`	SQuAD EM, F1	encoder
Masked LM	`AutoModelForMaskedLM`	perplexity	encoder
Causal LM	`AutoModelForCausalLM`	perplexity	decoder
Summarization	`AutoModelForSeq2SeqLM`	ROUGE	encoder-decoder
Translation	`AutoModelForSeq2SeqLM`	BLEU / SacreBLEU	encoder-decoder
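To make the Metric column more concrete, here is a minimal sketch of the two extractive QA scores, exact match and token-level F1. It follows the usual SQuAD-style normalization (lowercase, strip punctuation, drop English articles, collapse whitespace), but it is a simplified illustration rather than the official evaluation script, and the function names are hypothetical.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation, drop articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower.", "eiffel tower"))
print(token_f1("in Paris France", "Paris"))
```

Exact match rewards only a perfect span, while F1 gives partial credit for overlapping tokens, which is why the two are usually reported together.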

Collator	Used for
`DataCollatorWithPadding`	Sequence classification (dynamic padding)
`DataCollatorForTokenClassification`	Token classification (pads labels too)
`DataCollatorForLanguageModeling`	Masked LM (and causal LM with `mlm=False`)
`DataCollatorForSeq2Seq`	Summarization, translation

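Token classification and QA operate on token positions, but labels and answers are defined at the word or character level. A fast tokenizer bridges the gap: `word_ids()` maps each token to its source word, and `return_offsets_mapping=True` returns each token's character span.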
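For token classification, the word-to-token step can be sketched as below. This is an illustration under stated assumptions: the `word_ids` list is hand-written to mimic what a fast tokenizer's `word_ids()` returns, the label ids are made up, and the function name is hypothetical. It uses one common convention, labeling only the first sub-token of each word and setting everything else to -100 so the loss skips it. Another convention gives continuation sub-tokens the matching inside-entity label instead.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # value that PyTorch's cross-entropy loss ignores by default


def align_labels_with_tokens(
    word_ids: List[Optional[int]], word_labels: List[int]
) -> List[int]:
    """Expand one label per word into one label per token."""
    token_labels = []
    previous_word = None
    for word_id in word_ids:
        if word_id is None:
            # Special token such as [CLS] or [SEP]: no word behind it.
            token_labels.append(IGNORE_INDEX)
        elif word_id != previous_word:
            # First sub-token of a word carries the word's label.
            token_labels.append(word_labels[word_id])
        else:
            # Later sub-tokens of the same word are skipped by the loss.
            token_labels.append(IGNORE_INDEX)
        previous_word = word_id
    return token_labels


# Hand-written stand-in for: [CLS] Hug ##ging Face rocks [SEP]
word_ids = [None, 0, 0, 1, 2, None]
word_labels = [3, 4, 0]  # hypothetical ids, e.g. B-ORG, I-ORG, O
print(align_labels_with_tokens(word_ids, word_labels))
# prints [-100, 3, -100, 4, 0, -100]
```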

This is why fast tokenizers matter (lesson 6).
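For extractive QA, the character-to-token step can be sketched in a similar way. The offsets below are hand-written to resemble a fast tokenizer's `offset_mapping` for the context only. A real QA encoding also contains the question tokens, which would have to be filtered out first (for example with `sequence_ids()`). The function name is hypothetical.

```python
from typing import List, Tuple


def char_span_to_token_span(
    offsets: List[Tuple[int, int]], answer_start: int, answer_end: int
) -> Tuple[int, int]:
    """Return (first_token, last_token) covering chars [answer_start, answer_end)."""
    start_token = end_token = None
    for index, (tok_start, tok_end) in enumerate(offsets):
        if tok_start == tok_end:
            continue  # special tokens are reported with an empty (0, 0) span
        if start_token is None and tok_end > answer_start:
            start_token = index  # first token that reaches into the answer
        if tok_start < answer_end:
            end_token = index  # last token that begins before the answer ends
    if start_token is None or end_token is None or end_token < start_token:
        raise ValueError("answer is not inside the tokenized context")
    return start_token, end_token


context = "Paris is the capital of France."
# Assumed offsets: [CLS] Paris is the capital of France . [SEP]
offsets = [(0, 0), (0, 5), (6, 8), (9, 12), (13, 20), (21, 23), (24, 30), (30, 31), (0, 0)]
answer_start = context.index("France")
print(char_span_to_token_span(offsets, answer_start, answer_start + len("France")))
# prints (6, 6)
```

The resulting token indices are what a QA head is trained to predict as start and end positions.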

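Summarization and translation need tokenized targets as `labels` alongside the inputs, and `DataCollatorForSeq2Seq` to pad both per batch.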
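One thing sequence-to-sequence batches need is padding on two sides: inputs are padded with the pad token, while labels are padded with -100 so that padded target positions do not contribute to the loss. The sketch below imitates that behavior in plain Python. It is an illustration of the idea, not the `DataCollatorForSeq2Seq` implementation, and the function name and token ids are hypothetical.

```python
from typing import Dict, List

IGNORE_INDEX = -100  # label padding value that the loss skips


def pad_seq2seq_batch(
    features: List[Dict[str, List[int]]], pad_token_id: int
) -> Dict[str, List[List[int]]]:
    """Right-pad inputs and labels to the longest example in the batch."""
    max_input = max(len(f["input_ids"]) for f in features)
    max_label = max(len(f["labels"]) for f in features)
    batch: Dict[str, List[List[int]]] = {
        "input_ids": [], "attention_mask": [], "labels": []
    }
    for f in features:
        n_in, n_lab = len(f["input_ids"]), len(f["labels"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * (max_input - n_in))
        batch["attention_mask"].append([1] * n_in + [0] * (max_input - n_in))
        # Labels use IGNORE_INDEX rather than the pad token id.
        batch["labels"].append(f["labels"] + [IGNORE_INDEX] * (max_label - n_lab))
    return batch


features = [
    {"input_ids": [5, 6, 7, 1], "labels": [9, 1]},
    {"input_ids": [8, 1], "labels": [4, 3, 2, 1]},
]
batch = pad_seq2seq_batch(features, pad_token_id=0)
print(batch["input_ids"])
print(batch["labels"])
```

Padding to the longest example in each batch, rather than to a global maximum, is the same dynamic-padding idea used for classification.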

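For masked and causal LM, the text is its own supervision; `DataCollatorForLanguageModeling` builds the targets. With `mlm=True` it masks random tokens and labels only those positions; with `mlm=False` it copies `input_ids` into `labels`, ignoring padding.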
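The two ways of deriving targets from the text itself can be sketched in plain Python. These are simplified stand-ins, not the library code. In particular, the masked-LM sketch always substitutes the mask token, whereas the real collator leaves some selected tokens unchanged or swaps in random ones. The function names and token ids are hypothetical.

```python
import random
from typing import List, Optional, Set, Tuple

IGNORE_INDEX = -100  # positions with this label do not contribute to the loss


def causal_lm_targets(input_ids: List[int], pad_token_id: int) -> List[int]:
    """Causal LM: labels are a copy of the inputs, with padding ignored.

    The one-position shift between inputs and targets is assumed to
    happen inside the model, so it is not applied here.
    """
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]


def masked_lm_targets(
    input_ids: List[int],
    mask_token_id: int,
    special_ids: Set[int],
    mlm_probability: float = 0.15,
    rng: Optional[random.Random] = None,
) -> Tuple[List[int], List[int]]:
    """Masked LM: hide random tokens and label only the hidden positions."""
    rng = rng or random.Random()
    masked_inputs, labels = [], []
    for tok in input_ids:
        if tok not in special_ids and rng.random() < mlm_probability:
            masked_inputs.append(mask_token_id)  # model must recover `tok`
            labels.append(tok)
        else:
            masked_inputs.append(tok)
            labels.append(IGNORE_INDEX)
    return masked_inputs, labels


ids = [101, 7, 8, 9, 10, 102, 0, 0]  # hypothetical ids; 0 is padding
print(causal_lm_targets(ids, pad_token_id=0))
masked, mlm_labels = masked_lm_targets(
    ids, mask_token_id=103, special_ids={101, 102, 0},
    mlm_probability=0.5, rng=random.Random(0),
)
print(masked, mlm_labels)
```

Because the mask positions are sampled again each time a batch is built, the same sentence yields different training examples across epochs.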

Token classification: one label per token (NER, POS tagging).
Extractive QA: returns a span of the source context, unlike generative QA, which writes a new answer (and can hallucinate).
Perplexity: the exponential of the average per-token cross-entropy loss; lower is better.
seqeval / ROUGE / BLEU / SQuAD: standard metrics for NER / summarization / translation / QA.
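As a worked example for the perplexity entry, the sketch below turns per-token losses (natural-log cross-entropy) into a perplexity. The loss values are invented for illustration, and the function name is hypothetical.

```python
import math
from typing import Sequence


def perplexity(token_losses: Sequence[float]) -> float:
    """exp(mean per-token cross-entropy), with losses in nats."""
    return math.exp(sum(token_losses) / len(token_losses))


# A model that spreads probability evenly over 4 candidate tokens has
# loss ln(4) at every position, so its perplexity is about 4.
uniform_loss = math.log(4)
print(perplexity([uniform_loss] * 10))

# Lower average loss gives lower perplexity.
print(perplexity([0.5, 0.7, 0.6]))
```

Read this way, a perplexity of about 4 means the model is roughly as uncertain as a uniform choice among four tokens at each step.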

Hugging Face LLM Course, Chapter 7: “Main NLP tasks.” huggingface.co/learn/llm-course/chapter7. Released under Apache 2.0; this lesson mirrors its structure with original prose.