Summary: Run a model in a few lines

Running a transformer takes about two lines: name a task, call pipeline(task), and you get labels and scores back. That one call hides three steps, and the rest of the lesson opens them up using the Auto classes. Step one is the tokenizer (loaded with AutoTokenizer.from_pretrained), which turns text into input_ids and an attention_mask. Step two is the model: AutoModel gives raw hidden states, while AutoModelFor<Task> adds a head that produces task output as logits. Step three is postprocessing: a softmax turns logits into probabilities, and model.config.id2label attaches the labels. Loading always follows one idiom, SomeAutoClass.from_pretrained(checkpoint), so you change models by changing a string. This summary is the skimmable version; the full lesson runs every line.
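
As a reminder of what the fast path looks like, here is a minimal sketch. It assumes the transformers package and a backend such as PyTorch are installed, and it uses "sentiment-analysis" as the example task; the helper names run_sentiment and format_predictions are illustrative and do not come from the library.

```python
def run_sentiment(texts):
    """Classify texts with the default sentiment pipeline.

    The import is kept inside the function so this file loads even
    where transformers is not installed. The first call downloads
    whatever default checkpoint the library picks for the task.
    """
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")  # the two-line fast path
    return classifier(texts)  # one {"label": ..., "score": ...} dict per text


def format_predictions(texts, predictions):
    """Render pipeline-style results as readable lines (illustrative helper)."""
    return [
        f"{pred['label']} ({pred['score']:.3f}): {text}"
        for text, pred in zip(texts, predictions)
    ]
```

Calling format_predictions(texts, run_sentiment(texts)) on a short list of strings prints one line per input. Which model gets downloaded depends on the library's default for the task, so pin a checkpoint when results need to be reproducible.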

  • pipeline(task) is the fast path. It picks a default model, downloads it, and handles pre- and postprocessing. Reach for it first; it is the right tool for exploration and for a lot of production work.
  • A pipeline is three steps. Preprocessing (tokenizer), the model forward pass, and postprocessing. Knowing them is what lets you customize.
  • The tokenizer turns text into numbers. AutoTokenizer.from_pretrained(checkpoint) loads the tokenizer; calling it on text returns input_ids (integer token IDs) and an attention_mask (which marks real tokens versus padding). The padding, truncation, and return_tensors="pt" arguments shape the batch.
  • AutoModel gives hidden states; AutoModelFor<Task> adds a head. The base transformer outputs a vector per token; the head projects those down to task output. The class name (...ForSequenceClassification, ...ForTokenClassification, ...ForCausalLM) names the task.
  • Models output logits, not probabilities. Apply a softmax to normalize, and read model.config.id2label to find which column is which label.
  • One idiom loads everything. SomeAutoClass.from_pretrained(checkpoint). Change the checkpoint to change the model; change the head class to change the task. That uniformity is why the ecosystem scales across hundreds of architectures.
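
The three steps in the list above can be sketched by hand roughly as follows. This is an illustration under stated assumptions, not the pipeline's actual internals: it assumes PyTorch and transformers are installed, the function names softmax, postprocess, and classify are made up for this example, and the softmax is written in plain Python so that step three can be read (and tested) without downloading a model.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits, id2label):
    """Step three: logits -> probabilities -> one (label, score) per input."""
    results = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append({"label": id2label[best], "score": probs[best]})
    return results


def classify(texts, checkpoint):
    """Run tokenizer, model, and postprocessing with the Auto classes.

    Downloads the checkpoint on first use, so it is not called here.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Step one: text -> input_ids and attention_mask
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

    # Step two: forward pass through the base model plus classification head
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**batch).logits

    # Step three: softmax and label lookup
    return postprocess(logits.tolist(), model.config.id2label)
```

A call might look like classify(["I love this."], "distilbert-base-uncased-finetuned-sst-2-english"); that checkpoint name is only an example, and any sequence-classification checkpoint should work the same way. Swapping the string changes the model, and swapping AutoModelForSequenceClassification for another head class changes the task (the postprocessing would then need to change too).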

You now have both gears. For most jobs you will stay in top gear with pipeline(), and you should: it is less code and less to get wrong. But the moment a task needs something the one-liner does not give you (a specific model, raw scores for a custom threshold, your own batching, a head the pipeline does not expose) you can drop into the Auto classes and run the three steps yourself, because they are no longer a black box. Every later lesson in this track lives at that lower level: fine-tuning adjusts the model in step two, the tokenizer lessons open up step one, and the task lessons swap heads. The pipeline was the elevator; you now know where the stairs are, which is what makes the rest of the building reachable.
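
One of the cases mentioned above, raw scores for a custom threshold, is easy to picture once the probabilities are in hand. The sketch below is a hypothetical helper, not a library function: the name apply_threshold, the 0.9 default, and the "UNSURE" fallback label are all assumptions chosen for illustration.

```python
def apply_threshold(probs, id2label, threshold=0.9, fallback="UNSURE"):
    """Return the top label only when its probability clears the threshold.

    probs is one row of softmax output; id2label maps column index to
    label name, in the same shape as model.config.id2label.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] >= threshold:
        return id2label[best]
    return fallback  # confidence too low: defer instead of guessing
```

With labels {0: "NEGATIVE", 1: "POSITIVE"}, a row of [0.05, 0.95] yields "POSITIVE" while [0.4, 0.6] yields "UNSURE" at the default threshold. The right cutoff depends on the application and is best chosen against held-out data.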

The pipeline is two lines because the library hid three steps inside it. Learn the three steps and you stop being limited to what the two lines allow.