How few-shot examples teach in context
What you’ll learn
This is lesson 3 of Phase 5, How we steer models at inference, in Track 5 (AI Foundations). The previous lessons covered decoding strategies (how a token is picked from the model’s distribution) and prompting (how the input shapes the output). This lesson covers a specific prompting move that turned out to be more powerful than anyone anticipated: in-context learning. The model’s weights are frozen at inference time, so you cannot retrain it from inside a prompt. But you can put examples in the prompt, and the model will read those examples, infer the shape of the task, and apply the same pattern to your real query. No weights change, and the effect lasts only for that one inference call. Researchers building the first large language models did not design this capability; it emerged with scale, and once it was noticed, it became one of the main reasons LLMs felt useful for new tasks.
This lesson covers what zero-shot, one-shot, and few-shot mean, why in-context learning works, when examples help and when explicit instructions work better, and a practical recipe for getting consistent results from few-shot prompts. Course materials are at cme295.stanford.edu.
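To make "putting examples in the prompt" concrete, here is a minimal sketch of how such a prompt might be assembled as plain text. The sentiment task, the `Review:`/`Sentiment:` field names, and the helper names `build_prompt` and `shot_name` are all illustrative assumptions, not part of the course materials. No model is called; the point is only what the model would be given to read.

```python
from typing import List, Tuple

# Hypothetical helper: the field names "Review" and "Sentiment" are assumptions
# chosen for illustration, not a required format.
def build_prompt(instruction: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble an instruction, k worked examples, and the real query into one prompt."""
    parts = [instruction]
    for text, label in examples:
        parts.append(f"Review: {text}\nSentiment: {label}")
    # The last block leaves the label empty so the model continues the pattern.
    parts.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(parts)

def shot_name(examples: List[Tuple[str, str]]) -> str:
    """Name the prompting style by the number of examples included."""
    return {0: "zero-shot", 1: "one-shot"}.get(len(examples), "few-shot")

INSTRUCTION = "Classify the sentiment of each review as positive or negative."

# Made-up example data, for illustration only.
EXAMPLES = [
    ("The battery lasts all day.", "positive"),
    ("It broke after a week.", "negative"),
    ("Setup took two minutes and it just works.", "positive"),
]

QUERY = "The screen scratches far too easily."

for k in (0, 1, 3):
    subset = EXAMPLES[:k]
    print(f"--- {shot_name(subset)} ({k} examples) ---")
    print(build_prompt(INSTRUCTION, subset, QUERY))
    print()
```

The only difference between the three prompts is how many worked examples precede the query. Nothing about the model changes between them, which is why the effect ends when the call does.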
Where this fits
This is lesson 3 of Phase 5, How we steer models at inference. The previous lessons in this phase (How text is generated and How prompting works) covered how a model picks tokens and how the input shapes the output. This lesson goes deeper on one specific input technique: putting examples directly in the prompt. The next lesson, How chain of thought makes models think out loud, builds on in-context learning by showing what changes when you also include the reasoning steps in the prompt. After that, Phase 5 is complete and Phase 6 picks up.
Before you start
Prerequisites: the prompting lesson is required. We assume you understand that the prompt is the model’s context, and that small changes to the prompt can substantially change the output. The decoding lesson is useful but not strictly required.
By the end, you’ll be able to
- Define in-context learning and explain why “learning” is an overloaded term in this context
- Distinguish zero-shot, one-shot, and few-shot prompting
- Predict when few-shot will help versus when an explicit instruction will work better
- Build a few-shot prompt that reliably steers a classification or transformation task
- Recognize the limits of in-context learning (it cannot create knowledge the model lacks, and it cannot rescue out-of-distribution tasks)
Time and difficulty
- Read time: about 12 minutes
- Practice time: about 12 minutes (a self-check on the zero/one/few distinction, a hands-on few-shot prompt construction exercise on a classification task, and flashcards for retrieval)
- Difficulty: standard