References: How few-shot examples teach in context
Source material
• Stanford CME 295: Transformers & Large Language Models, Autumn 2025
  Instructors: Afshine Amidi & Shervine Amidi, Stanford University
  Course site: https://cme295.stanford.edu/
  Cheatsheet: https://cme295.stanford.edu/cheatsheet/
  Source lecture (Lecture 3, Large Language Models): https://www.youtube.com/watch?v=BREr-2cMx-4
  License (lecture videos): as published on Stanford's public YouTube channel
  License (Amidi cheatsheets): MIT

This lesson adapts the in-context-learning section of Stanford CME 295 Lecture 3, covering [01:14:54] zero-shot vs few-shot framing, [01:15:59] the few-shot teddy-bear walkthrough, and [01:17:08] the format-versus-rule nuance that modern reasoning models surface. Clawdemy provides original notes, summaries, and quizzes derived from this material for educational purposes. All rights to the original lectures remain with Stanford and the instructors.
Primary source
- “Language Models are Few-Shot Learners”, Brown et al., 2020. The GPT-3 paper. Section 3 introduces the zero-shot, one-shot, and few-shot vocabulary and reports that few-shot performance scales with model size. This is the paper that turned in-context learning from a curiosity into a load-bearing claim about what large language models can do. Read the introduction and section 3 after this lesson; the rest of the paper is largely benchmark results that may have aged out of current relevance.
Adjacent technique: zero-shot reasoning prompts
- “Large Language Models are Zero-Shot Reasoners”, Kojima et al., 2022. The paper that showed adding “Let’s think step by step” to a prompt improves reasoning performance even with no examples. Often cited as the starting point of the “instructions can be enough” line of research. The next lesson (chain-of-thought) covers this in depth; we list it here because it sits at the boundary between “in-context learning” and “instruction-following.”
- “Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning”, Wang et al., 2023. A follow-up zero-shot CoT technique: the model first devises a plan, then solves the problem by following it. The paper reports improved zero-shot reasoning over Kojima et al.’s “Let’s think step by step.” Listed here for reference; the comparison between instructions and examples is the next lesson’s territory.
Going deeper
A short list, chosen for durability.
- “Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?”, Min et al., 2022. A surprising empirical finding: replacing the labels in few-shot examples with random labels often does not hurt performance much. The paper argues the format and the input-output pairing structure carry more weight than the actual correctness of the labels. Worth reading for a calibration check on what few-shot examples are actually doing.
- “A Survey on In-Context Learning”, Dong et al., 2024. A field overview of ICL research circa 2024. Useful if you want a wider view of the landscape than this lesson covers.
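To make the random-label idea from Min et al. concrete, here is a minimal sketch of how such an ablation set might be built. It assumes demonstrations are stored as (text, label) pairs and that replacement labels are drawn uniformly from the label set; the function name `randomize_labels` and the sample data are hypothetical and are not taken from the paper's code.

```python
import random

def randomize_labels(demos, label_set, seed=0):
    """Return a copy of the demonstrations with each gold label replaced
    by a label drawn uniformly at random from label_set.

    Inputs and their order are left untouched, so the format and the
    input-output pairing structure survive; only label correctness is
    destroyed. (Assumed setup for illustration.)
    """
    rng = random.Random(seed)  # seeded so the ablation is reproducible
    return [(text, rng.choice(label_set)) for text, _gold in demos]

# Hypothetical sentiment demonstrations.
demos = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I want those two hours of my life back.", "negative"),
    ("A warm, funny, beautifully acted film.", "positive"),
]
labels = ["positive", "negative"]

random_demos = randomize_labels(demos, labels, seed=1)
```

Comparing a model's accuracy when prompted with `demos` against its accuracy with `random_demos` is the shape of the comparison the paper describes; the sketch only builds the two demonstration sets and does not call any model.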
Adjacent topics
- The chain-of-thought lesson (next). This lesson covered demonstrations of input-output pairs. The next lesson covers what happens when you also include the reasoning steps in the demonstration. The combination (chain-of-thought few-shot) is one of the most effective prompting techniques on multi-step problems and the bridge from “show me” to “explain to me, step by step.”
- Prompt engineering as an applied skill. The recipe in this lesson (3-to-5 examples, diverse, representative, consistent format) is a baseline. The applied prompt-engineering literature has accumulated many task-specific patterns; the OpenAI cookbook and the Anthropic prompt library are practical entry points if you want to see the patterns in production use. Both are public resources.
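The baseline recipe above (3-to-5 examples, consistent format) can be sketched as a small prompt builder. This is an illustrative sketch, not a prescribed implementation: the `Example` class, the `build_few_shot_prompt` function, and the `Input:` / `Output:` markers are assumptions chosen for the example. Diversity and representativeness of the examples remain a human judgment that the code does not check.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Example:
    text: str   # the demonstration input
    label: str  # the demonstration output

def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt with one fixed format for every example.

    Enforces the 3-to-5 example baseline and ends with an open 'Output:'
    slot for the model to complete. (Hypothetical helper.)
    """
    if not 3 <= len(examples) <= 5:
        raise ValueError("baseline recipe expects 3 to 5 examples")
    lines = [instruction, ""]
    for ex in examples:
        # Identical markers for every demonstration keep the format consistent.
        lines.append(f"Input: {ex.text}")
        lines.append(f"Output: {ex.label}")
        lines.append("")
    # The query reuses the same markers and leaves the output blank.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

# Hypothetical sentiment task.
examples = [
    Example("The plot was gripping from start to finish.", "positive"),
    Example("I want those two hours of my life back.", "negative"),
    Example("It was fine. Nothing memorable either way.", "neutral"),
]
prompt = build_few_shot_prompt(
    "Classify the sentiment of each review.",
    examples,
    "The soundtrack alone is worth the ticket.",
)
```

Keeping the query in the same `Input:` / `Output:` shape as the demonstrations is the design choice that matters here: the model is asked to continue a pattern it has just seen several times.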
Stanford CME 295 cheatsheet
- Stanford CME 295 cheatsheet by the Amidi twins. MIT-licensed. The “in-context learning” section covers the same material in their dense visual style. The cheatsheet is more compressed than the lecture and worth using as a study reference after this lesson.
Community discussion
None selected for this lesson. The published literature is consolidated enough that academic sources are the better entry point. Durable community references will be added at a future quarterly review if any emerge.