
How chain of thought makes models think out loud

This is the closing lesson of Phase 5, How we steer models at inference, in Track 5 (AI Foundations). The previous lessons established how text comes out (decoding strategies), how the prompt shapes it (prompting), and how examples cue capabilities (in-context learning). This lesson covers the move that consistently lifts performance on multi-step problems: chain-of-thought prompting, or CoT. The technique is to ask the model to produce a reasoning path before its final answer. It comes in two flavors: zero-shot CoT (append a phrase like “Let’s think step by step” with no examples) and few-shot CoT (show examples that include the reasoning, not just the final answer). The lesson covers why CoT works (it decomposes a problem into subproblems, and generating more tokens gives the model more compute), self-consistency (sample many CoT chains and take a majority vote over their final answers), where CoT helps versus where it’s overkill or actively misleading, and how the technique sets up Phase 6’s reasoning models. Course materials are at cme295.stanford.edu.
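To make the two flavors concrete, here is a minimal sketch of how the prompts differ. The function names, the `Q:`/`A:` layout, and the pen-buying example are illustrative assumptions, not part of the course materials; only the trigger phrase comes from the text above.

```python
from typing import List, Tuple

# Trigger phrase quoted in the lesson; everything else here is a hypothetical layout.
COT_TRIGGER = "Let's think step by step."


def direct_prompt(question: str) -> str:
    """Baseline: ask for the answer with no reasoning cue."""
    return f"Q: {question}\nA:"


def zero_shot_cot_prompt(question: str) -> str:
    """Zero-shot CoT: no examples, just a phrase that invites a reasoning path."""
    return f"Q: {question}\nA: {COT_TRIGGER}"


def few_shot_cot_prompt(examples: List[Tuple[str, str, str]], question: str) -> str:
    """Few-shot CoT: each example shows (question, reasoning, answer), not just the answer."""
    blocks = [
        f"Q: {q}\nA: {reasoning} The answer is {answer}."
        for q, reasoning, answer in examples
    ]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Illustrative worked example (made up for this sketch).
examples = [
    (
        "Ana buys 4 pens at 3 dollars each and pays with a 20 dollar bill. "
        "How much change does she get?",
        "4 pens cost 4 * 3 = 12 dollars. The change is 20 - 12 = 8 dollars.",
        "8",
    )
]
new_question = "Ben buys 5 notebooks at 2 dollars each and pays with 15 dollars. How much change?"

print(zero_shot_cot_prompt(new_question))
print()
print(few_shot_cot_prompt(examples, new_question))
```

The only difference between the direct and zero-shot CoT prompts is the trailing phrase; the few-shot version instead relies on the worked reasoning in the examples to cue the same behavior.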

The previous lesson (How few-shot examples teach in context) covered the in-context-learning vocabulary and the format-versus-rule heuristic. This lesson combines that vocabulary with a new ingredient (reasoning steps) and shows what it unlocks. Phase 6, How models reason and act, picks up where this lesson stops: reasoning models bake CoT into the policy itself rather than relying on the prompt; RAG lets a model fetch text it doesn’t have in its weights; function calling lets it emit structured calls; agent loops chain tools together. The shift is from “steer one inference call” to “let the model think longer, look things up, or take actions.”

Prerequisites: the in-context-learning lesson is required. We assume you understand zero-shot and few-shot prompting and the format-versus-rule heuristic. The prompting lesson is also useful but not strictly required.

  • Define chain-of-thought prompting and distinguish zero-shot CoT from few-shot CoT
  • Explain the two reasons CoT works (decomposition into tractable subproblems and more-tokens-equals-more-compute)
  • Apply CoT to a multi-step problem and recognize when CoT is or is not the right tool
  • Describe self-consistency and the cost-versus-accuracy trade-off it makes
  • Distinguish CoT prompting (a prompt-level technique) from reasoning models (CoT built into the model itself, covered in Phase 6)
  • Read time: about 12 minutes
  • Practice time: about 12 minutes (a self-check on the two reasons CoT works, a hands-on exercise comparing direct prompts to CoT prompts on a multi-step word problem, and flashcards)
  • Difficulty: standard
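The self-consistency objective above (sample many CoT chains, then majority-vote) can be sketched in a few lines. This is a minimal illustration under stated assumptions: `sample_chain` stands in for one sampled model call, the chains are canned strings rather than real model output, and the "The answer is N" convention for extracting the final answer is a hypothetical choice.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(chain: str) -> Optional[str]:
    """Pull the final numeric answer out of a chain (assumes an 'answer is N' ending)."""
    match = re.search(r"answer is\s*(-?\d+(?:\.\d+)?)", chain, flags=re.IGNORECASE)
    return match.group(1) if match else None


def self_consistency(sample_chain: Callable[[], str], n_samples: int = 5) -> Optional[str]:
    """Sample n_samples chains and return the most common final answer.

    sample_chain is a placeholder for one sampled (non-greedy) model call;
    each call costs a full generation, so cost grows roughly linearly in n_samples.
    """
    answers = []
    for _ in range(n_samples):
        answer = extract_answer(sample_chain())
        if answer is not None:  # skip chains with no parseable answer
            answers.append(answer)
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Canned chains standing in for three sampled generations (made up for this sketch).
canned = iter([
    "4 * 3 = 12, and 20 - 12 = 8. The answer is 8.",
    "3 + 4 = 7, and 20 - 7 = 13. The answer is 13.",
    "The pens cost 12 in total. The change is 20 - 12 = 8. The answer is 8.",
])
result = self_consistency(lambda: next(canned), n_samples=3)
print(result)
```

The vote is taken over final answers only, so two chains that reason differently but agree on the result reinforce each other, while a single faulty chain is outvoted. That is the accuracy side of the trade-off; the cost side is the extra generations.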