Summary: Choosing your model and the effort dial

Closes Phase 1 of Track 22. Three current model families: Opus 4.8 (claude-opus-4-8, the current flagship; verbatim from docs: “Anthropic’s most capable model for complex reasoning and agentic coding”; $5 input / $25 output per million tokens), Sonnet 4.6 (claude-sonnet-4-6, best speed-and-intelligence balance, $3 / $15), and Haiku 4.5 (claude-haiku-4-5, fastest near-frontier, $1 / $5). Opus 4.7 (claude-opus-4-7) remains supported as a legacy 4.7 deployment with the same pricing, context, and posture as 4.8. Sonnet is roughly 1.7x cheaper per output token than Opus; Haiku is 5x cheaper than Opus. The default-pick rule: default to Sonnet for production, reach for Opus when the task is genuinely hard (complex reasoning, agentic coding), reach for Haiku when volume matters and the task is light (classification, inner pipeline steps). The common production pattern is mix-and-match (Sonnet or Opus for user-facing summarization, Haiku for inner classifiers). The model-ID convention has two forms: the 4.6 generation and later use dateless IDs that ARE pinned snapshots (claude-opus-4-8, claude-opus-4-7); pre-4.6 models have a date-suffixed canonical ID (claude-haiku-4-5-20251001) and a dateless alias (claude-haiku-4-5); production should prefer the date-suffixed form on pre-4.6 models. The effort parameter (output_config: {effort: "..."} with values low, medium, high, xhigh, max) controls token spend per call, affects ALL tokens (text, tool calls, thinking), and is supported on Opus 4.8 / Mythos / Opus 4.7 / Opus 4.6 / Sonnet 4.6 / Opus 4.5 (not Haiku). xhigh is available on Opus 4.8 and Opus 4.7; max is also on Sonnet 4.6 and Opus 4.6. Adaptive thinking (thinking: {type: "adaptive"}, with effort controlling depth) is the new mode for Opus 4.8 and Opus 4.7 (the only mode there; manual returns 400), Sonnet 4.6, Opus 4.6; manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) is the older mode, still supported on Haiku 4.5 and deprecated on Sonnet 4.6 / Opus 4.6, NOT supported on Opus 4.8 or 4.7. Evaluation is the way to actually pick: build a held-out test set, run candidates, pick the cheapest that passes.

Core ideas

Three current families with three price tiers. Opus 4.8 ($5/$25, current flagship), Sonnet 4.6 ($3/$15), Haiku 4.5 ($1/$5). Opus 4.7 ($5/$25) remains supported as a legacy 4.7 deployment with the same pricing and posture as 4.8. Context windows: Opus 4.8 and Opus 4.7 and Sonnet 4.6 are 1M tokens, Haiku 4.5 is 200k. Knowledge cutoffs differ (Opus 4.8 and 4.7 reliable through Jan 2026, Sonnet 4.6 through Aug 2025, Haiku 4.5 through Feb 2025).
Default-pick rule: Sonnet by default, Opus for hard tasks, Haiku for volume-and-light. Mix and match across models inside one application; lesson 8 onward (the agent loop) extends this.
Model IDs: 4.6 generation and later dateless IS pinned (claude-opus-4-8, claude-opus-4-7, claude-sonnet-4-6, claude-opus-4-6). Pre-4.6 has date-suffixed canonical (claude-haiku-4-5-20251001) and dateless alias (claude-haiku-4-5); production uses the date-suffixed form on pre-4.6 models for stability.
Effort parameter: output_config: {effort: "low" | "medium" | "high" | "xhigh" | "max"}. Default is high on all surfaces. Affects all tokens (text + tool calls + thinking). Supported: Opus 4.8, Mythos, Opus 4.7, Opus 4.6, Sonnet 4.6, Opus 4.5. xhigh on Opus 4.8 and Opus 4.7; max on Opus 4.8, Mythos, Opus 4.7, Opus 4.6, Sonnet 4.6. NOT supported on Haiku 4.5.
Sane starts: Sonnet 4.6 production default medium; Opus 4.8 (and Opus 4.7, same posture per the docs) production default xhigh for coding/agentic or high for general (the API default on all surfaces). high and “not setting the parameter” are equivalent.
Adaptive thinking is the new mode (thinking: {type: "adaptive"}) on Opus 4.8 and Opus 4.7 (the only mode there; manual returns 400), Opus 4.6, Sonnet 4.6. Effort controls thinking depth.
Manual extended thinking (thinking: {type: "enabled", budget_tokens: N}) is the older mode. Supported on Haiku 4.5; deprecated but functional on Sonnet 4.6 / Opus 4.6; NOT supported on Opus 4.8 or Opus 4.7.
Cost arithmetic is not exotic. 100k daily calls all-Opus is about $1,000/day; the same workload with a Haiku classifier plus Sonnet answer is about $480/day, roughly 52 percent cheaper. Model choice and parameter choice both move the dial materially.
Evaluation is how you pick. Build a held-out test set, run candidate models, pick the cheapest one that passes. Vibes-driven model selection is the most common cost leak in production AI applications.

What changes for you

Before this lesson, you were passing claude-opus-4-8 as the model name because lesson 1 used it. After this lesson, model selection is a deliberate decision with a worked cost shape behind it. The single highest-leverage change this week: pick one production feature, build a 5-prompt eval set for it, run both Sonnet 4.6 and Opus 4.8 (or Opus 4.7 if your deployment is on 4.7) against it (with default effort), and compute the cost-per-call for each. If Sonnet matches Opus on your eval, the swap saves about forty percent on that feature’s token cost forever. If Opus is materially better on a subset, you have a routing question (Opus for hard cases, Sonnet for easy ones) and the mix-and-match pattern lesson 8 onward extends. Phase 2 (lessons 4-7) will give you the tools, MCP, and prompt-caching levers that compound on top of the model-selection choice; Phase 3 the agent loop where model choice and effort compound across steps; Phase 4 the production cost monitoring that surfaces the dollars this lesson’s arithmetic predicts.