Few-shot prompting: cheatsheet

The one idea that matters

The model is frozen at inference.
Examples in the prompt do not teach it.
They cue patterns it already learned during training.

The vocabulary

Term	What’s in the prompt before your real query
Zero-shot	Nothing. Just the task description and the query.
One-shot	One worked example showing input-output.
Few-shot	Multiple worked examples (typically 3 to 5).

The conceptually important shift is from zero (no examples) to nonzero (some examples). One-shot versus few-shot is mostly a naming convention.

Why few-shot works

Pretrained LLMs absorbed many task patterns during training.
Few-shot examples help the model SELECT which pattern to invoke
and in what format. The model is not learning new facts; it is
being cued into existing capabilities.

When few-shot fails (predictable from the framing)

Failure mode	Why
Task requires unknown facts	Examples cannot manufacture knowledge that’s not in the weights
Examples share an accidental feature (same start word, same topic)	Model locks onto the accidental pattern, not the intended one
Task is far outside training distribution	A few demonstrations cannot bridge a gap of millions of missing pretraining samples
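
The second failure mode can be screened for before a prompt is sent. A minimal sketch, assuming examples are stored as (input, output) string pairs. The helper name shared_first_word is hypothetical, and it checks only one kind of accidental pattern (a common opening word); shared topic or length would need separate checks.

```python
def shared_first_word(examples):
    """Return the opening word if every example input starts with it, else None."""
    # Collect the lowercase first word of each non-empty input.
    first_words = {text.split()[0].lower() for text, _ in examples if text.split()}
    if len(examples) > 1 and len(first_words) == 1:
        return first_words.pop()
    return None


# Hypothetical sentiment examples: every input in `risky` opens with "I".
risky = [
    ("I love this phone.", "positive"),
    ("I regret buying it.", "negative"),
    ("I think it is fine.", "neutral"),
]
varied = [
    ("Love this phone.", "positive"),
    ("I regret buying it.", "negative"),
    ("Battery life is fine.", "neutral"),
]

print(shared_first_word(risky))   # the shared opener, lowercased
print(shared_first_word(varied))  # None: no single shared opener
```

A non-None result is a prompt to reword some examples, not proof that the model will lock onto the pattern.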

The practical recipe

1. Use 3 to 5 examples by default.
2. Make examples diverse (cover the range of outputs).
3. Make examples representative (real-query length and style).
4. Format consistently (the model will mimic the format).
5. Vary irrelevant dimensions (avoid accidental patterns).
6. Place the real query last (the model continues from the end of the prompt, and recent context tends to weigh more).
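
Parts of the recipe can be checked automatically. A minimal sketch, assuming a classification task with a known label set; lint_examples and its parameters are hypothetical names. It covers only steps 1 and 2 (count and label coverage) plus a label typo check. Steps 3 and 5 still need human judgment.

```python
def lint_examples(examples, labels, min_count=3, max_count=5):
    """Return a list of warnings for a set of (input, label) examples."""
    warnings = []
    # Step 1: default to 3 to 5 examples.
    if not min_count <= len(examples) <= max_count:
        warnings.append(
            f"expected {min_count} to {max_count} examples, got {len(examples)}"
        )
    used = {label for _, label in examples}
    # Step 2: diversity, approximated here as "every label is demonstrated".
    missing = sorted(set(labels) - used)
    if missing:
        warnings.append(f"labels never demonstrated: {missing}")
    # A label outside the label set is usually a typo the model will mimic.
    unknown = sorted(used - set(labels))
    if unknown:
        warnings.append(f"labels not in the label set: {unknown}")
    return warnings


labels = ["bug-report", "feature-request", "account-issue", "general-question"]
examples = [
    ("When I click submit, the page hangs.", "bug-report"),
    ("Could you add CSV export?", "feature-request"),
]
for warning in lint_examples(examples, labels):
    print(warning)
```

With only two examples, the sketch reports both the low count and the two labels that are never shown.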

Format-versus-rule heuristic

If examples are conveying…	Reach for…
A format (output shape, label set, phrasing)	Few-shot. Cheap and reliable.
A rule the model has to infer	An explicit instruction.
Multi-step reasoning	An instruction plus chain-of-thought (next lesson).
Both rule + format	Hybrid: instruction first, then 1 to 2 illustrative examples.
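
The hybrid row can be sketched as a small prompt builder. This is an illustration, not a standard API: build_hybrid_prompt, the Input/Output field names, and the date-conversion task are all assumptions made for the example.

```python
def build_hybrid_prompt(instruction, examples, query,
                        input_name="Input", output_name="Output"):
    """Instruction first, then 1 to 2 examples, then the real query last."""
    if not 1 <= len(examples) <= 2:
        raise ValueError("a hybrid prompt uses 1 to 2 illustrative examples")
    parts = [instruction.strip()]
    for text, answer in examples:
        parts.append(f"{input_name}: {text}\n{output_name}: {answer}")
    # Leave the final output field open for the model to complete.
    parts.append(f"{input_name}: {query}\n{output_name}:")
    return "\n\n".join(parts)


# Hypothetical task: the instruction carries the rule, the examples carry the format.
instruction = (
    "Convert the date to ISO 8601 (YYYY-MM-DD). "
    "Treat ambiguous numeric dates as day-first."
)
examples = [
    ("3 March 2021", "2021-03-03"),
    ("04/05/2020", "2020-05-04"),
]
prompt = build_hybrid_prompt(instruction, examples, "11/12/2019")
print(prompt)
```

The day-first rule is stated outright instead of being left for the model to infer from two examples, which is the point of the hybrid form.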

When examples beat instructions

Format consistency is the point (JSON shape, one-word label, specific structure).
The categories or labels are unfamiliar enough that an instruction would have to define them anyway.
The model is older or smaller and has weaker reasoning capability.

When instructions beat examples

The rule is concise and explicit (a definition, a step-by-step procedure).
The model is reasoning-capable and can follow written rules.
You want the model to generalize to inputs that look different from the examples.

A worked few-shot example

Tag the email as bug-report, feature-request, account-issue,
or general-question.

Email: "When I click submit, the page hangs."
Tag: bug-report

Email: "Could you add CSV export?"
Tag: feature-request

Email: "I cannot update my billing email."
Tag: account-issue

Email: "Do you offer a free trial?"
Tag: general-question

Email: <YOUR REAL QUERY HERE>
Tag:

Each category appears once. Format is identical. Real query goes last.
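
The prompt above can be assembled from data so the format stays identical across examples. A minimal sketch: build_few_shot_prompt is a hypothetical helper, and the final query is made up for illustration. Sending the string to a model is left out because it depends on the client library in use.

```python
def build_few_shot_prompt(task, examples, query):
    """Task description, then each example in one fixed format, then the query last."""
    blocks = [task]
    for email, tag in examples:
        blocks.append(f'Email: "{email}"\nTag: {tag}')
    # The trailing "Tag:" is left open so the model completes it with a label.
    blocks.append(f'Email: "{query}"\nTag:')
    return "\n\n".join(blocks)


TASK = (
    "Tag the email as bug-report, feature-request, account-issue,\n"
    "or general-question."
)
EXAMPLES = [
    ("When I click submit, the page hangs.", "bug-report"),
    ("Could you add CSV export?", "feature-request"),
    ("I cannot update my billing email.", "account-issue"),
    ("Do you offer a free trial?", "general-question"),
]

# Hypothetical real query.
prompt = build_few_shot_prompt(TASK, EXAMPLES, "My invoice shows the wrong company name.")
print(prompt)
```

Keeping examples in a list makes it easy to swap or reorder them without hand-editing the prompt text.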

Pitfalls to dodge

Pitfall	Reality
“Few-shot teaches the model.”	No. Same model, same weights, different context.
“More examples are always better.”	No. Past 5, returns usually diminish. Past 10, extra examples can confuse the model or lock it onto surface patterns in the examples.
“If zero-shot is unreliable, dump 20 examples.”	The first 3 do most of the work. After that, focus on diversity, not count.
“If few-shot doesn’t work, the model can’t do the task.”	Sometimes the issue is that the rule is hard to infer; an instruction may rescue it.

Glossary

In-context learning (ICL): using the prompt to shape immediate model behavior. No weights change. Effect is local to one inference call.
Zero-shot: prompting with no demonstrations. Just the task description and the query.
One-shot: prompting with one input-output demonstration before the query.
Few-shot: prompting with multiple demonstrations before the query (typically 3 to 5).
Hybrid prompt: an explicit instruction plus a small number of illustrative examples. Often beats pure-instruction or pure-example versions on hard tasks.
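Plan-and-Solve Prompting: a 2023 zero-shot technique that instructs the model to devise a plan and then carry it out. It was reported to perform comparably to few-shot chain-of-thought on math reasoning, which suggests instructions can rival demonstrations on multi-step tasks. Cited in references.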

The model is frozen. The prompt is not.
Examples in the prompt cue patterns the model already knows.
Zero-shot when the task is clear, few-shot when zero-shot is unreliable, instructions when the rule is hard to infer.