Cheatsheet: Agents that self-check: metacognition
The one idea
Metacognition is an agent thinking about its own thinking: a deliberate reflection step where it reviews its answer or plan before committing, and revises if the review finds a problem.
Why it works
A first draft is usually good but flawed (missed case, wrong assumption, a step too fast). A critical re-read catches a real fraction of those flaws before they ship. Generating reaches for a plausible answer; reviewing looks for what is wrong with the answer already on the page. Same model, different stance, run in sequence.
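The "same model, different stance, run in sequence" idea can be sketched as below. This is a minimal illustration, not a definitive implementation: `Model` is a hypothetical stand-in for whatever function sends a prompt to your model and returns its reply, and the prompt wording and the `OK` convention are assumptions.

```python
from typing import Callable

# Hypothetical interface: any function mapping a prompt string to a reply string.
Model = Callable[[str], str]


def answer_with_reflection(task: str, model: Model) -> str:
    """Draft, critique the draft, and revise only if the critique found a problem."""
    # Stance 1: generate a plausible answer.
    draft = model(f"Task: {task}\nWrite your best answer.")

    # Stance 2: same model, now asked to look for what is wrong.
    critique = model(
        f"Task: {task}\nDraft answer: {draft}\n"
        "Look for what is wrong: missed cases, wrong assumptions, skipped steps. "
        "Reply with exactly 'OK' if you find nothing."
    )
    if critique.strip().upper() == "OK":
        return draft  # the review found nothing, so commit the draft

    # One revision pass that addresses the problems the critique named.
    return model(
        f"Task: {task}\nDraft answer: {draft}\nProblems found: {critique}\n"
        "Write a revised answer that fixes these problems."
    )
```

The function makes at most three model calls (draft, critique, revision), which is the "one extra reasoning pass" cost rather than a second agent.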
Worked example
Illustrative numbers, used to show the reflection step in action.
TASK: cheapest flight NYC -> Tokyo next month.

WITHOUT reflection: "$612 via one stop." (thin)

WITH reflection: "I optimized for price only; confirm bookable, note layover." -> re-checks: $612 has a 14h layover; surfaces $680 with 2h; presents both.

The first answer was not wrong, just thin. The self-check caught that before the user saw it.
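The flight example can be mimicked with a rule-based check standing in for the model's critique. This is only a sketch under stated assumptions: the `Flight` type, the function names, and the 8-hour layover threshold are hypothetical, and the two fares are the illustrative numbers from above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Flight:
    price_usd: int
    layover_hours: int


def cheapest(flights: List[Flight]) -> Flight:
    """First pass: optimize for price only."""
    return min(flights, key=lambda f: f.price_usd)


def reflect_on_choice(
    choice: Flight, flights: List[Flight], max_layover_hours: int = 8
) -> List[Flight]:
    """Self-check: was price the only thing that mattered?

    The threshold is an assumed stand-in for the model's judgment.
    """
    if choice.layover_hours <= max_layover_hours:
        return [choice]  # nothing to flag
    shorter = [f for f in flights if f.layover_hours <= max_layover_hours]
    if not shorter:
        return [choice]  # no better trade-off exists
    # Present both: the cheapest overall and the cheapest with a sane layover.
    return [choice, cheapest(shorter)]


flights = [Flight(612, 14), Flight(680, 2)]  # illustrative numbers from the example
first = cheapest(flights)                    # Flight(612, 14): not wrong, just thin
options = reflect_on_choice(first, flights)  # both options surfaced
```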
You have seen this before
Reflection is the general form of self-correction that appeared in pieces:
| Lesson | Trigger | Self-correction |
|---|---|---|
| L2 (tool use) | Tool call failed | read error, retry |
| L6 (agentic RAG) | Weak retrieval | judge insufficient, re-search |
| L7 (planning) | Step contradicts plan | replan |
| L9 (this lesson) | (none needed) | proactive check-my-work step |
L2/L6/L7 reacted to an external signal. Reflection is proactive: the agent interrogates its own output before any signal says it must.
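The difference in trigger can be shown in a few lines. This is a hedged sketch: `reactive` and `proactive` are hypothetical helper names, and an exception stands in for any external signal such as a failed tool call.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def reactive(step: Callable[[], T], recover: Callable[[Exception], T]) -> T:
    """Self-correct only when an external signal (here, an exception) fires."""
    try:
        return step()
    except Exception as exc:
        return recover(exc)


def proactive(step: Callable[[], T], review: Callable[[T], T]) -> T:
    """Always review the result before committing, even with no error signal."""
    return review(step())
```

In the reactive form, `recover` never runs when `step` succeeds. In the proactive form, `review` runs on every result.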
The cheap reliability move
| Add a reviewer agent | Add a reflection step |
|---|---|
| Another agent + a handoff + latency (L8 coordination cost) | One extra reasoning pass, no new coordination |
Try reflection before adding a second agent. Add the agent only when reflection is genuinely not enough.
The honest limit
A second look is not a guarantee.
- A model confidently wrong on the first pass can rubber-stamp the same error on review.
- Reflection catches what a critical re-read catches (thin answers, missed cases), not what the model is blind to.
- Diminishing returns: one good pass does most of the work; more passes mostly burn tokens.
- Pairs best with a real external signal (run the code, check the source, read the tool result), which brings in information the model did not already have.
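The last two points, a bounded number of passes and a real external signal, can be combined in one sketch. It assumes the agent's output is Python source and uses "run the code" as the signal; `run_check`, `revise_with_signal`, and the `revise` callback (a stand-in for a model call) are hypothetical names.

```python
from typing import Callable, Optional, Tuple


def run_check(source: str) -> Optional[str]:
    """External signal: execute the candidate and return error text, or None.

    Note: exec() on untrusted code is unsafe; a real system would sandbox this.
    """
    try:
        exec(compile(source, "<candidate>", "exec"), {})
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


def revise_with_signal(
    draft: str,
    revise: Callable[[str, str], str],  # hypothetical: (source, error) -> new source
    max_passes: int = 1,                # one good pass does most of the work
) -> Tuple[str, Optional[str]]:
    """Revise only while the external check fails, up to max_passes times."""
    candidate = draft
    error = run_check(candidate)
    for _ in range(max_passes):
        if error is None:
            break
        candidate = revise(candidate, error)
        error = run_check(candidate)
    # A non-None error means the problem survived review: not a guarantee.
    return candidate, error
```

Returning the last error alongside the candidate keeps the limit visible to the caller instead of hiding a failure behind a confident answer.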
Pitfalls to dodge
- Trusting reflection to catch everything (it lowers the error rate, does not zero it).
- Reflecting endlessly (one good pass; the rest is cost).
- Reaching for a second agent first (reflection is cheaper).
- Skipping a real external signal when one is available.
- Confusing fluent self-justification with checking (look for what is wrong, not reasons it is fine).
Words to use precisely
- Metacognition / reflection: an agent reviewing its own output or plan before committing.
- Self-correction: adjusting after noticing something is wrong (reactive in L2/L6/L7, proactive in reflection).
- External signal: information from outside the model (a run result, a source) that a self-check can verify against.