Cheatsheet: Agents that self-check: metacognition

Metacognition is an agent thinking about its own thinking: a deliberate reflection step where it reviews its answer or plan before committing, and revises if the review finds a problem.

A first draft is usually good but flawed (a missed case, a wrong assumption, a rushed step). A critical re-read catches a meaningful share of those flaws before they ship. Generating reaches for a plausible answer; reviewing looks for what is wrong with the answer already on the page. Same model, different stance, run in sequence.
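The generate-then-review sequence can be sketched as follows. This is a minimal illustration, not a reference implementation: `answer_with_reflection`, `Review`, and the three callables are hypothetical names, and the `fake_*` functions are stand-ins for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Review:
    """Result of the critical re-read. An empty list means nothing to fix."""
    problems: List[str]


def answer_with_reflection(
    task: str,
    generate: Callable[[str], str],
    review: Callable[[str, str], Review],
    revise: Callable[[str, str, List[str]], str],
) -> str:
    """Draft, review once, and revise only if the review found a problem."""
    draft = generate(task)            # stance 1: reach for a plausible answer
    found = review(task, draft)       # stance 2: look for what is wrong with it
    if not found.problems:
        return draft                  # commit the draft unchanged
    return revise(task, draft, found.problems)


# Hypothetical stand-ins for model calls, used only to exercise the flow.
def fake_generate(task: str) -> str:
    return "42"


def fake_review(task: str, draft: str) -> Review:
    return Review(problems=[] if draft.endswith("km") else ["missing unit"])


def fake_revise(task: str, draft: str, problems: List[str]) -> str:
    return f"{draft} km"


final = answer_with_reflection("How far?", fake_generate, fake_review, fake_revise)
```

In a real agent the three callables would usually be the same model prompted differently, which is the "same model, different stance" idea.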

The numbers below are illustrative; they only show the reflection step in action.

TASK: cheapest flight NYC -> Tokyo next month.
WITHOUT reflection: "$612 via one stop." (thin)
WITH reflection: "I optimized for price only; confirm bookable, note layover."
-> re-checks: $612 has a 14h layover; surfaces $680 with 2h; presents both.

The first answer was not wrong, just thin. The self-check caught that before the user saw it.
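The flight example can be written out as a small check. This is a sketch under stated assumptions: the `Flight` type, the `reflect_on_choice` helper, and the 4-hour layover threshold are all hypothetical, and the two fares are the illustrative numbers from the example, not real data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Flight:
    price_usd: int
    layover_hours: int


def cheapest(flights: List[Flight]) -> Flight:
    """First pass: optimize for price only."""
    return min(flights, key=lambda f: f.price_usd)


def reflect_on_choice(
    choice: Flight,
    flights: List[Flight],
    max_layover_hours: int = 4,  # assumed threshold for a "long" layover
) -> List[Flight]:
    """Self-check: if the pick has a long layover, surface a shorter option too."""
    options = [choice]
    if choice.layover_hours > max_layover_hours:
        shorter = [f for f in flights if f.layover_hours <= max_layover_hours]
        if shorter:
            options.append(cheapest(shorter))
    return options


# Illustrative fares from the example above.
flights = [Flight(price_usd=612, layover_hours=14), Flight(price_usd=680, layover_hours=2)]

first = cheapest(flights)                    # the thin answer: $612
options = reflect_on_choice(first, flights)  # the reflected answer: both options
```

The first pass is kept, not discarded: the check adds the trade-off the price-only answer left out.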

Reflection is the general form of the self-correction that appeared in pieces in earlier lessons:

Lesson            | Trigger               | Self-correction
------------------|-----------------------|--------------------------------
L2 (tool use)     | Tool call failed      | read error, retry
L6 (agentic RAG)  | Weak retrieval        | judge insufficient, re-search
L7 (planning)     | Step contradicts plan | replan
L9 (this lesson)  | (none needed)         | proactive check-my-work step

L2/L6/L7 reacted to an external signal. Reflection is proactive: the agent interrogates its own output before any signal says it must.

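Add a reviewer agent                                        | Add a reflection step
------------------------------------------------------------|-----------------------------------------------
Another agent + a handoff + latency (L8 coordination cost)  | One extra reasoning pass, no new coordination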

Try reflection before adding a second agent. Add the agent only when reflection is genuinely not enough.

A second look is not a guarantee.

  • A model confidently wrong on the first pass can rubber-stamp the same error on review.
  • Reflection catches what a critical re-read catches (thin answers, missed cases), not what the model is blind to.
  • Diminishing returns: one good pass does most of the work; more passes mostly burn tokens.
  • Pairs best with a real external signal (run the code, check the source, read the tool result), which brings in information the model did not already have.

Common mistakes:

  • Trusting reflection to catch everything (it lowers the error rate, does not zero it).
  • Reflecting endlessly (one good pass; the rest is cost).
  • Reaching for a second agent first (reflection is cheaper).
  • Skipping a real external signal when one is available.
  • Confusing fluent self-justification with checking (look for what is wrong, not reasons it is fine).

Key terms:

  • Metacognition / reflection: an agent reviewing its own output or plan before committing.
  • Self-correction: adjusting after noticing something is wrong (reactive in L2/L6/L7, proactive in reflection).
  • External signal: information from outside the model (a run result, a source) that a self-check can verify against.
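Pairing a self-check with an external signal, and stopping after one pass, might look like the sketch below. The signal here is "run the code": `run_draft`, `check_once`, and `fake_revise` are hypothetical names, and running the draft in-process with `exec` is an assumption made to keep the example short. A real agent would run untrusted code in a sandbox.

```python
from typing import Callable, Optional


def run_draft(code: str) -> Optional[str]:
    """External signal: execute the draft and report the error, if any.

    Returns None on success, otherwise "ErrorType: message".
    Assumption: in-process exec is acceptable only for this illustration.
    """
    try:
        exec(compile(code, "<draft>", "exec"), {})
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def check_once(draft: str, revise: Callable[[str, str], str]) -> str:
    """One reflection pass grounded in the run result, then commit."""
    error = run_draft(draft)
    if error is None:
        return draft                # the signal says the draft runs
    return revise(draft, error)     # one revision that sees the real error


# Hypothetical stand-in for a model call that repairs the draft.
def fake_revise(draft: str, error: str) -> str:
    return draft.replace("/0", "") if "ZeroDivisionError" in error else draft


fixed = check_once("x = 1/0", fake_revise)
```

The error text is information the model did not have when it wrote the draft, which is why this check can catch what a re-read alone would rubber-stamp. `check_once` deliberately does not loop: it makes one pass, and anything beyond that is left to the caller.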