Practice: Agents that self-check: metacognition

Self-check

Seven short questions. Answer each in your head before opening the collapsible. Active retrieval is where the learning sticks.

1. What is metacognition, for an agent?

Show answer

An agent thinking about its own thinking: a deliberate reflection step where, instead of acting on its first answer, it pauses and asks whether the answer or plan is actually right, what it might have missed, and whether its approach is sound. If the review finds a problem, it revises.
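A minimal sketch of that pause as a draft, review, revise loop. The `generate`, `critique`, and `revise` callables are hypothetical stand-ins for model calls, not any particular library's API; the stubs exist only so the sketch runs.

```python
from typing import Callable, Optional


def answer_with_reflection(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str, str], Optional[str]],
    revise: Callable[[str, str, str], str],
) -> str:
    """Draft, review once, and revise only if the review finds a problem."""
    draft = generate(task)
    problem = critique(task, draft)  # None means the review found nothing
    if problem is None:
        return draft
    return revise(task, draft, problem)


# Hypothetical stubs standing in for model calls.
def generate(task: str) -> str:
    return "Restart the router."


def critique(task: str, draft: str) -> Optional[str]:
    # The review looks for what is wrong with the draft already on the page.
    if "back up" not in draft.lower():
        return "Missing caveat: settings should be backed up first."
    return None


def revise(task: str, draft: str, problem: str) -> str:
    return "Back up your settings, then restart the router."


final = answer_with_reflection("How do I reset my router?", generate, critique, revise)
print(final)
```

The review runs exactly once and the revision happens only when the review reports a problem, which keeps the extra cost to a single reasoning pass.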

2. Why does a reflection step raise reliability? (The editing analogy.)

Show answer

For the same reason editing improves human writing: a first draft is usually good but flawed (a missed case, a wrong assumption, a step taken too fast). Reading it back with a critical eye before committing catches a real fraction of those flaws. Generating reaches for a plausible answer; reviewing looks for what is wrong with the answer already on the page. They are different stances, and doing them in sequence beats doing only the first.

3. The lesson says reflection is “the general form” of something you have already seen. What?

Show answer

Self-correction. Tool-failure recovery (L2), re-searching after a weak retrieval (L6), and replanning when a step contradicts the plan (L7) were all self-correction triggered by an external signal. Reflection is the proactive, general version: a check-my-work step the agent runs on purpose, even when nothing has visibly failed.

4. Why is reflection called “the cheap reliability move” compared to adding an agent?

Show answer

A reflection step costs one extra reasoning pass and adds no new coordination. Adding a reviewer agent costs another agent to build, a handoff to manage, and more latency (the L8 coordination cost). For a large fraction of reliability problems, reflection is the better trade: cheaper, no seams to lose information across, trivial to add as one more step in the loop.

5. What can reflection reliably catch, and what can it not?

Show answer

It reliably catches the errors a fresh, critical re-read would catch: thin answers, missed cases, skipped checks. It does not reliably catch errors the model cannot see in itself; a model confidently wrong on the first pass can be just as confidently wrong on the review, rubber-stamping the same mistake. You cannot proofread a fact you do not know is wrong.

6. Why does reflection pair well with an external signal?

Show answer

Because an external signal (running the code, verifying against a source, getting a tool result) brings in information the model did not already have. Pure self-review is limited to what the model can see in itself; a real signal from the world catches errors the model is blind to. The strongest self-check combines the model’s own review with an external check.
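A small sketch of the difference, under loud assumptions: `self_review` is a hypothetical stub for a model re-reading its own code (here it sees nothing wrong), and `external_check` actually runs the code against a few cases. The function name `double` and the test cases are invented for illustration.

```python
from typing import List, Tuple


def external_check(candidate_source: str, cases: List[Tuple[int, int]]) -> List[str]:
    """Run the candidate code and report failures: information from outside the model."""
    namespace: dict = {}
    exec(candidate_source, namespace)  # acceptable only for trusted toy input like this
    fn = namespace["double"]
    failures = []
    for arg, expected in cases:
        got = fn(arg)
        if got != expected:
            failures.append(f"double({arg}) returned {got}, expected {expected}")
    return failures


def self_review(candidate_source: str) -> List[str]:
    """Hypothetical stand-in for self-review that rubber-stamps the draft."""
    return []


# A confidently wrong draft: it happens to be correct only for x == 2.
draft = "def double(x):\n    return x + 2\n"
cases = [(2, 4), (3, 6), (0, 0)]

review_findings = self_review(draft)
test_findings = external_check(draft, cases)
print(review_findings)
print(test_findings)
```

The self-review returns no findings, while the run reports two failing cases. The point of the sketch is that the test results come from the world, not from the model's own view of its work.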

7. Someone says, “We added five reflection passes, so the agent is now five times more reliable.” What is wrong with that?

Show answer

Reflection has diminishing returns. One good review pass catches most of what review will catch; a third, fourth, and fifth pass mostly burn tokens and time for shrinking gains. And no number of self-review passes catches errors the model is blind to; for those you need an external signal, not more passes.
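A toy model of why the gains shrink. Everything here is an assumption made for illustration, not a measurement: each pass is assumed to catch the same fraction of the still-uncaught visible errors, blind errors are never caught, and the numbers (10 visible, 2 blind, catch rate 0.7) are made up.

```python
def remaining_errors(visible: float, blind: float, catch_rate: float, passes: int) -> float:
    """Expected errors left after `passes` self-review passes.

    Assumption: each pass catches `catch_rate` of the visible errors still
    present, and no pass ever catches a blind error.
    """
    return visible * (1 - catch_rate) ** passes + blind


for n in range(6):
    left = remaining_errors(visible=10, blind=2, catch_rate=0.7, passes=n)
    print(n, round(left, 2))
```

With these invented numbers the count drops from 12 to 5 after one pass, then to roughly 2.9, 2.27, 2.08, and 2.02. The curve flattens toward the 2 blind errors, which no number of passes removes.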

Try it yourself: reflection, or a second agent?

No tooling, no cost; this is design judgment. The previous lesson’s move for reliability was adding a specialized agent; this lesson’s is a reflection step. For each situation, decide whether a reflection step is enough or whether you genuinely need a second (reviewer) agent, and say why in one line. Then check.

A. An agent drafts customer replies that are occasionally thin or miss a
   caveat the user would want.
B. An agent writes code, and you can actually run the code and read the test
   results.
C. A high-stakes legal-summary agent whose output a domain expert must sign
   off on, in a different specialty from the drafting agent.
D. An agent produces a plan and sometimes ships it with a missing step or bad
   ordering that a careful re-read would catch.

Show answer

A: reflection is enough. Thin answers and missed caveats are exactly what a critical re-read catches. Add a reflection-on-the-answer step before paying for a second agent.
B: reflection plus an external signal (not a second agent). You can run the code, so combine the agent’s self-review with the real test results; the external signal beats pure self-review and needs no new agent.
C: a second agent. This is a genuine different-specialty review with a sign-off requirement; self-review by the drafting agent cannot stand in for an independent expert checker. Pay the coordination cost here.
D: reflection is enough. Reviewing a plan before executing is a named reflection form, and a missing step or bad ordering is what a critical re-read catches, far cheaper than discovering it five steps in.

The deciding question: can a critical re-read by the same agent (optionally plus a real external signal) catch the error, or does the error need information or independence the agent does not have? Reach for the second agent only when reflection genuinely cannot do it.
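One way to write the deciding question down as a small function. This is a sketch of the judgment, not a rule from the lesson: the `Situation` fields and the returned labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    reread_would_catch: bool         # a critical re-read by the same agent finds the error
    external_signal_available: bool  # e.g. you can run the code or check a source
    needs_independence: bool         # different specialty or required independent sign-off


def choose_check(s: Situation) -> str:
    if s.needs_independence:
        return "second agent"
    if s.external_signal_available:
        return "reflection + external signal"
    if s.reread_would_catch:
        return "reflection"
    # Neither a re-read nor an available signal can catch it.
    return "second agent"


scenarios = {
    "A": Situation(True, False, False),   # thin customer replies
    "B": Situation(True, True, False),    # code you can run
    "C": Situation(False, False, True),   # expert sign-off in another specialty
    "D": Situation(True, False, False),   # plan with a missing step
}
for name, situation in scenarios.items():
    print(name, choose_check(situation))
```

The order of the checks reflects the cost ordering: independence is the only condition that forces the coordination cost, so it is tested first and everything else falls through to the cheaper options.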

Try it yourself: where does the reflection step go?

Reflection shows up at different points (critiquing an answer, reviewing a plan before executing, verifying a result against criteria). For each agent below, name where the reflection step should sit and what it should check.

1. An agent that books travel by first producing a multi-step plan.
2. An agent that returns a JSON object a downstream system will parse.
3. An agent that answers support questions in free text.

Show answer

1. Review the plan before executing. Check for missing steps or bad ordering before acting on any of it, far cheaper than discovering the flaw five steps in.
2. Verify the result against criteria. Test the JSON against what a correct result must satisfy (valid structure, required fields, sensible values) and fix it if it falls short; this is checkable, so verify rather than just re-read.
3. Critique the answer. Re-read the drafted reply for errors, gaps, and thin spots before sending it.

The rule: put the pause where the costly mistakes happen. A bad plan is costly mid-execution, so reflect before executing; a malformed structured output breaks a downstream system, so verify it against criteria; a thin free-text answer just ships, so critique before sending.
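For the structured-output case, verifying against criteria can be as plain as the sketch below. The schema (`order_id`, `quantity`) and the "quantity must be positive" rule are hypothetical examples of "required fields" and "sensible values"; a real downstream system would define its own.

```python
import json
from typing import List

# Hypothetical criteria a correct result must satisfy.
REQUIRED_FIELDS = {"order_id": str, "quantity": int}


def verify_result(raw: str) -> List[str]:
    """Return the criteria the output fails; an empty list means it passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top level must be an object"]

    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            problems.append(f"wrong type for {field}")
    # Sensible-value check, applied only when the field has the right type.
    if isinstance(data.get("quantity"), int) and data["quantity"] <= 0:
        problems.append("quantity must be positive")
    return problems


print(verify_result('{"order_id": "A1", "quantity": 3}'))
print(verify_result('{"order_id": "A1"}'))
print(verify_result("not json"))
```

Returning a list of failed criteria, rather than a bare pass or fail, gives the agent something concrete to fix before the output reaches the downstream parser.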

Flashcards

Ten cards.

Q. What is metacognition, for an agent?

An agent thinking about its own thinking: a deliberate reflection step where it reviews its own answer or plan before committing, and revises if the review finds a problem.

Q. Why does reflection raise reliability?

The same reason editing improves writing: a first draft is good but flawed, and a critical re-read catches a real fraction of the flaws before they ship. Generating and reviewing are different stances; running them in sequence beats running only the first.

Q. Reflection is the general form of which earlier idea?

Self-correction. Tool-failure retry (L2), re-searching a weak retrieval (L6), and replanning (L7) were reactive self-correction. Reflection is the proactive, general version: a check-my-work step run on purpose, even when nothing has failed.

Q. What are the three places reflection shows up?

Critiquing an answer before sending it, reviewing a plan before executing it, and verifying a checkable result (code, a calculation, structured output) against the criteria a correct result must satisfy.

Q. Why is reflection the “cheap reliability move”?

It costs one extra reasoning pass and no new coordination, versus a reviewer agent’s build cost, handoff, and latency. For many reliability problems it is the better trade: cheaper, no seams, trivial to add.

Q. When should you add a reviewer agent instead of reflecting?

Only when reflection is genuinely not enough: when the check needs information or independence the drafting agent does not have (a different specialty, a required independent sign-off). Try reflection first.

Q. What does reflection reliably catch, and what does it miss?

It catches what a critical re-read catches: thin answers, missed cases, skipped checks. It misses errors the model is blind to; a confidently wrong first pass can be confidently wrong on review.

Q. Why does reflection pair well with an external signal?

An external signal (run the code, check the source, read the tool result) brings in information the model did not already have, catching errors pure self-review cannot see. The strongest self-check combines both.

Q. Does stacking more reflection passes keep improving reliability?

No. Reflection has diminishing returns: one good pass catches most of what review will catch; more passes mostly burn tokens and time.

Q. What is the difference between checking and self-justification?

Checking looks for what is wrong with the answer; self-justification produces a confident-sounding rationale for why the answer is fine. A useful reflection step does the former, not the latter.