Summary: Agents that self-check: metacognition
Metacognition is an agent thinking about its own thinking: a reflection step where it checks its own work before committing. The previous lesson raised reliability by adding more agents and paying the coordination cost. This lesson raises it with a cheaper move: one agent reviews its own answer or plan, asks whether it is actually right, and revises if not. That single extra reasoning pass buys a lot of reliability per unit of effort. This summary is the scan-in-five-minutes version of the full lesson.
Core ideas
Section titled “Core ideas”- Reflection works for the same reason editing works. A first draft is usually good but flawed (a missed case, a wrong assumption, a step taken too fast). Reading it back critically before committing catches a real fraction of those flaws. The model that generates and the model that reviews are the same model; the difference is the stance. Generating reaches for a plausible answer; reviewing looks for what is wrong with the answer already on the page.
- Reflection takes a few forms, all sharing one stance (a deliberate pause to look for what is wrong, before committing): critiquing an answer before sending it, reviewing a plan before executing it, and verifying a checkable result (code, a calculation, structured output) against the criteria a correct result must satisfy. Put the pause where the costly mistakes happen.
- You have seen special cases already. Tool-failure recovery (L2), re-searching after a weak retrieval (L6), and replanning when a step contradicts the plan (L7) were all self-correction triggered by an external signal. Reflection is the proactive, general form: a check-my-work step the agent runs on purpose, even when nothing has visibly failed.
- It is the cheap reliability move. To make an agent more reliable you can add a reviewer agent and pay coordination cost (another agent, a handoff, more latency), or you can have the one agent reflect, which costs a single extra reasoning pass and no new coordination. For a large fraction of reliability problems, reflection is the better trade. Try it before reaching for a second agent.
- A second look is not a guarantee. A model confidently wrong on the first pass can be just as confidently wrong on review, rubber-stamping the same mistake. Reflection catches what a critical re-read catches, not what the model is blind to; you cannot proofread a fact you do not know is wrong.
- Two practical consequences. Reflection has diminishing returns (one good pass does most of the work; more passes mostly burn tokens), and it pairs best with a real external signal (run the code, check the source, read the tool result) that brings in information the model did not already have.
What changes for you
Section titled “What changes for you”Before this lesson, “the agent double-checks itself” sounded like a vague reassurance. Now you can see the mechanism and its price: a reflection step is one extra reasoning pass that catches the errors a critical re-read would catch, no more and no less. When you want an agent to be more reliable, you have a clear first move (have it check its own work) and a clear test for when that is not enough (does the error need information or independence the agent does not have?). And you can spot the honest limit the marketing skips: a self-check lowers the error rate, but it does not zero it, and it is strongest when paired with a real signal from the world.