Cheatsheet: How agent loops work

An agent is a tool-using LLM that loops.
Without iteration, you have a single tool call.
With iteration + reasoning between iterations, you have an agent.
USER GOAL
→ OBSERVE: read current state (goal, tool responses, prior reasoning)
PLAN: decide next step. Is goal met? If not, which tool with which args?
ACT: take the step (often a function call)
[Loop until OBSERVE concludes goal is met, OR max-iter cap fires]
FINAL RESPONSE

Naming variants: the ReAct paper interleaves thought, action, and observation. Other papers use different names and orderings. Names vary; the shape (read → decide → act → iterate) is constant.
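The loop above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `Step`, `run_agent`, `toy_plan`, and the `increment` tool are hypothetical names, and the planner is a plain function standing in for the LLM.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    done: bool                                 # True when the planner judges the goal met
    tool: str = ""                             # tool to call when not done
    args: dict = field(default_factory=dict)   # arguments for that tool
    answer: str = ""                           # final response when done


def run_agent(goal, plan, tools, max_iters=5):
    """Observe -> plan -> act, repeated until the goal is met or the cap fires."""
    history = []                                # tool responses seen so far
    for _ in range(max_iters):
        step = plan(goal, history)              # OBSERVE + PLAN
        if step.done:
            return step.answer                  # FINAL RESPONSE
        result = tools[step.tool](**step.args)  # ACT
        history.append((step.tool, step.args, result))
    return "Stopped: iteration cap reached before the goal was met."


def toy_plan(goal, history):
    # Stand-in for the LLM: keep calling `increment` until the value reaches 3.
    value = history[-1][2] if history else 0
    if value >= 3:
        return Step(done=True, answer=f"Reached {value}")
    return Step(done=False, tool="increment", args={"x": value})


tools = {"increment": lambda x: x + 1}
print(run_agent("count to 3", toy_plan, tools, max_iters=10))
```

The `max_iters` cap is what keeps a planner that never declares success from looping forever.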

USER: "My teddy bear is cold. Please do something."
ITER 1:
Observe: bear is cold. Need to know temperature.
Plan: call get_current_room_temperature()
Act: tool returns {"temperature_f": 65, "thermostat_state": "off"}
ITER 2:
Observe: 65°F is cold. Thermostat off. Need to turn heat up.
Plan: call set_thermostat(target_f=70)
Act: tool returns {"temperature_f_target": 70, "status": "heating"}
ITER 3:
Observe: heat is on, target is 70. Goal essentially met.
Plan: compose final response.
Act: return natural-language answer.
OUTPUT: "I've set the thermostat to 70°F and the heat is on..."

Three iterations. Two tool calls. One coherent goal pursued across multiple steps.
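The trace above can be replayed with stub tools. In this sketch the tool bodies are fakes that return the same payloads as the trace, `scripted_plan` is a hand-written stand-in for the model's plan step, and the 68°F "cold" threshold is an assumption made for illustration.

```python
# Hypothetical stub tools; a real deployment would call a thermostat API.
state = {"temperature_f": 65, "thermostat_state": "off"}


def get_current_room_temperature():
    return dict(state)


def set_thermostat(target_f):
    state["thermostat_state"] = "heating"
    return {"temperature_f_target": target_f, "status": "heating"}


TOOLS = {
    "get_current_room_temperature": get_current_room_temperature,
    "set_thermostat": set_thermostat,
}


def scripted_plan(history):
    """Stand-in for the LLM: return (tool_name, args), or None when the goal is met."""
    if not history:
        return "get_current_room_temperature", {}
    last = history[-1]
    # Assumed threshold: anything under 68°F counts as cold.
    if "temperature_f" in last and last["temperature_f"] < 68:
        return "set_thermostat", {"target_f": 70}
    return None


def run(max_iters=5):
    history, tool_calls = [], 0
    for iteration in range(1, max_iters + 1):
        decision = scripted_plan(history)
        if decision is None:
            target = history[-1]["temperature_f_target"]
            answer = f"I've set the thermostat to {target}°F and the heat is on."
            return answer, iteration, tool_calls
        name, args = decision
        history.append(TOOLS[name](**args))
        tool_calls += 1
    return "Stopped at the iteration cap.", max_iters, tool_calls


answer, iterations, tool_calls = run()
print(answer, iterations, tool_calls)
```

The third iteration makes no tool call; it only decides that the goal is met and composes the answer.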

| Per-step reliability | Over 5 steps | Over 10 steps | Over 20 steps |
|---|---|---|---|
| 95% (0.95) | 77% | 60% | 36% |
| 99% (0.99) | 95% | 90% | 82% |

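Compounding per-step error is the dominant constraint on long-horizon agents. If steps succeed independently, total reliability is the per-step reliability raised to the number of steps. The flip side is that per-step model improvements also compound: raising step reliability from 95% to 99% takes a 10-step task from 60% to 90% total reliability.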
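The reliability figures can be reproduced with a short calculation. This sketch assumes every step succeeds or fails independently; `task_reliability` is an illustrative helper name.

```python
import math


def task_reliability(step_success_rates):
    """Probability that every step succeeds, assuming independent steps."""
    return math.prod(step_success_rates)


for p in (0.95, 0.99):
    for n in (5, 10, 20):
        print(f"per-step {p:.2f}, {n:>2} steps: {task_reliability([p] * n):.0%}")
```

Passing a list rather than a single rate means the same helper also covers tasks whose steps have different reliabilities.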

USER ←→ AGENT_1 (temperature)
↕ (A2A protocol)
AGENT_2 (energy)
AGENT_3 (security)

Google’s Agent-to-Agent (A2A) protocol (2025) is one early standard. It defines how agents expose: skills, examples, request/response/cancel methods.

Specifics evolving. Framing (standardize how agents talk) is durable.
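To make the idea of "exposing skills" concrete, here is a toy registry in Python. It is not the A2A schema: `AgentDescriptor`, its fields, and `find_agent` are invented for illustration, and only the general idea (agents advertise skills with examples so that others can route requests) comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class AgentDescriptor:
    name: str
    skills: dict  # skill name -> example request (illustrative fields only)


REGISTRY = [
    AgentDescriptor("AGENT_1", {"temperature": "Warm the living room to 70°F"}),
    AgentDescriptor("AGENT_2", {"energy": "Report today's electricity usage"}),
    AgentDescriptor("AGENT_3", {"security": "Check that the front door is locked"}),
]


def find_agent(skill, registry=REGISTRY):
    """Return the name of the first agent advertising `skill`, or None."""
    for agent in registry:
        if skill in agent.skills:
            return agent.name
    return None


print(find_agent("energy"))
```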

| Threat | What can happen | Remediation class |
|---|---|---|
| Data exfiltration | Agent with user data + outbound tool tricked into sending data to attacker | Training-stage + inference-stage |
| Prompt injection | Untrusted text (webpage, email) contains instructions overriding the user goal | Training-stage + inference-stage |
| Tool misuse | Agent with destructive tool (delete, send, pay) pushed into using it | Inference-stage runtime limits |
TRAINING-STAGE
→ Safety data in SFT and RLHF mixtures
→ Model is more resistant to adversarial prompts
→ Model more inclined to refuse high-stakes actions
INFERENCE-STAGE
→ Safety classifier monitors conversation
→ Flags or blocks unsafe tool calls before execution
→ Runtime constraints on tools (rate limits, scope, confirmations)

Both required for production agents. Neither is optional.
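The inference-stage runtime constraints (scope, confirmations, rate limits) can be sketched as a gate that every tool call passes through before execution. `ToolGuard` and the tool names are hypothetical, and a production system would pair a gate like this with a safety classifier rather than rely on it alone.

```python
import time


class ToolGuard:
    """Checks a proposed tool call against scope, confirmation, and rate limits."""

    def __init__(self, allowed, needs_confirmation, max_calls, window_s=60.0):
        self.allowed = set(allowed)                        # tool access scope
        self.needs_confirmation = set(needs_confirmation)  # high-stakes tools
        self.max_calls = max_calls                         # calls per window
        self.window_s = window_s
        self._stamps = []                                  # times of allowed calls

    def check(self, tool, confirmed=False, now=None):
        now = time.monotonic() if now is None else now
        if tool not in self.allowed:
            return "block: tool out of scope"
        if tool in self.needs_confirmation and not confirmed:
            return "block: needs user confirmation"
        # Keep only the calls that still fall inside the sliding window.
        self._stamps = [t for t in self._stamps if now - t < self.window_s]
        if len(self._stamps) >= self.max_calls:
            return "block: rate limit"
        self._stamps.append(now)
        return "allow"


guard = ToolGuard(
    allowed={"read_temp", "set_thermostat"},
    needs_confirmation={"set_thermostat"},
    max_calls=2,
)
print(guard.check("delete_account"))
print(guard.check("set_thermostat"))
print(guard.check("set_thermostat", confirmed=True))
```

The check runs before the tool executes, so a blocked call never reaches the destructive action.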

| Failure mode | What it looks like |
|---|---|
| Cumulative error | Long-horizon tasks fail because per-step error compounds |
| Goal drift | Loop loses track of original goal as intermediate steps surface tangents |
| Loop divergence | Agent that’s not making progress keeps looping (until cap fires) |
| Latency | Each iteration = at least 1 LLM call + 1 tool call. Multi-step tasks take time. |
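Loop divergence can sometimes be caught before the iteration cap fires. One simple heuristic, offered here as an assumption rather than a standard technique, is to stop when the last few tool calls are identical; `is_stalled` is a hypothetical helper.

```python
def is_stalled(calls, window=3):
    """True when the last `window` tool calls are identical (same tool, same args)."""
    if len(calls) < window:
        return False
    recent = calls[-window:]
    return all(call == recent[0] for call in recent)


calls = [("search", {"q": "thermostat manual"})] * 3
print(is_stalled(calls))
```

This only detects exact repeats. An agent that cycles through slightly different calls without making progress would need a stronger progress signal.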

First question: how many steps does it run, and what is each step?

  • Few steps + sequential: probably a useful tool-using feature that applies the “agent” label loosely
  • Many steps + branching based on outputs: real agent in the strict sense

Second question: what is the worst this thing can do if instructed maliciously, and what stops it from doing that?

  • Tool access scope
  • Inference-stage runtime limits
  • Worst-case tool capability
| Pitfall | Reality |
|---|---|
| “Calling everything an AI agent.” | The strict definition is loop + reasoning + tool use. Many “agents” are LLM-plus-prompt with one or two tool calls. |
| “Underestimating cumulative error.” | The error multiplier is real and dominant. 5% per-step error over 10 steps is 40% total failure rate. |
| “Treating agent safety as an afterthought.” | Tool access = capability for misuse. Training-stage + inference-stage remediations are part of the design. |
| “Assuming an agent will reliably stay on the original goal.” | Goal drift is common. Long loops can wander; production agents need explicit goal-anchoring patterns. |
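One possible goal-anchoring pattern is to restate the original goal verbatim at the top of every iteration's prompt and keep only the most recent observations. The sketch below is an assumption about how that might look, not a pattern named by the source; `build_prompt` and its wording are hypothetical.

```python
def build_prompt(goal, history, keep_last=3):
    """Pin the original goal at the top and include only recent observations."""
    lines = [f"ORIGINAL GOAL (do not change): {goal}"]
    lines += [f"Observation: {item}" for item in history[-keep_last:]]
    lines.append("Does the next step serve the ORIGINAL GOAL? If not, drop it.")
    return "\n".join(lines)


history = ["room is 65°F", "thermostat is off", "found a blanket ad", "heat is on"]
print(build_prompt("Warm up the teddy bear", history))
```

Because the goal line is rebuilt from the original string each time, tangents in the history cannot overwrite it.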
  • Agent: tool-using LLM that loops, autonomously pursuing a goal across multiple iterations.
  • Observe-plan-act: canonical agent loop pattern. Naming varies (ReAct: thought-action-observation); shape constant.
  • ReAct: Reason + Act. The 2022 paper that introduced the agent-loop pattern. Name-only treatment in this lesson.
  • A2A protocol: Google’s Agent-to-Agent communication standard, released 2025. Specifies how agents expose skills and statuses.
  • Cumulative error: total failure probability over a multi-step agent task. Equals 1 minus product of per-step success rates.
  • Goal drift: agent loses track of the original goal as intermediate steps surface tangential concerns.
  • Data exfiltration: safety threat where an agent is tricked into sending sensitive data to an attacker.
  • Prompt injection: safety threat where untrusted text contains instructions designed to override the user’s goal.
  • Tool misuse: safety threat where an agent with a destructive tool is tricked into using it.

An agent is a tool-using LLM that loops.
Observe what happened. Plan the next step. Act. Repeat until the goal is met.
Cumulative error and safety are the two things that actually limit how far this can go.