Summary: How agent loops work

An agent is a tool-using LLM that loops. A single function call is just a function call. A model that does multiple function calls in sequence, each chosen by the model based on what came back from the previous, is an agent. The loop is the thing.

Observe, plan, act, repeat. The canonical pattern (from the ReAct paper, named slightly differently across papers) is: read the current state, decide the next concrete step, take it, observe what happened, decide if the goal is met. If not, iterate.
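The loop described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `run_agent`, the three callbacks, and the toy counter environment are all hypothetical names invented here, and `plan` stands in for the model's decision (returning `None` means "goal met").

```python
from typing import Callable, Optional, Tuple


def run_agent(
    observe: Callable[[], dict],
    plan: Callable[[dict], Optional[str]],
    act: Callable[[str], None],
    max_iterations: int = 10,
) -> Tuple[bool, int]:
    """Run observe-plan-act until plan() returns None or the cap is hit.

    Returns (goal_met, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        state = observe()            # read the current state
        next_step = plan(state)      # decide the next concrete step
        if next_step is None:        # the planner judges the goal met
            return True, iteration
        act(next_step)               # take the step, then loop again
    return False, max_iterations     # external limit reached


# Toy environment (an assumption for illustration): a counter that must reach 3.
counter = {"value": 0}


def observe() -> dict:
    return dict(counter)


def plan(state: dict) -> Optional[str]:
    return None if state["value"] >= 3 else "increment"


def act(step: str) -> None:
    if step == "increment":
        counter["value"] += 1


done, used = run_agent(observe, plan, act, max_iterations=10)
print(done, used)  # True 4: three increments, then one iteration that sees the goal is met
```

The iteration cap is part of the loop itself rather than an afterthought: if the planner never returns `None`, the function still terminates and reports that the goal was not met.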

Cumulative error is the dominant constraint. Each step has some failure rate, and the success rates multiply. If steps fail independently, five steps at 95% reliability each give a roughly 77% reliable agent; ten steps give roughly 60%. This is why long-horizon agents are still mostly research, despite the underlying components (function calling, reasoning models) being production-ready.
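The arithmetic behind the 77% and 60% figures is a single exponent, under the simplifying assumption that every step succeeds independently with the same probability. The function names below are made up for this sketch.

```python
def chain_reliability(per_step: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    return per_step ** steps


def required_per_step(target: float, steps: int) -> float:
    """Per-step reliability needed to hit an end-to-end target over `steps` steps."""
    return target ** (1.0 / steps)


for steps in (1, 5, 10, 20):
    print(f"{steps:>2} steps at 95%: {chain_reliability(0.95, steps):.0%}")
# Prints 95%, 77%, 60%, 36% for 1, 5, 10, 20 steps.

# Going the other way: a 10-step task that should succeed 90% of the time
# needs each step to be about 98.95% reliable.
print(f"{required_per_step(0.90, 10):.2%}")
```

Real steps are rarely independent or equally reliable, so treat these numbers as a rough guide to how quickly reliability decays, not as a prediction for any particular system.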

Safety threats matter more for agents than for chat-only LLMs. When an LLM can take actions, it can take actions that should not have happened. Data exfiltration, prompt injection, and tool misuse are real failure modes. Training-stage and inference-stage remediations are required, not optional.
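As one illustration of an inference-stage remediation, the sketch below puts a runtime check between the model and its tools. Everything here is an assumption made for the example: the tool names, the two policy sets, and the `guarded_call` helper are hypothetical and do not come from any particular framework.

```python
from typing import Any, Callable, Dict


class ToolCallBlocked(Exception):
    """Raised when the runtime refuses a tool call the model asked for."""


# Hypothetical policy: reads are always allowed, sends need a human in the loop,
# and anything not listed is refused by default.
READ_ONLY_TOOLS = {"read_file"}
CONFIRM_FIRST_TOOLS = {"send_email"}


def guarded_call(
    registry: Dict[str, Callable[..., Any]],
    name: str,
    args: Dict[str, Any],
    user_confirmed: bool = False,
) -> Any:
    """Execute a tool call only if the policy permits it."""
    if name in READ_ONLY_TOOLS:
        return registry[name](**args)
    if name in CONFIRM_FIRST_TOOLS:
        if not user_confirmed:
            raise ToolCallBlocked(f"{name} needs user confirmation")
        return registry[name](**args)
    raise ToolCallBlocked(f"{name} is not on the allowlist")


# Stub tools so the sketch runs without touching a real filesystem or mailbox.
registry = {
    "read_file": lambda path: f"<contents of {path}>",
    "send_email": lambda to, body: f"sent to {to}",
    "delete_file": lambda path: f"deleted {path}",
}

print(guarded_call(registry, "read_file", {"path": "notes.txt"}))
try:
    guarded_call(registry, "delete_file", {"path": "notes.txt"})
except ToolCallBlocked as err:
    print("blocked:", err)
```

The point of the design is that the check lives outside the model: even if injected text persuades the model to request `delete_file`, the runtime refuses because the tool is not on the allowlist.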

This summary is the scan-it-in-five-minutes version. The full lesson covers the worked example, the multi-agent and agent-to-agent setting, and the practical safety remediations.

Core ideas

Agent definition. A system that autonomously pursues a goal and completes tasks on a user’s behalf, by running a loop of tool calls and reasoning.
The agentic property is the loop. Without iteration, you have a single tool call (covered in the last lesson). With iteration plus reasoning between iterations, you have an agent.
Observe-plan-act. Read state, decide next step, take action, repeat. The ReAct paper uses thought-action-observation; names vary across papers but the shape is constant.
The loop terminates when the observe stage concludes the goal is met, or when an external limit is hit (max-iteration cap, budget cap). Production agents always have a cap.
Worked example (teddy-bear-is-cold). Iteration 1: read temperature. Iteration 2: turn heat up. Iteration 3: report success. Three iterations, two tool calls, one coherent goal pursued across multiple steps.
Multi-agent. Different agents responsible for different domains, communicating with each other. Google’s A2A protocol (2025) is one early standard for how agents expose skills and statuses.
Cumulative error. Each iteration has some failure probability; over many iterations, those compound. The dominant practical constraint on long-horizon agents.
Goal drift. Long-running loops can lose track of the original goal as intermediate steps surface tangential concerns.
Safety threats. Data exfiltration (agent tricked into sending sensitive data), prompt injection (instructions embedded in untrusted text overriding the goal), tool misuse (agent using destructive tools maliciously).
Two classes of remediation. Training-stage (safety data in the SFT/RLHF mixtures, which makes the model more resistant) and inference-stage (a safety classifier that monitors the conversation, plus runtime constraints on what tools can do).
Pitfall: calling everything an agent. The marketing term is overloaded. By the strict definition, only systems that loop with reasoning are agents.
Pitfall: underestimating cumulative error. Multi-step reliability is the product of the per-step reliabilities, so small per-step error rates compound dramatically in long-horizon work.
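The teddy-bear-is-cold example from the list above can be traced in code to show where the "three iterations, two tool calls" count comes from. This is a sketch under stated assumptions: the `Thermostat` class, the 21 °C comfort threshold, and the rule-based `choose_step` (standing in for the model's reasoning) are all invented for illustration.

```python
from typing import List, Optional, Tuple, Union

COMFORTABLE_C = 21.0  # assumed comfort threshold, in Celsius


class Thermostat:
    """Hypothetical tool backend with two tools: read and set."""

    def __init__(self, room_temp: float) -> None:
        self.room_temp = room_temp
        self.target: Optional[float] = None

    def read_temperature(self) -> float:
        return self.room_temp

    def set_heat(self, target: float) -> str:
        self.target = target
        return f"heat set to {target}"


History = List[Tuple[str, Union[float, str]]]


def choose_step(history: History) -> Optional[str]:
    """Stand-in for the model: pick the next tool from what came back so far."""
    if not history:
        return "read_temperature"
    last_tool, last_result = history[-1]
    if last_tool == "read_temperature" and last_result < COMFORTABLE_C:
        return "set_heat"
    return None  # nothing left to do: report back to the user


def run(thermostat: Thermostat, max_iterations: int = 5) -> Tuple[bool, int, int]:
    """Return (reported, iterations_used, tool_calls_made)."""
    history: History = []
    for iteration in range(1, max_iterations + 1):
        step = choose_step(history)
        if step is None:
            return True, iteration, len(history)
        if step == "read_temperature":
            result: Union[float, str] = thermostat.read_temperature()
        else:
            result = thermostat.set_heat(COMFORTABLE_C)
        history.append((step, result))
    return False, max_iterations, len(history)


print(run(Thermostat(room_temp=16.0)))  # (True, 3, 2)
```

Each choice depends on what the previous tool call returned, which is what separates this from two unrelated function calls: if the room is already warm, the same loop stops after one tool call.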

What changes for you

After this lesson, “AI agent” stops being a marketing term and becomes a specific shape you can identify. You can read agentic features by counting steps in their workflows (more steps = more cumulative-error exposure), and you can ask the safety question that actually matters: what is the worst this thing can do if instructed maliciously, and what stops it from doing that? Phase 7 will build on this with a more comprehensive safety recap closing the track.

An agent is a tool-using LLM that loops.
Observe what happened. Plan the next step. Act. Repeat until the goal is met.
Cumulative error and safety are the two things that actually limit how far this can go.