What makes an AI an 'agent'

“AI agents will change everything.” You have probably heard the phrase, in a product demo or a colleague’s pitch, and quietly wondered what an agent actually is beyond the marketing. The word is doing a lot of work in a lot of sentences right now, and most of those sentences do not stop to define it.

Here is a definition that holds up. An agent is a model wrapped in a loop: it looks at a goal, decides whether a tool would help, calls the tool, reads the result, and repeats until the goal is met or it gives up. That is the whole idea. Everything else in this track (memory, planning, multi-agent systems, security) is detail layered on top of that one loop. Get the loop solid and the rest of the landscape stops being mysterious.

By the end of this lesson you will be able to define an agent precisely, separate it cleanly from the chatbot you already use, and take apart any agent product into the parts that actually make it work.

The chatbot you already know

Start with what an agent is not.

When you type a question into a chat interface and read the answer, you are using a model in its simplest configuration: text goes in, text comes out, once. You ask, it answers, the exchange ends. There is no loop. The model does not take any action in the world beyond producing words on your screen. It cannot look anything up, run anything, or check anything. It predicts a good response to your text and stops.

This is genuinely useful, and for a huge range of tasks it is all you need. But it has a hard edge, and the edge is easy to find. Ask a plain chatbot “What is the weather in Seattle tomorrow?” and the best it can do is tell you, correctly, that it cannot check live data; a less careful one will guess. Either way, it has no way to reach outside its own text prediction. The conversation is a sealed room. The model can only hand you words built from what it learned during training.

That sealed room is the thing an agent breaks open.

The agent: a perceive-decide-act loop

An agent takes the same underlying model and wraps it in a loop with three repeating moves:

Perceive. Read the current state: the goal, plus whatever results have come back from previous actions.
Decide. Choose what to do next. Often this is “call a specific tool with these inputs.” Sometimes it is “I have enough to answer now.”
Act. Execute the chosen action, observe the result, and feed that result back into the next perceive step.

The loop runs until the goal is met or the agent decides it cannot proceed. That repetition is the entire difference. A single model call is one shot. An agent is the same kind of model called over and over, each time with more information than the last, steering toward a goal.
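The three moves can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product implements it: toy_model is a hard-coded stand-in for a real language model, lookup is a stub tool that returns canned text, and the step cap is an assumed safeguard so the loop always ends.

```python
def toy_model(goal, observations):
    """Hypothetical stand-in for a language model (hard-coded rules)."""
    if not observations:
        # Nothing is known yet, so ask for a tool call.
        return {"type": "tool", "name": "lookup", "args": {"query": goal}}
    # A result has come back, so answer with it.
    return {"type": "answer", "text": f"Found: {observations[-1]}"}


def lookup(query):
    # Stub tool: returns canned text instead of touching the outside world.
    return f"stub data for '{query}'"


def run_agent(goal, model, tools, max_steps=5):
    observations = []
    for _ in range(max_steps):
        # Perceive + decide: the model sees the goal and every result so far.
        action = model(goal, observations)
        if action["type"] == "answer":
            return action["text"]
        # Act: run the chosen tool and keep its result for the next pass.
        tool = tools[action["name"]]
        observations.append(tool(**action["args"]))
    return "Gave up: step limit reached."


print(run_agent("weather in Seattle tomorrow", toy_model, {"lookup": lookup}))
```

Each pass calls the same model function again, only with a longer list of observations, which is the repetition described above.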

Worked example: the same question, two systems

Take the weather question and run it through both.

Chatbot. Input: “What is the weather in Seattle tomorrow?” The model predicts a response. The best honest response it can produce is a refusal: it has no live data. Output: “I cannot check live weather.” One step, done.

Agent. Same input, but now the model sits inside a loop and has been told it can call a get-weather tool.

PERCEIVE: goal is "weather in Seattle tomorrow"; no results yet.
DECIDE:   I cannot answer from memory. I have a weather tool. Call it.
ACT:      get_weather(city="Seattle", day="tomorrow")
          -> tool returns: { high: 58F, low: 47F, condition: "rain" }
PERCEIVE: goal is the same; I now have a tool result.
DECIDE:   I have what I need. Compose the answer.
ACT:      reply: "Tomorrow in Seattle: rain, high 58F, low 47F."

Same input. Completely different process. The agent did not know the weather any more than the chatbot did. It knew it could go get the weather, and the loop let it.
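The contrast can be made concrete with a toy sketch in which one model function is used two ways: called once with no tools, and called repeatedly inside a loop with a tool. Everything here is assumed for illustration: the model is a set of hard-coded rules, and get_weather is a stub that returns the canned values from the trace rather than live data.

```python
def get_weather(city, day):
    # Stub tool: canned values from the worked example, not live data.
    return {"high": "58F", "low": "47F", "condition": "rain"}


def model(question, tool_result=None, has_tools=False):
    """Toy stand-in for a language model; the rules are hard-coded."""
    if tool_result is not None:
        return {"reply": "Tomorrow in Seattle: {condition}, high {high}, "
                         "low {low}.".format(**tool_result)}
    if has_tools:
        return {"tool": "get_weather",
                "args": {"city": "Seattle", "day": "tomorrow"}}
    return {"reply": "I cannot check live weather."}


def chatbot(question):
    # One call, no tools, no loop.
    return model(question)["reply"]


def agent(question, max_steps=3):
    tool_result = None
    for _ in range(max_steps):  # small step cap so the loop always ends
        out = model(question, tool_result, has_tools=True)
        if "reply" in out:
            return out["reply"]
        tool_result = get_weather(**out["args"])
    return "I could not finish."


question = "What is the weather in Seattle tomorrow?"
print(chatbot(question))  # I cannot check live weather.
print(agent(question))    # Tomorrow in Seattle: rain, high 58F, low 47F.
```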

What actually makes it “agentic”

Here is the part that trips people up. The model inside an agent is not a special “agent model.” It is broadly the same kind of language model, still doing the same thing it always does: predicting text. Swap in a stronger model and the agent gets better decisions, but its structure does not change. So if the intelligence is just the same old model, where does the “agency” come from?

It comes from the scaffolding around the model. Take apart any agent and you find four parts:

The model. The decision-maker. Given the current state as text, it predicts the next move.
The system prompt. Instructions that tell the model what it is doing and, crucially, that it is allowed to call tools, plus how to format a tool call.
The tools. Functions the model can invoke: read a file, query a database, send an email, search the web. Each tool is a door from the sealed room to the outside world.
The loop. Plain code on the outside that reads the model’s output, notices when it contains a tool call, actually runs that tool, and feeds the result back into the model for the next round.
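To see how plain the loop can be, here is a minimal sketch of the outside code. The tool-call format (a line starting with TOOL followed by JSON), the word_count tool, and the scripted model are all invented for this example; real systems each define their own formats.

```python
import json

# System prompt: tells the model that tools exist and how to format a call.
SYSTEM_PROMPT = (
    "You may call tools. To call one, reply with a single line: "
    'TOOL {"name": <tool name>, "args": {...}}. Otherwise reply in plain text.'
)


def word_count(text):
    return len(text.split())


# Tools: plain functions the loop is willing to run on the model's behalf.
TOOLS = {"word_count": word_count}


def scripted_model(transcript):
    """Toy stand-in for a model: returns text based on the transcript so far."""
    if not any(line.startswith("RESULT") for line in transcript):
        return 'TOOL {"name": "word_count", "args": {"text": "the quick brown fox"}}'
    return f"The text has {transcript[-1].split()[-1]} words."


def loop(goal, model, max_steps=4):
    transcript = [SYSTEM_PROMPT, f"GOAL {goal}"]
    for _ in range(max_steps):
        output = model(transcript)
        if not output.startswith("TOOL "):
            return output  # plain text means the model is done
        # Notice the tool call, run the tool, and feed the result back.
        call = json.loads(output[len("TOOL "):])
        result = TOOLS[call["name"]](**call["args"])
        transcript.append(output)
        transcript.append(f"RESULT {result}")
    return "Stopped: step limit reached."


print(loop("count the words", scripted_model))  # The text has 4 words.
```

The model only ever produces text. It is the loop that turns a line of that text into an actual function call.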

Worked example: taking a coding agent apart

A coding assistant is a clean specimen because its parts are easy to name. Products like Claude Code, Cursor, and GitHub Copilot, when used in their agent modes, are agents in exactly this sense: each reads your code, decides what to change, proposes edits, runs tests, and reacts to the results. The track’s practice exercises run in Clawless, an operating system for AI that applies the same loop-with-tools shape to everyday AI work beyond coding.

Model:         a general language model, unchanged from the chat version
System prompt: "You are a coding assistant. You can call read_file,
               write_file, and run_tests. Format calls as ..."
Tools:         read_file(path), write_file(path, contents), run_tests()
Loop:          outside code that executes each tool call and returns
               the output (file contents, test pass/fail) to the model

Notice what happens if you swap the model for a different one. The system prompt, the tools, and the loop are untouched. The agent still has the same shape and the same powers. What changes is the quality of its decisions: a stronger model writes better edits and misreads test output less often. The agency lives in the scaffolding; the intelligence lives in the model. Keeping those two ideas separate is the single most useful habit for reasoning about agents.
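A small sketch of that swap, under loud assumptions: the Agent class, both model functions, and the stub run_tests tool are hypothetical, and the "weaker" and "stronger" models are hard-coded rules rather than real language models.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Agent:
    model: Callable      # decides the next move
    system_prompt: str   # tells the model which tools exist
    tools: dict          # tool name -> function

    def run(self, goal, max_steps=5):
        # The loop: run each requested tool and hand the result back.
        results = []
        for _ in range(max_steps):
            move = self.model(self.system_prompt, goal, results)
            if "reply" in move:
                return move["reply"]
            results.append(self.tools[move["tool"]]())
        return "Gave up: step limit reached."


def run_tests():
    # Stub tool with a canned result.
    return "2 passed, 1 failed"


def weaker_model(system_prompt, goal, results):
    if not results:
        return {"tool": "run_tests"}
    return {"reply": "Tests ran."}  # ignores what the output said


def stronger_model(system_prompt, goal, results):
    if not results:
        return {"tool": "run_tests"}
    return {"reply": f"Test run reported: {results[-1]}."}


first = Agent(
    model=weaker_model,
    system_prompt="You are a coding assistant. You can call run_tests.",
    tools={"run_tests": run_tests},
)
second = replace(first, model=stronger_model)  # swap only the model

print(first.run("check the build"))   # Tests ran.
print(second.run("check the build"))  # Test run reported: 2 passed, 1 failed.
```

The prompt, the tools, and the loop are shared by both agents. Only the reply changes, because only the decision-maker changed.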

Tracing the loop on a real task

The weather example needed only one tool call. The interesting tasks need many. Watch the loop carry a multi-step goal: “Book a 30-minute meeting with Sarah sometime next week.”

PERCEIVE: goal is to book a 30-min meeting with Sarah next week.
DECIDE:   I need Sarah's availability. Use the calendar tool.
ACT:      check_calendar(person="Sarah", range="next week")
          -> free: Tue 2:00-3:00pm, Thu 10:00-11:00am
PERCEIVE: Sarah is free Tue afternoon and Thu morning.
DECIDE:   I should check my own calendar for a clash.
ACT:      check_calendar(person="me", range="next week")
          -> busy: Thu 10:00-10:30am
PERCEIVE: Thu morning partly conflicts; Tue 2pm is clear for both.
DECIDE:   Propose Tue 2:00-2:30pm. Send the invite.
ACT:      send_invite(with="Sarah", start="Tue 2:00pm", mins=30)
          -> invite sent and accepted
PERCEIVE: invite is confirmed.
DECIDE:   Goal met. Tell the user.
ACT:      reply: "Booked: 30 min with Sarah, Tuesday 2:00pm."

Four trips around the loop, two different tools, and one change of course when a conflict showed up. Each pass turned on a decision the model made: call the calendar because Sarah’s availability was not yet known, check the user’s own calendar because a conflict might exist, propose Tuesday once both calendars showed it clear. No single step is clever. The power is in the repetition: each pass starts from a richer picture than the last, and the agent keeps going until the goal is actually done, not just described.
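The same trace can be replayed as a runnable sketch. Note the assumptions: decide is a set of hand-written rules standing in for the model's decisions, the two tools are stubs returning the values from the trace, slots are reduced to start times, and the parameter names are illustrative rather than taken from any real calendar API.

```python
# Stub calendar data matching the trace (start times of 30-minute slots).
SARAH_FREE = ["Tue 2:00pm", "Thu 10:00am"]
MY_BUSY = ["Thu 10:00am"]


def check_calendar(person):
    # Stub tool: Sarah's free slots, or the user's busy slots.
    return SARAH_FREE if person == "Sarah" else MY_BUSY


def send_invite(invitee, start, mins):
    # Stub tool: pretends the invite went out and returns the booked start.
    return start


def decide(state):
    """Hand-written rules standing in for the model's decide step."""
    if "sarah_free" not in state:
        return "check_calendar", {"person": "Sarah"}
    if "my_busy" not in state:
        return "check_calendar", {"person": "me"}
    if "invite" not in state:
        slot = next(s for s in state["sarah_free"] if s not in state["my_busy"])
        return "send_invite", {"invitee": "Sarah", "start": slot, "mins": 30}
    return "reply", {"text": f"Booked: 30 min with Sarah, {state['invite']}."}


def run(max_steps=6):
    state, trace = {}, []
    for _ in range(max_steps):
        name, args = decide(state)  # perceive the state, decide the move
        trace.append(name)
        if name == "reply":
            return args["text"], trace
        if name == "check_calendar":  # act, then store the observation
            key = "sarah_free" if args["person"] == "Sarah" else "my_busy"
            state[key] = check_calendar(**args)
        else:
            state["invite"] = send_invite(**args)
    return "Gave up: step limit reached.", trace


answer, trace = run()
print(answer)      # Booked: 30 min with Sarah, Tue 2:00pm.
print(len(trace))  # 4
```

The state dictionary grows on every pass, and each decision depends only on what has been gathered so far.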

Agents did not start with language models

It is worth knowing that the idea of an agent is much older than the current wave. Berkeley’s LLM Agents course covers this history in Lecture 2, given by Shunyu Yao of OpenAI, a researcher known for ReAct and Tree of Thoughts. Long before language models, AI already had agents that perceived, decided, and acted in some form. Rule-based conversational programs date to the 1960s (ELIZA). Symbolic planners that searched for a sequence of actions to reach a goal date to the 1970s (STRIPS). Reactive robotics that tied perception directly to action came in the 1980s (Rodney Brooks’ subsumption architecture).

What changed with language models is the “decide” step. For decades, deciding what to do next meant hand-coded rules or an explicit search through possible action sequences, both of which are brittle and expensive to build. A language model replaces that with decisions drawn from everything it learned in training. The agent’s decision-making went from logic you had to write by hand to language the model generates on the fly. That shift is why agents suddenly work on messy, open-ended tasks that older approaches could not touch.

When an agent is the right tool

A loop with tools is more powerful than a single call, and also more expensive, slower, and harder to make reliable. So the honest question is not “could this be an agent” but “should it be.” An agent earns its complexity when a task is open-ended (the steps cannot be fully scripted in advance), multi-step (it needs several tool calls across several turns, not one lookup), or improvable (you want it to use feedback and results to do better as it goes). A task that is none of those, a single fixed lookup or a one-shot generation, is usually better served by a plain model call. Reaching for an agent when a single call would do is the most common way teams make a simple problem slow and flaky.

Common pitfalls

Thinking the model “is” the agent. The model is one of four parts. Strip away the tools and the loop and you are back to a chatbot. The agent is the whole assembly, not the model alone.
Believing an “agent model” exists. There is no separate species of model that is agentic. The same model is agentic when it runs inside a loop with tools and is not when it does not. Agency is a property of the system, not the model.
Confusing more autonomy with more intelligence. An agent that takes many actions is not smarter than one that takes few; it just runs its loop longer, often with more tools. Capability and autonomy are different axes.
Assuming every task wants an agent. The loop costs latency, money, and reliability. For a single fixed step, a plain model call is faster and more dependable. Match the tool to the task.

What you should remember

An agent is a model wrapped in a perceive-decide-act loop with tools. It looks at a goal, decides whether a tool helps, acts, observes, and repeats until done.
The loop is the whole difference from a chatbot. A chatbot is one shot, text in and text out, sealed off from the world. An agent reaches outside through tools and iterates.
Four parts make a system agentic: the model (decides), the system prompt (grants tool use), the tools (reach the world), and the loop (runs tools and feeds results back). The agency is in the scaffolding; the intelligence is in the model.
The idea is old; the “decide” step is what changed. Language models replaced hand-coded rules and explicit search with decisions generated from training, which is why agents now handle open-ended tasks.
Agents earn their cost on open-ended, multi-step, or improvable tasks. For a single fixed step, a plain model call is the better tool.

The next lesson opens up the loop and looks closely at the move that powers it: how tool use specifically turns a model into an agent. We will trace a model emitting a tool call, reading the result, and choosing the next step, so the mechanism behind every example here becomes concrete.