From a chatbot to an agent that gets things done.
Everything so far has been about the model itself — a brilliant next-word predictor. But on its own it can only ever do one thing: read some text and write some text back. An agent is what you build around the model so it can actually act — use tools, take several steps, and keep going until a job is finished.
On its own, an LLM just talks
Hand the raw model a request and it produces one reply, then stops. It can't open a website, run a calculation, look up today's prices, send an email, or remember what you did last week. It doesn't do things — it describes them. Ask it to "book me a table" and the best it can manage is "Sure — here's how you might book one," because writing that sentence is all the machinery allows.
An agent gives it hands
An agent wraps the model in a simple but powerful loop. You hand it a goal; then, over and over, it runs three moves:
Decide the next move
The model looks at the goal and everything so far, and picks one action to take next.
Use a tool
It calls something real — a web search, a calculator, code, your calendar — instead of just writing about it.
Read the result
The tool's answer is fed back into the prompt, so the model can see what happened and choose again.
...and the loop runs again, the model now one step wiser, until the goal is met. Notice what's actually doing the thinking: it's still the same frozen model from every earlier lesson, predicting text. The agent just keeps feeding it the latest results and letting it choose the next step. The intelligence is the LLM; the getting things done is the loop and the tools around it.
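To make the three moves concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a real agent framework: `fake_model` is a hypothetical scripted stand-in for the LLM, and the `Action` type, the `calculator` tool and the `TOOLS` table are names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Action:
    """What the model chose: call a tool, or finish with an answer."""
    tool: Optional[str] = None          # name of the tool to call, if any
    tool_input: str = ""                # text handed to that tool
    final_answer: Optional[str] = None  # set when the goal is met


def calculator(expression: str) -> str:
    """Hypothetical tool: adds or multiplies two integers, e.g. '12 * 7'."""
    left, op, right = expression.split()
    a, b = int(left), int(right)
    return str(a * b if op == "*" else a + b)


# The tools the agent is allowed to use, looked up by name.
TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


def fake_model(prompt: str) -> Action:
    """Scripted stand-in for the LLM (an assumption, not a real model).

    A real agent would send `prompt` to a language model and parse its reply.
    """
    if "Result:" not in prompt:
        # Nothing has been tried yet, so ask for the calculation.
        return Action(tool="calculator", tool_input="12 * 7")
    # A tool result is already in the prompt, so read it and finish.
    result = prompt.rsplit("Result: ", 1)[1].strip()
    return Action(final_answer=f"The answer is {result}.")


def run_agent(goal: str, max_steps: int = 5) -> str:
    prompt = f"Goal: {goal}\n"
    for _ in range(max_steps):
        action = fake_model(prompt)                     # 1. decide the next move
        if action.final_answer is not None:
            return action.final_answer                  # goal met: stop looping
        result = TOOLS[action.tool](action.tool_input)  # 2. use a tool
        # 3. read the result: append it to the prompt for the next round
        prompt += f"Called {action.tool}({action.tool_input!r})\nResult: {result}\n"
    return "Stopped: step limit reached."


print(run_agent("What is 12 times 7?"))  # prints: The answer is 84.
```

The model function never changes between rounds. Only the prompt grows, which is how the same frozen predictor ends up "one step wiser" each time. The `max_steps` cap is a design choice of this sketch so that a confused model cannot loop forever.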
One goal, many steps, real tools
Goal: find a well-reviewed Italian place open tonight and put dinner on the calendar. The agent works through it in a loop: think, use a tool, read the result, and go again.
A plain chatbot would stop after the first reply — it could only tell you which places to try. The agent keeps looping and actually books one.
LLM vs agent, side by side
LLM: a mind that can only speak
One turn in, one turn out. Text only, no actions. Stuck with its frozen knowledge. Forgets everything once the chat ends.
Agent: that mind, with hands and a to-do list
Many turns, on its own. Uses tools and takes real actions. Can fetch live, specific information. Keeps notes as it works, and pushes on until the goal is done.
So why not just use the LLM?
Because most useful jobs aren't "write me a paragraph" — they're "go and accomplish this," which takes steps, fresh information, and actions in the real world. A lone model can't reach any of that. A few everyday examples:
"Summarise my unread emails"
Needs to actually open your inbox (a tool) and read it, then write the summary.
"Cheapest flight on Friday?"
Needs a live search right now — not the stale snapshot frozen into its training.
"Fix this failing test"
Read the code, run the test, edit, run it again — a loop, not a one-shot answer.
Each one is impossible for a model that can only write a single reply, and natural for one wrapped in a loop with the right tools.
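The "Fix this failing test" case is the easiest one to sketch, because the loop is so visible: run the test, edit, run it again. The Python below is a toy under heavy assumptions. `propose_fix` is a hypothetical stand-in for the model that simply hands back pre-written candidate edits, and the buggy `add` function and its one test are made up for the example.

```python
BUGGY_SOURCE = "def add(a, b):\n    return a - b\n"

# Hypothetical edits a model might propose, one per attempt.
CANDIDATE_FIXES = [
    "def add(a, b):\n    return a * b\n",  # a wrong guess
    "def add(a, b):\n    return a + b\n",  # the correct fix
]


def run_test(source: str) -> str:
    """Tool: load the code, run one test, and report 'pass' or what went wrong.

    exec() is acceptable for this toy; a real agent should run code in a sandbox.
    """
    namespace = {}
    exec(source, namespace)
    got = namespace["add"](2, 3)
    return "pass" if got == 5 else f"fail: add(2, 3) returned {got}, expected 5"


def propose_fix(attempt: int, last_result: str) -> str:
    """Stand-in for the model: a real one would read the code and `last_result`."""
    return CANDIDATE_FIXES[attempt]


def fix_failing_test(source: str):
    """Keep editing and re-running until the test passes or the ideas run out."""
    result = run_test(source)          # run the test
    history = [result]
    for attempt in range(len(CANDIDATE_FIXES)):
        if result == "pass":
            break
        source = propose_fix(attempt, result)  # edit
        result = run_test(source)              # run it again
        history.append(result)
    return source, history


fixed_source, history = fix_failing_test(BUGGY_SOURCE)
for line in history:
    print(line)
```

The first proposed edit is still wrong, and the only way to find that out is to run the test again and read the result. A single reply has no second chance; the loop does.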
And that's the whole core picture — from a single next-word guess to an assistant that can act on your behalf. Two optional extras come next for the curious — how a model goes beyond text, and how it answers from your own documents — and then a glossary to keep.