At chat time

It's frozen — so how does it use what you tell it?

Training is over; the dials never change again while you chat. Everything the model does for you happens at inference — and it works only from the text right in front of it: a hidden system prompt, the conversation so far, and your latest message.

First, the big picture

It learns once — then it's frozen for every chat after

Nearly every mix-up about AI comes from blurring these two moments. All the learning happens before you ever open the app; from then on the model is sealed, and each conversation only reads it.

Happens once, before launch

Training

Reads a huge slice of the internet
Tunes billions of dials, trillions of times
Months on tens of thousands of GPUs
Costs tens to hundreds of millions

then every dial is locked in place

Happens on every message you send

Chat time

Reads only the text in front of it
The dials never move — it learns nothing new
Answers in seconds, on a sliver of the hardware
Costs cents, not millions

keeps nothing once the chat is closed

This is why "is it learning from me?" has a simple answer: no. You're reading a finished book, not writing one. (The makers may later use saved chats to train a future model — but the version you're talking to never changes.)

Its only memory is the conversation

The model has no memory between messages of its own. So every turn, the app feeds it the entire conversation again, and it re-reads the whole thing to produce the next reply. The amount it can hold at once is the context window. When a chat grows longer than that window, the earliest parts slide off the edge and are simply forgotten — which is why a very long conversation can start to "lose the thread."

The context window

Watch the window slide

The model can only hold so much at once. Add messages and watch the oldest slide out of the window — and get forgotten. Keep going until you ask it to recall the first city.

what the model can see now

In-context learning

Teach it a pattern in the prompt — no training needed

A remarkable trick: drop a couple of examples into your message and it picks up the pattern on the spot, without changing a single dial. Compare:

No examples

With 2 examples

You

Write a tagline for: a water bottle that keeps drinks cold for 24 hours.

Assistant

"Stay Cool: the water bottle that keeps your drinks cold for 24 hours." — fine, but it guessed the style.

You

Earbuds → "Silence, meet sound."
Running shoes → "Every street is a finish line."
Now do: a water bottle that keeps drinks cold for 24 hours.

Assistant

"Cold all day. No excuses." — it copied your punchy, two-beat style from the examples, instantly.

This is why "show an example" is such a powerful prompting move — more on that in the prompting lesson coming up.

Reaching beyond what it was trained on

A frozen model would be stuck in the past and bad at exact facts. So apps bolt on extra abilities — and crucially, these feed information into the prompt rather than changing the model:

Tools

Search, calculator, code

The model can call out to a web search, a calculator, or run code, then read the result back — how it answers about today or does exact math.

Your documents

Retrieval (RAG)

It looks things up: relevant pages from your files or a database are fetched and pasted into the prompt, so it can answer from specific, current sources — and cite them.

Thinking longer

Reasoning models

Newer models first write a hidden "scratchpad" of steps before answering. Spending more effort at answer time helps a lot on hard, multi-step problems.

That's the whole journey: built from layers of dials, trained on the internet, tuned into an assistant, and run on your words. Now for the honest part — where all of this goes wrong.