Context & attention — How LLMs Work

Reading the context

The model looks at every other word to understand each one.

Words change meaning depending on their neighbours. The breakthrough that made modern LLMs possible — the Transformer, from 2017 — is built around a mechanism called attention: for every word, the model decides which other words are worth "paying attention to" right now.
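As a rough sketch of how "deciding which words are worth paying attention to" can be computed for a single word: the Transformer scores one word against the others with a scaled dot product, then uses a softmax to turn the scores into weights that sum to 1. The two-number vectors, the three-word list, and the `attention_weights` helper below are hypothetical, made up for illustration. Real models learn far larger vectors.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Score one query against every key (scaled dot product), then softmax."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)

# Hypothetical toy vectors, invented for this example only.
words = ["animal", "street", "it"]
keys = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
query_for_it = [3.0, 0.0]  # what "it" is looking for

weights = attention_weights(query_for_it, keys)
for word, weight in zip(words, weights):
    print(f"{word:>6}: {weight:.2f}")
```

With these made-up numbers, the query for "it" lines up best with the key for "animal", so "animal" receives the largest weight.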

Watch the focus shift

What does "it" refer to?

Same sentence, one word changed. Watch how the model's attention for the word "it" jumps to a different earlier word: that's how it knows what "it" means.

The bars show how strongly "it" attends to each word. The model does this for every word at once, and repeats it many times over, in parallel and layer after layer. It's a large part of the heavy lifting going on inside the model.
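To illustrate "for every word at once", here is a minimal sketch that gives every word its own row of attention weights and then blends the word vectors using those weights. It assumes toy two-number vectors and reuses the same vectors as queries, keys, and values. Real models derive those three from learned projections and run the whole thing as parallel matrix operations rather than a Python loop. The `self_attention` helper is hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Return one row of weights and one blended vector per word."""
    scale = math.sqrt(len(keys[0]))
    width = len(values[0])
    weight_rows, outputs = [], []
    for query in queries:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        row = softmax(scores)
        # Blend the value vectors using this word's attention weights.
        blended = [sum(w * v[i] for w, v in zip(row, values)) for i in range(width)]
        weight_rows.append(row)
        outputs.append(blended)
    return weight_rows, outputs

# Hypothetical toy vectors, one per word; invented for illustration.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight_rows, outputs = self_attention(vectors, vectors, vectors)

for row in weight_rows:
    print([round(w, 2) for w in row])
```

Each printed row plays the role of one set of bars: how strongly that word attends to every word in the sentence, itself included.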

Why this was a big deal: earlier language models read sentences one word at a time, left to right, and had often forgotten the start by the time they reached the end. Attention lets the model look at the whole sentence at once and connect "it" to "animal" even when they're far apart. That idea is a large part of what unlocked today's LLMs.

Tokens, meaning, context: the model has now taken in your words in every way it can. Time to open it up and meet the machine that turns all of that into a guess.