Tokens — How LLMs Work

The building blocks

First, your words get chopped into pieces.

A computer can't read letters the way we do. So the very first thing that happens is your text gets split into small chunks called tokens. A token is usually a whole word, but long or unusual words get broken into parts.

Think of LEGO. The model doesn't see a finished sentence; it sees a box of standard bricks (tokens) it has met millions of times before. Every sentence it reads is built from the same fixed set of bricks, known as its vocabulary.
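
To make the brick idea concrete, here is a minimal sketch in Python. It is not how any particular model does it: the tiny vocabulary is made up for illustration, and the splitting rule (always grab the longest brick that fits) is just one simple strategy. Real models learn tens of thousands of bricks from data.

```python
import re

# Hypothetical toy vocabulary: a handful of "bricks" invented for this sketch.
VOCAB = {"the", "cat", "sat", "on", "mat", "un", "believ", "able", "token", "s"}


def tokenize(text: str) -> list[str]:
    """Split text into tokens by greedy longest-match against VOCAB."""
    tokens = []
    # Pull out runs of letters; spaces and punctuation are skipped
    # to keep the sketch short.
    for word in re.findall(r"[a-z]+", text.lower()):
        start = 0
        while start < len(word):
            # Try the longest possible piece first, then shrink it.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                if piece in VOCAB:
                    break
            else:
                # No known brick fits: fall back to a single letter.
                end = start + 1
                piece = word[start:end]
            tokens.append(piece)
            start = end
    return tokens


print(tokenize("The cat sat on the mat"))
print(tokenize("unbelievable tokens"))
```

With this vocabulary, the short common words come back whole, while "unbelievable" comes back as three bricks: "un", "believ", "able".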

Try it yourself

Type anything — watch it split into tokens

Each coloured chip is one token. Notice how short common words stay whole, but long words split apart.


Simplified illustration. Real models learn their exact splits from data, but the idea is the same: text becomes a list of tokens. As a rough rule of thumb for English, expect a few more tokens than words, about 130 tokens for every 100 words, because longer words get split into two or three pieces.
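
The rule of thumb is easy to turn into a quick estimate. The sketch below only counts words and scales them by 130 per 100; the function name is made up for illustration, and the result is a ballpark figure, not what a real tokenizer would report.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 130 tokens for every 100 words."""
    word_count = len(text.split())  # words are whatever whitespace separates
    return round(word_count * 130 / 100)


# A 100-word text comes out at roughly 130 tokens.
print(estimate_tokens("word " * 100))
```

For an exact count you would need the tokenizer that belongs to the specific model you are using.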

Why this matters to you

Two practical reasons:

Cost & limits

AI tools are measured and priced by tokens, not words. Longer prompts and answers cost more and take longer.

Its memory

A model can only "see" what fits in its context window, its short-term memory, measured in tokens. Go past it and something has to be left out; in a long chat, that is typically the earliest text.
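
Both points come down to simple arithmetic on token counts. Here is a minimal sketch: the price and the window size are made-up numbers chosen to keep the sums easy, and the function names are hypothetical. It also treats prompt and answer tokens as costing the same, which is a simplification.

```python
# Made-up numbers for illustration only.
PRICE_PER_1K_TOKENS = 0.002  # hypothetical price in dollars per 1,000 tokens
CONTEXT_WINDOW = 8           # hypothetical limit; real windows are far larger


def estimate_cost(prompt_tokens: int, answer_tokens: int) -> float:
    """Cost grows with the total number of tokens sent and received."""
    return (prompt_tokens + answer_tokens) / 1000 * PRICE_PER_1K_TOKENS


def fit_to_window(tokens: list[str], limit: int = CONTEXT_WINDOW) -> list[str]:
    """Keep the most recent tokens that fit; the earliest ones are dropped."""
    return tokens[max(0, len(tokens) - limit):]


# A 1,500-token prompt with a 500-token answer is 2,000 tokens in total.
print(estimate_cost(1500, 500))

# Ten tokens into a window of eight: the first two no longer fit.
conversation = ["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9", "t10"]
print(fit_to_window(conversation))
```

Dropping the oldest tokens first is only one way a tool might handle an overflow; others may refuse the request or summarise the earlier text instead.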

Next: once your text is a list of tokens, how does a machine get any sense of what they mean?