First, your words get chopped into pieces.
A computer can't read letters the way we do. So the very first thing that happens is your text gets split into small chunks called tokens. A token is usually a whole word, but long or unusual words get broken into parts.
Type anything — watch it split into tokens
Each coloured chip is one token. Notice how short common words stay whole, but long words split apart.
Simplified illustration. Real models learn their exact split from data, but the idea is the same: text becomes a list of tokens. Rough rule of thumb for English text: expect a few more tokens than words, about 130 tokens for every 100 words, because longer words get split into two or three pieces.
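If you'd like to see the idea in code, here is a minimal sketch. It is a toy, not how a real model does it: the names `toy_tokenize` and `estimate_tokens` and the length cut-offs are made up for this illustration, and real models learn their splits from data instead of cutting words at fixed lengths.

```python
import math

def toy_tokenize(text, max_piece=4, keep_whole_up_to=6):
    """Toy splitter: short words stay whole, long words are cut into pieces.

    The cut-offs are arbitrary choices for this illustration only.
    """
    tokens = []
    for word in text.split():
        if len(word) <= keep_whole_up_to:
            tokens.append(word)  # short word: one token
        else:
            # Long word: cut into fixed-size pieces.
            # (Real models learn where to split from data.)
            for i in range(0, len(word), max_piece):
                tokens.append(word[i:i + max_piece])
    return tokens

def estimate_tokens(word_count, tokens_per_100_words=130):
    """Rule-of-thumb estimate: about 130 tokens for every 100 words."""
    return math.ceil(word_count * tokens_per_100_words / 100)

tokens = toy_tokenize("The cat found tokenization unbelievable")
print(tokens)                 # 5 words become 9 toy tokens
print(estimate_tokens(100))   # rough token count for a 100-word text
```

The short words come through untouched, while "tokenization" and "unbelievable" each end up as three pieces, which is why the token count runs ahead of the word count.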
Why this matters to you
Two practical reasons:
Cost & limits
AI tools measure usage and set prices in tokens, not words. Longer prompts and answers cost more and take longer.
The model's memory
A model can only "see" what fits in its context window, its short-term memory, measured in tokens. Go past it and the earliest text is usually the first to be forgotten.
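Both reasons can be sketched in a few lines of code. Treat this as an illustration under assumptions: the function names are invented here, the price is a made-up number (real prices vary by tool), and dropping the earliest tokens is just one simple way a tool might handle text that overflows the window.

```python
def estimate_cost(token_count, price_per_million_tokens):
    """Cost of a request if pricing is a flat rate per million tokens.

    The rate is supplied by the caller; no real price is assumed here.
    """
    return token_count * price_per_million_tokens / 1_000_000

def fit_to_window(tokens, window_size):
    """Keep only the most recent tokens that fit in the window.

    Anything earlier falls outside the window and is dropped.
    """
    if window_size <= 0:
        return []
    return tokens[-window_size:]

conversation = ["hello", "there", "how", "are", "you", "today"]

# With room for only 4 tokens, the 2 earliest ones no longer fit.
visible = fit_to_window(conversation, window_size=4)
print(visible)

# 130 tokens at a hypothetical rate of 2.00 per million tokens.
print(estimate_cost(130, price_per_million_tokens=2.0))
```

Real context windows hold many thousands of tokens, not four, but the behaviour is the same in spirit: once the window is full, something has to go.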
Next: once your text is a list of tokens, how does a machine get any sense of what they mean?