Predicting the next word — How LLMs Work

Making the guess

It doesn't pick one answer. It rolls loaded dice.

After all that processing, the model produces a probability for every possible next token: a giant ranked list of how likely each one is to come next. Then it samples one from that list, so likelier tokens win more often but not every time. Here's that list, live:


"The cat sat on the ___"

These are the model's guesses for the next word, with their probabilities. Drag the slider to change the temperature — the creativity dial — and watch the odds reshape.
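To make the loaded-dice idea concrete, here is a minimal sketch of the draw itself. The words and probabilities are invented for illustration and do not come from a real model, and `pick_next_word` is a hypothetical helper name.

```python
import random

# Hypothetical probabilities for the next word after "The cat sat on the".
# These numbers are made up for illustration, not taken from a real model.
next_word_probs = {
    "mat": 0.55,
    "floor": 0.20,
    "couch": 0.15,
    "roof": 0.07,
    "moon": 0.03,
}


def pick_next_word(probs, rng):
    """Roll the loaded dice: draw one word, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [pick_next_word(next_word_probs, rng) for _ in range(1000)]

# Count how often each word came up across 1000 rolls.
for word in next_word_probs:
    print(word, draws.count(word))
```

Over many rolls the counts roughly track the probabilities: the top word wins most often, but the unlikely ones still come up now and then.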

Temperature (creativity): 0.8

Low · T = 0.2

Safe and predictable

Almost always picks the top word. Repetitive but reliable — good for facts, code, and structured answers.

High · T = 1.3

Creative and surprising

Gives unlikely words a real chance. Varied, sometimes nonsense — good for brainstorming and stories.
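One common way to implement the temperature dial is to divide the model's raw scores by the temperature before converting them to probabilities with a softmax. The sketch below assumes that formulation. The scores are invented for illustration, and `softmax_with_temperature` is a hypothetical helper name.

```python
import math

# Hypothetical raw scores (logits) for the next word; made up for illustration.
scores = {"mat": 4.0, "floor": 3.0, "couch": 2.5, "roof": 1.5, "moon": 0.5}


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {w: s / temperature for w, s in scores.items()}
    biggest = max(scaled.values())  # subtract the max for numerical stability
    exps = {w: math.exp(s - biggest) for w, s in scaled.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


# Compare the low, default, and high settings from the slider.
for t in (0.2, 0.8, 1.3):
    probs = softmax_with_temperature(scores, t)
    print(f"T = {t}:", {w: round(p, 3) for w, p in probs.items()})
```

At a low temperature nearly all the probability piles onto the top word. At a high temperature the gaps between words shrink, so the long shots get a real chance.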

Watch it write

One word at a time, in real time

This is the whole engine running. It picks a word, adds it to the text, then runs the whole prediction again on the longer text to choose the next one. The bars show its live guesses for the very next word.

Creativity: 0.9

It never drafts an outline first. Each word is chosen only as a likely next word, yet whole coherent paragraphs emerge from that single move, repeated.
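The pick-append-repeat loop can be sketched with a toy stand-in for the model. Everything here is an assumption for illustration: `TOY_MODEL` is a hand-written table with invented probabilities, and it looks only at the last word, whereas a real LLM conditions on the whole text so far.

```python
import random

# Toy stand-in for a model: made-up next-word probabilities keyed by the last word.
# A real LLM conditions on the whole text so far, not just the last word.
TOY_MODEL = {
    "the": {"cat": 0.5, "mat": 0.3, "dog": 0.2},
    "cat": {"sat": 0.7, "slept": 0.3},
    "dog": {"sat": 0.4, "slept": 0.6},
    "sat": {"on": 1.0},
    "slept": {"on": 1.0},
    "on": {"the": 1.0},
    "mat": {".": 1.0},
}


def generate(prompt, max_words, rng):
    """Pick a word, append it, predict again; stop at max_words or a dead end."""
    words = prompt.split()
    for _ in range(max_words):
        probs = TOY_MODEL.get(words[-1])
        if probs is None:  # no known continuation (e.g. after "."): stop
            break
        choices = list(probs)
        weights = [probs[w] for w in choices]
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)


print(generate("the", max_words=12, rng=random.Random(1)))
```

There is no plan anywhere in `generate`: each pass through the loop only asks what is likely to come next, and the sentence is whatever accumulates.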

That's the engine. Next we'll watch all the pieces run together — and then spend the rest of the course on the big question: how did it ever learn which words are likely?