You've seen how LLMs process text using attention. But how do they generate new text? And why does the same prompt sometimes give different answers?


Temperature Playground

Adjust how the AI places its bets on the next word

When predicting the next word, an LLM calculates betting odds for every candidate in its vocabulary:

After "The quarterly report shows..."

"growth"35% odds"revenue"28% odds"profits"20% odds"losses"12% odds"puppies"0.001% odds

Temperature 0

Always bet on the favorite. "Growth" wins every time. Consistent but predictable.

Temperature 0.5

Mostly bet on favorites, but occasionally take a chance. Usually "growth," sometimes "revenue." Balanced variety.

Temperature 1.0

Take more chances on underdogs. "Losses" gets a real shot. More surprising: sometimes brilliant, sometimes odd.
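Under the hood, temperature is a single division applied to the model's raw scores (called logits) before they are converted into odds. Here is a minimal sketch; the logit values are made up, chosen so that temperature 1.0 roughly reproduces the odds shown earlier:

```python
import math

def betting_odds(logits, temperature):
    """Turn raw model scores (logits) into betting odds (a softmax).
    Dividing by a temperature below 1 sharpens the favorite's edge;
    a temperature above 1 flattens the field toward a coin flip."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for growth / revenue / profits / losses
logits = [2.00, 1.78, 1.44, 0.93]

for t in (1.0, 0.5):
    print(t, [round(p, 2) for p in betting_odds(logits, t)])
# 1.0 [0.37, 0.3, 0.21, 0.13]   <- roughly the model's own odds
# 0.5 [0.48, 0.31, 0.16, 0.06]  <- the favorite gets even more favored

# Temperature 0 is handled as a special case (a plain argmax), since
# dividing by zero is undefined: always pick the highest-scoring word.
```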

Key Insight

The same prompt at different temperatures reveals that LLMs don't 'retrieve' answers. They place bets on what word comes next, and temperature adjusts how risky those bets are.

Try it yourself

Select a creative prompt and see how temperature affects the output at each of the three settings below. (A code sketch for running the same experiment through an API follows the list.)

Temperature 0: Deterministic
Temperature 0.5: Balanced
Temperature 1.0: Creative
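To run the same experiment outside the playground, any chat API with a temperature parameter will do. Here is a sketch assuming the OpenAI Python SDK; the prompt and model name are placeholders you should swap for your own:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A hypothetical creative prompt; substitute your own
prompt = "Write a one-line slogan for a coffee shop on the moon."

for temperature in (0.0, 0.5, 1.0):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    print(f"T={temperature}: {response.choices[0].message.content}")
```

Run it a few times: the temperature 0 line barely moves, while the temperature 1.0 line changes on nearly every run. (Even at temperature 0, outputs are not strictly guaranteed to be identical across runs, but they come close.)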

Reflection Questions

When would you use temperature 0 vs higher values in your work?

Think about different tasks in your field and what qualities matter most in the output.

Why might the same prompt give different answers?

Think about what the temperature parameter is actually doing to word selection.

How does this change your understanding of AI 'knowing' things?

Consider what it means that the same question can produce different answers.