You've seen how LLMs process text using attention. But how do they generate new text? And why does the same prompt sometimes give different answers?
Temperature Playground
Adjust how the AI places its bets on the next word
When predicting the next word, an LLM assigns a probability to every word in its vocabulary - think of these probabilities as betting odds:
After "The quarterly report shows..."
Temperature 0
Always bet on the favorite. "Growth" wins every time. Consistent but predictable.
Temperature 0.5
Mostly bet on favorites, but occasionally take a chance. Usually "growth," sometimes "revenue." Balanced variety.
Temperature 1.0
Take more chances on underdogs. "Losses" gets a real shot. More surprising - sometimes brilliant, sometimes odd.
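To make the betting-odds picture concrete, here is a minimal Python sketch of the usual mechanism: the model's raw scores (logits) are divided by the temperature before a softmax turns them into probabilities. The candidate words and their scores below are invented for illustration; a real model scores tens of thousands of tokens.

```python
import math

# Hypothetical scores (logits) for a few candidate next words after
# "The quarterly report shows..." -- invented for illustration, not
# taken from any real model.
logits = {"growth": 3.2, "revenue": 2.5, "strong": 2.1, "losses": 1.4, "a": 0.9}

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into betting odds, scaled by temperature.

    Dividing each logit by the temperature before the softmax means
    low temperatures sharpen the odds toward the favorite and high
    temperatures flatten them, giving underdogs a real shot.
    """
    if temperature == 0:
        # Temperature 0: always bet on the single favorite.
        top = max(logits, key=logits.get)
        return {w: (1.0 if w == top else 0.0) for w in logits}
    scaled = {w: score / temperature for w, score in logits.items()}
    m = max(scaled.values())  # subtract the max for numerical stability
    exps = {w: math.exp(s - m) for w, s in scaled.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

for t in (0, 0.5, 1.0):
    odds = softmax_with_temperature(logits, t)
    print(f"T={t}: " + ", ".join(f"{w}={p:.2f}" for w, p in odds.items()))
```

At temperature 0 the favorite gets all the odds; at temperature 1, "losses" keeps a meaningful share of the probability.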
Key Insight
The same prompt at different temperatures reveals that LLMs don't 'retrieve' answers - they're placing bets on what word comes next, and temperature adjusts how risky those bets are.
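The betting happens at generation time: the model samples from those odds rather than always taking the top word, which is exactly why reruns of the same prompt differ. A small sketch, reusing the same toy logits as above (outputs will vary from run to run, and that variation is the point):

```python
import math
import random

# Same toy logits as the sketch above -- invented for illustration.
logits = {"growth": 3.2, "revenue": 2.5, "strong": 2.1, "losses": 1.4, "a": 0.9}

def probs_at(temperature):
    """Temperature-scaled softmax over the toy logits."""
    scaled = [score / temperature for score in logits.values()]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

words = list(logits)

# Temperature 0 needs no sampling: the favorite always wins.
print("T=0 :", [max(logits, key=logits.get)] * 8)

# Higher temperatures: draw the next word 8 times and compare.
for t in (0.5, 1.0):
    draws = [random.choices(words, weights=probs_at(t))[0] for _ in range(8)]
    print(f"T={t}:", draws)
```

Run it twice: the temperature-0 row never changes, while the temperature-1 row rarely repeats itself.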
Try it yourself
Select a creative prompt and see how temperature affects the output; a sketch of setting temperature through a real API follows the three settings below.
Temperature 0 - Deterministic
Temperature 0.5 - Balanced
Temperature 1 - Creative
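To try this outside the playground: most LLM APIs expose temperature as a request parameter. A minimal sketch, assuming the official openai Python package and an API key in your environment; the model name and prompt are just examples, substitute your own.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = "Write a one-line slogan for a bakery that only opens at midnight."

# Same prompt, three temperatures: compare the outputs.
for t in (0, 0.5, 1.0):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # example model name
        messages=[{"role": "user", "content": prompt}],
        temperature=t,
    )
    print(f"T={t}: {response.choices[0].message.content}")
```

Run it a few times: the temperature-0 line should barely change between runs, while the temperature-1 line will keep surprising you.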
Reflection Questions
When would you use temperature 0 vs higher values in your work?
Think about different tasks in your field and what qualities matter most in the output.
Why might the same prompt give different answers?
Think about what the temperature parameter is actually doing to word selection.
How does this change your understanding of AI 'knowing' things?
Consider what it means that the same question can produce different answers.