You saw how text becomes tokens. But a token is still just an ID number. Now we'll see how each token gets a rich vector representation that captures meaning.
Review: Tokenizer Playground
Embedding Space Visualizer
Each point represents a word's "meaning coordinates" — similar meanings cluster together. Try the word math below!
From Tokens to Meaning Coordinates
Remember tokenization? Text gets split into tokens (word pieces). Now here's the next step: each token gets converted into an embedding — a list of 768 numbers that represent its meaning.
Coordinates locate things
A spot in a room needs 3 numbers: (x, y, z). GPS uses 2: (latitude, longitude). More dimensions = more detail you can capture about where something sits.
Why 768 numbers?
Model designers chose 768 dimensions — enough to capture nuances like "royal", "emotional", "formal", etc. More dimensions = richer meaning representation.
Similar meanings = nearby
"King" and "queen" have similar coordinates because they share meaning (royalty, power). The plot below squashes 768D → 2D so you can see it.
🔑 Key distinction: These 768 numbers are not the model's parameters. The model has billions of parameters (neural network weights) that learned how to generate these embeddings. The embedding is the output — a snapshot of meaning.
Common Questions
Is it for words or sentences?
Whatever you give it. Feed it "king" → one vector. Feed it "The king sat on the throne" → one vector for the whole sentence. The model processes all tokens internally, then combines them (typically by averaging or pooling the token vectors) into one result.
Do embeddings change with context?
For embedding models (like the ones used here): No. "Bank" always returns the same 768 numbers.
Inside an LLM during generation: Yes! "Bank" gets different internal representations in "river bank" vs "bank account". That's where attention helps — but that's hidden inside the model.
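If you want to see this for yourself, here is a hedged sketch using a small BERT encoder from the Hugging Face transformers library (not the embedding model this page uses): the same token "bank" comes out with different internal vectors depending on the sentence around it.

```python
# Sketch: contextual representations inside a transformer.
# bert-base-uncased is used purely as an example model with 768-dim hidden states.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]        # (seq_len, 768)
    tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index("bank")]                      # vector at the "bank" position

v_river = bank_vector("i sat on the river bank .")
v_money = bank_vector("i deposited money at the bank .")
similarity = torch.nn.functional.cosine_similarity(v_river, v_money, dim=0)
print(similarity.item())  # well below 1.0: same token, different context, different vector
```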
Where do embeddings come from?
A dedicated embedding model, separate from chat models like GPT-4. When you use nomic-embed-text in Ollama, that's a specialized model trained to place similar concepts near each other.
"king" → tokenizer → embedding model → [768 numbers]
The Magic of Word Math
Because words have coordinates, we can do arithmetic with them!
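Here is what that arithmetic looks like in a minimal numpy sketch. The vectors below are random placeholders; with real embeddings, king − man + woman lands nearest to queen.

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# emb would normally come from an embedding model; random vectors here
# are placeholders just to show the mechanics of the word math.
rng = np.random.default_rng(0)
vocab = ["king", "queen", "man", "woman", "banana"]
emb = {w: rng.normal(size=768) for w in vocab}

result = emb["king"] - emb["man"] + emb["woman"]

# Find the closest word, excluding the three inputs (the standard analogy trick).
candidates = [w for w in vocab if w not in ("king", "man", "woman")]
best = max(candidates, key=lambda w: cosine(emb[w], result))
print(best)  # with real embeddings this is typically "queen"
```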
Try Your Own
Pick any three words and see what the math produces!
Try: tokyo − japan + france = ?
Explore pre-loaded examples:
Try: your name, a city, an emotion...
To add custom words, configure OPENAI_API_KEY or OLLAMA_BASE_URL in your environment.
Similar Words
Click a word to see similar words
How to use:
- Click any word to see its nearest neighbors (highlighted in the plot)
- Drag to pan around the space
- Scroll to zoom in/out
- Click cluster buttons to highlight related words (Royalty, Geography, Emotions)
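Under the hood, "similar words" is just a nearest-neighbor search: rank every other word by cosine similarity to the one you clicked and keep the top few. A small sketch, again with placeholder vectors:

```python
import numpy as np

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(1)
vocab = ["king", "queen", "prince", "paris", "tokyo", "happy", "sad"]
emb = {w: rng.normal(size=768) for w in vocab}  # stand-ins for real embeddings

def nearest(word, k=3):
    others = [w for w in vocab if w != word]
    return sorted(others, key=lambda w: cosine(emb[word], emb[w]), reverse=True)[:k]

print(nearest("king"))  # with real embeddings: words like "queen" and "prince"
```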