Lesson 3 · The vector idea

What are vectors and embeddings?

An embedding turns a piece of text into a list of numbers — a point in space — arranged so that things with similar meaning end up close together. That list of numbers is called a vector. Embeddings are how a computer can compare meaning instead of just matching exact words.


Turning meaning into coordinates

Computers are great with numbers and clumsy with meaning. An embedding model fixes that: you feed it a word or sentence, and it returns a list of numbers — say a few hundred of them. You can picture that list as coordinates for a point in space. The trick is that the model places points so that similar meanings land near each other: "dog" and "puppy" sit close, "dog" and "tax return" sit far apart.
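To make the "coordinates" picture concrete, here is a minimal sketch. The three-number vectors are invented by hand for illustration; they do not come from any real model, which would return hundreds of values per text. Straight-line distance stands in as a simple notion of "close".

```python
import math

# Hypothetical, hand-picked 3-number "embeddings" for illustration only.
# A real embedding model would return hundreds of values per text.
toy_embeddings = {
    "dog": [0.90, 0.80, 0.10],
    "puppy": [0.85, 0.75, 0.20],
    "tax return": [0.10, 0.05, 0.95],
}


def distance(a, b):
    """Straight-line (Euclidean) distance between two points."""
    return math.dist(a, b)


dog_to_puppy = distance(toy_embeddings["dog"], toy_embeddings["puppy"])
dog_to_tax = distance(toy_embeddings["dog"], toy_embeddings["tax return"])

# The smaller number marks the pair with the more similar meaning.
print(f"dog -> puppy:      {dog_to_puppy:.3f}")
print(f"dog -> tax return: {dog_to_tax:.3f}")
```

With these made-up values, "dog" lands much closer to "puppy" than to "tax return", which is the behaviour a real model is trained to produce.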

Like a hash that keeps meaning

Developers know hashing: turn data into a fixed-size value. But a well-designed hash scatters similar inputs to completely different outputs, and that is deliberate. An embedding behaves like the opposite kind of hash: similar inputs get similar outputs. Two sentences that mean nearly the same thing produce two nearly identical vectors. That "similar in, similar out" property is what makes embeddings useful.
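The hash half of that comparison is easy to try with Python's standard `hashlib`. The sketch below hashes two sentences that differ by one character and counts how many of the 256 output bits differ. The example sentences and helper names are chosen here for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of the text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions where two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Two inputs that differ only in the final character.
sentence_a = "The dog chased the ball."
sentence_b = "The dog chased the ball!"

hash_a = sha256_hex(sentence_a)
hash_b = sha256_hex(sentence_b)

print(hash_a)
print(hash_b)
print(differing_bits(hash_a, hash_b), "of 256 bits differ")
```

The two digests share no visible pattern even though the inputs are almost identical. An embedding model given the same two sentences would be expected to return two vectors that are almost the same.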

The essentials

  • A vector is just an ordered list of numbers. An embedding is a vector that represents meaning.
  • The number of values in the list is its number of dimensions, often a few hundred to a couple of thousand.
  • The same embedding model must be used for everything you compare. Vectors from different models live in different spaces, so mixing them is like comparing a measurement in inches with one in centimeters.
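One way to enforce the "same model" rule in code is to store the model name alongside each vector and refuse to compare mismatched pairs. This is only a sketch: the `Embedding` class, the `distance` helper, and the model names `model-a` and `model-b` are hypothetical and not part of any particular library.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Embedding:
    """A vector plus the (hypothetical) name of the model that produced it."""
    model: str
    values: Tuple[float, ...]

    @property
    def dimensions(self) -> int:
        # The number of dimensions is just the length of the list.
        return len(self.values)


def distance(a: Embedding, b: Embedding) -> float:
    """Euclidean distance, allowed only between vectors from the same model."""
    if a.model != b.model:
        raise ValueError(f"cannot compare {a.model!r} with {b.model!r}")
    if a.dimensions != b.dimensions:
        raise ValueError("vectors have different numbers of dimensions")
    return math.dist(a.values, b.values)


# Made-up values for illustration only.
dog = Embedding("model-a", (0.90, 0.80, 0.10))
puppy = Embedding("model-a", (0.85, 0.75, 0.20))
other = Embedding("model-b", (0.90, 0.80, 0.10))

print(distance(dog, puppy))  # same model: comparison is meaningful

try:
    distance(dog, other)     # different models: rejected
except ValueError as err:
    print("refused:", err)
```

Note that `dog` and `other` hold identical numbers, yet the comparison is still refused: matching values from different models do not imply matching meaning.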

You can't see 500 dimensions — and that's fine

We draw embeddings on a flat 2D plane so you can build intuition. Real embeddings live in hundreds of dimensions, but the idea is identical: closeness means similarity. Speaking of closeness — how exactly do we measure how close two points are? That's next.

Interactive: Drag a word. A line snaps to its nearest neighbour, the word with the most similar meaning.
Next: similarity and distance →