Vectors, Embeddings & Similarity

Vectors, Embeddings & Similarity — explained simply for developers.

Basics / concept

What's the difference between a scalar and a vector?

A scalar is just a single number, like 7 or -2.5. A vector is an array of numbers, like [0.2, -1.4, 3.0] — exactly the JavaScript array you write every day. That's it. When you hear "vector" in AI, picture a plain list of floats. The whole field is just doing arithmetic on these arrays, millions at a time. A scalar holds one value; a vector holds an ordered list of values you can loop over.
#vector #scalar #basics #arrays
Basics / concept

What is an embedding, really?

An embedding IS a vector — an array of numbers that captures the meaning of something. You feed a word, sentence, or chunk of text into a model, and it hands back an array like [0.12, -0.87, 0.33, ...]. Similar meanings produce similar arrays. So "dog" and "puppy" land near each other; "dog" and "taxes" land far apart. Think of it as turning fuzzy human meaning into a list of numbers a computer can actually compare and search.
#embedding #vector #meaning #basics
Basics / concept

When people say an embedding has 1536 "dimensions," what does that mean?

Dimensions just means the length of the array. A 1536-dimensional embedding is an array with 1536 numbers in it: [0.01, -0.4, ... ] with 1536 slots. In code terms, embedding.length === 1536. Each slot is one number the model uses to encode some aspect of meaning. More dimensions = more room to capture nuance, but also more numbers to store and compare. Common sizes you'll see are 384, 768, 1536, 3072. Don't overthink it — it's array length.
#dimensions #embedding #vector #basics
Basics / concept

Why is a vector sometimes called a "direction" or an arrow?

Take a 2-number vector like [3, 4]. You can draw it as an arrow from the origin (0,0) to the point (3,4). The arrow has a direction (which way it points) and a length (how far it reaches). Embeddings live in hundreds of dimensions, so you can't actually draw them, but the same intuition holds: each vector is an arrow pointing somewhere in "meaning space." Two arrows pointing the same way mean similar things — that's the key idea behind similarity.
#vector #direction #intuition #geometry
Basics / gotcha

Is the dot product an array or a single number?

A single number — a scalar. You combine two arrays and collapse them down to one value. Beginners sometimes expect another array out, but no: dot([1,2,3],[4,5,6]) returns 32, not a list. This trips people up because matrix multiplication (which is many dot products) DOES produce arrays. But one dot product alone = one number. Remember the shape: two equal-length arrays in, one number out. That number is your similarity-ish score.
#dot-product #gotcha #scalar #shape
Basics / concept

Is a vector the same as a JavaScript array? What about a matrix?

Pretty much, yes. A vector is a 1D array of numbers: [0.2, -1.4, 3.0]. A matrix is a 2D array — an array of arrays, like a grid: [[1,2,3],[4,5,6]], which has 2 rows and 3 columns. You'd loop a vector once and a matrix with nested loops, same as any grid you've built. In AI, a batch of embeddings is naturally a matrix: each row is one vector. So the data structures are exactly the ones you already use; the math just gives them meaning.
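
To make the "batch of embeddings is a matrix" idea concrete, here is a small sketch. The numbers are made-up toy values (not real model output), and the helper names shape and sumAll are just illustrative.

```javascript
// A batch of embeddings as a matrix: each row is one vector.
// Toy values for illustration only, not real model output.
const batch = [
  [0.2, -1.4, 3.0],
  [1.0, 0.5, -0.3],
];

// Shape = [rows, columns] = [number of vectors, dimensions per vector].
// Assumes a non-empty matrix whose rows all have the same length.
function shape(matrix) {
  return [matrix.length, matrix[0].length];
}

// One loop walks a vector; nested loops walk a matrix.
function sumAll(matrix) {
  let total = 0;
  for (const row of matrix) {
    for (const value of row) {
      total += value;
    }
  }
  return total;
}

console.log(shape(batch)); // [2, 3]
```

Reading the shape as "2 vectors, 3 dimensions each" is the habit worth building: the same reading applies to a real batch such as 100 embeddings of 1536 numbers.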
#vector #matrix #array #basics
Basics / concept

Do I have to write the embedding numbers myself?

No — you never hand-pick them. You call an embedding model (an API or a local model) with your text, and it returns the array: const vec = await embed("refund policy") gives back something like [0.03, -0.51, ...] with hundreds of slots. The model learned those numbers during training. Your job is just to store them and compare them with dot product or cosine. Think of the embedding model as a function text -> array of floats that you treat as a black box.
#embedding #model #api #basics
Core idea / how-to

What is the magnitude (length) of a vector, in plain terms?

The magnitude is how long the arrow is — the distance from the origin to the vector's point. You compute it like the Pythagorean theorem you learned for triangles: square each number, add them, take the square root. For [3, 4]: sqrt(3*3 + 4*4) = sqrt(9 + 16) = sqrt(25) = 5. So that vector has length 5. In code: Math.sqrt(v.reduce((s,x) => s + x*x, 0)). It's just "how far does this arrow reach."
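
The same calculation written out as a loop, in case the reduce one-liner is hard to read. This is a minimal sketch; the function name magnitude is an arbitrary choice.

```javascript
// Magnitude (length) of a vector: square each number, add, take the square root.
function magnitude(v) {
  let sumOfSquares = 0;
  for (const x of v) {
    sumOfSquares += x * x;
  }
  return Math.sqrt(sumOfSquares);
}

console.log(magnitude([3, 4]));    // 5
console.log(magnitude([1, 2, 2])); // 3
```

The built-in Math.hypot(...v) computes the same quantity, though spreading a very large array into function arguments can hit engine limits, so the explicit loop is the safer default for long embeddings.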
#magnitude #length #vector #pythagorean
Core idea / how-to

How do you compute a dot product?

Multiply matching positions, then add it all up: a zip-then-sum. For a = [1, 2, 3] and b = [4, 5, 6]: 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32. In code it's a one-liner: a.reduce((sum, ai, i) => sum + ai*b[i], 0). Both arrays must be the same length: if b is shorter, b[i] is undefined for the missing slots and the result is NaN; if b is longer, its extra numbers are silently ignored. The result is a single number (a scalar), not another array. This one little operation is the workhorse behind almost everything in an LLM.
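
Splitting the one-liner into its two steps can make the zip-then-sum shape easier to see. A small sketch using the same numbers as above; the variable names are illustrative.

```javascript
const a = [1, 2, 3];
const b = [4, 5, 6];

// Step 1 ("zip"): multiply matching positions.
const products = a.map((ai, i) => ai * b[i]);

// Step 2 ("sum"): add the products into a single number.
const dotProduct = products.reduce((sum, p) => sum + p, 0);

console.log(products);   // [4, 10, 18]
console.log(dotProduct); // 32
```

The intermediate products array is only there for readability; the single reduce avoids allocating it.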
#dot-product #how-to #vector #scalar
Core idea / code

Show me the dot product as code I'd actually write.

It's a single loop: function dot(a, b) { let sum = 0; for (let i = 0; i < a.length; i++) { sum += a[i] * b[i]; } return sum; } Or the functional version: a.reduce((s, x, i) => s + x*b[i], 0). Pass dot([1,2,3],[4,5,6]) and you get 32. Notice you read both arrays at the same index i — that's why they must be equal length. This exact pattern, repeated across many rows, is what a neural-network layer and attention both run.
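
To illustrate "repeated across many rows", here is a sketch of a matrix-vector product built from the same dot function. The weights are toy values, and matVec is a hypothetical helper name; a real layer would also add a bias and an activation, and would run on an optimized numeric library rather than plain arrays.

```javascript
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Matrix-vector product: one dot product per row, so one output number per row.
function matVec(matrix, v) {
  return matrix.map((row) => dot(row, v));
}

// Toy weights: 2 rows, 3 columns.
const weights = [
  [1, 2, 3],
  [4, 5, 6],
];

console.log(matVec(weights, [1, 0, -1])); // [-2, -2]
```

Each output slot is still a single number; the array only appears because there are several rows.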
#dot-product #code #loop #javascript
Core idea / concept

What does the dot product actually MEAN geometrically?

It tells you how much two arrows point the same way. Big positive dot product = they point in a similar direction. Around zero = they're perpendicular (unrelated). Negative = they point in roughly opposite directions. Quick feel: [1,0] dot [1,0] = 1 (same way), [1,0] dot [0,1] = 0 (perpendicular), [1,0] dot [-1,0] = -1 (opposite). So a high dot product between two embeddings hints they mean similar things — though length sneaks in, which is why cosine often wins.
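
The three cases can be checked directly. A minimal sketch: describe is a hypothetical helper, and the exact comparison with 0 only works for tidy toy vectors (real embeddings rarely land on exactly 0).

```javascript
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Classify a dot product by its sign.
function describe(score) {
  if (score > 0) return "similar direction";
  if (score < 0) return "opposite direction";
  return "perpendicular";
}

const east = [1, 0];
console.log(dot(east, [1, 0]), describe(dot(east, [1, 0])));   // 1 "similar direction"
console.log(dot(east, [0, 1]), describe(dot(east, [0, 1])));   // 0 "perpendicular"
console.log(dot(east, [-1, 0]), describe(dot(east, [-1, 0]))); // -1 "opposite direction"

// Same direction as east, but five times longer: the score grows with length.
console.log(dot(east, [5, 0])); // 5
```

The last line is the "length sneaks in" effect: the direction has not changed, only the magnitude.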
#dot-product #meaning #direction #intuition
Core idea / how-to

What is cosine similarity and how do you compute it?

Cosine similarity is the dot product divided by both vectors' lengths. The formula: cos = dot(a,b) / (length(a) * length(b)). Dividing out the lengths cancels magnitude, so you're left with pure direction — a score from -1 to 1. For a=[1,0], b=[1,1]: dot = 1, length(a) = 1, length(b) = sqrt(2) ≈ 1.414, so cos ≈ 1/1.414 ≈ 0.707. That 0.707 means "fairly similar direction" (a 45-degree angle). 1 = identical direction, 0 = perpendicular, -1 = opposite.
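
The formula translates almost word for word into code. A minimal sketch with illustrative function names; it assumes both vectors have the same length and neither is all zeros.

```javascript
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function magnitude(v) {
  return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

// cos = dot(a, b) / (length(a) * length(b))
function cosine(a, b) {
  return dot(a, b) / (magnitude(a) * magnitude(b));
}

console.log(cosine([1, 0], [1, 1]).toFixed(3)); // "0.707"
console.log(cosine([1, 0], [1, 0]));            // 1
console.log(cosine([1, 0], [-1, 0]));           // -1
```

A zero vector has magnitude 0, so this version would divide 0 by 0 and return NaN; production code usually guards that case explicitly.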
#cosine #similarity #how-to #angle
Core idea / concept

Why does cosine similarity range from -1 to 1, and what do the ends mean?

Because you divide out the lengths, cosine only measures the ANGLE between two arrows. Same direction = angle 0 = score 1 (max similarity). Perpendicular = 90 degrees = score 0 (unrelated). Opposite = 180 degrees = score -1 (max dissimilarity). For text embeddings you usually see scores between roughly 0 and 1, since most meanings aren't true opposites. A higher number means "more alike in meaning." It's a clean, size-independent similarity score — that's why search systems love it.
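
If it helps to see the angle itself, the score can be converted back to degrees with Math.acos. This is a sketch for building intuition (angleDegrees is a hypothetical helper); search code normally works with the score directly.

```javascript
function cosine(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const ma = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const mb = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (ma * mb);
}

// Convert a cosine score to the angle between the two vectors, in degrees.
function angleDegrees(a, b) {
  // Clamp to [-1, 1] so tiny floating-point overshoot cannot make acos return NaN.
  const c = Math.min(1, Math.max(-1, cosine(a, b)));
  return (Math.acos(c) * 180) / Math.PI;
}

console.log(Math.round(angleDegrees([1, 0], [1, 0])));  // 0
console.log(Math.round(angleDegrees([1, 0], [1, 1])));  // 45
console.log(Math.round(angleDegrees([1, 0], [0, 1])));  // 90
console.log(Math.round(angleDegrees([1, 0], [-1, 0]))); // 180
```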
#cosine #similarity #range #angle
Core idea / concept

If two embeddings have cosine similarity of 0, what does that tell me?

It means the two vectors are perpendicular — they share no common direction, so the model sees their meanings as essentially unrelated. Picture arrows at a 90-degree angle: neither reinforces nor opposes the other. In search terms, a chunk scoring 0 against your query is off-topic and you'd skip it. Scores near 1 are strong matches, scores near 0 are unrelated, and negative scores (rare for text) suggest opposing meaning. So 0 is your "meh, not relevant" signal.
#cosine #similarity #interpretation #angle
Hands-on / concept

Why is cosine similarity THE operation behind semantic search and RAG?

In RAG you embed every document chunk once and store the vectors. At query time you embed the user's question into a vector, then compute cosine similarity between that query vector and every chunk vector — chunks.map(c => cosine(queryVec, c.vec)) — and return the top few highest scores. Those are the chunks closest in MEANING, not just keyword match. You feed them to the LLM as context. So "find relevant docs" becomes "find the vectors pointing the same direction as my question." That one comparison powers the whole retrieval step.
#rag #semantic-search #cosine #retrieval
Hands-on / concept

Why does "closer vectors = more similar meaning" actually hold?

Because the embedding model was trained to put related text near each other in the array space. During training it nudged vectors so that things appearing in similar contexts ended up pointing similar directions. So "refund" and "money back" land close; "refund" and "giraffe" land far. The geometry isn't magic — it's the learned result of optimization. That's why a similarity score on the raw numbers reflects real semantic closeness, and why you can search by meaning instead of exact words.
#embedding #similarity #training #meaning
Hands-on / decision

What's the difference between dot product and cosine similarity, and when do I use each?

Dot product mixes BOTH direction and length, so a longer vector scores higher even if its direction is the same. Cosine divides out the lengths, so it only measures direction — fairer when you only care about meaning, not magnitude. Use cosine for general semantic search where vectors vary in length. Use raw dot product when your vectors are already normalized (all length 1), because then dot product and cosine give the identical answer — and dot is cheaper to compute. Many vector DBs normalize for exactly this reason.
#dot-product #cosine #decision #normalization
Hands-on / how-to

What does it mean to normalize a vector, and how do I do it?

Normalizing means rescaling a vector so its length becomes exactly 1, while keeping its direction. You divide every number by the vector's magnitude. For [3, 4] (length 5): [3/5, 4/5] = [0.6, 0.8], which now has length sqrt(0.36 + 0.64) = sqrt(1) = 1. In code: const m = Math.sqrt(v.reduce((s,x)=>s+x*x,0)); const unit = v.map(x => x/m);. After normalizing, the only thing left is direction — so the dot product of two normalized vectors equals their cosine similarity.
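
Wrapped up as a reusable function, the same steps might look like this. The zero-vector check is an added assumption, not something the text above covers: a vector of all zeros has no direction, so dividing by its magnitude would produce NaN.

```javascript
function magnitude(v) {
  return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

// Rescale v to length 1 while keeping its direction.
function normalize(v) {
  const m = magnitude(v);
  if (m === 0) {
    // Assumption: fail loudly rather than return an array of NaN.
    throw new Error("cannot normalize a zero vector");
  }
  return v.map((x) => x / m);
}

const unit = normalize([3, 4]);
console.log(unit); // [0.6, 0.8]

// Floating-point rounding means the length is 1 only to within a tiny error.
console.log(Math.abs(magnitude(unit) - 1) < 1e-12); // true
```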
#normalization #unit-vector #how-to #cosine
Hands-on / code

How would I find the most similar chunk to a query, in code?

Score every chunk and pick the best: function cosine(a,b){ const dot = a.reduce((s,x,i)=>s+x*b[i],0); const ma = Math.sqrt(a.reduce((s,x)=>s+x*x,0)); const mb = Math.sqrt(b.reduce((s,x)=>s+x*x,0)); return dot/(ma*mb); } const ranked = chunks .map(c => ({c, score: cosine(queryVec, c.vec)})) .sort((a,b) => b.score - a.score); const top3 = ranked.slice(0,3); That's the heart of a vector search: map to scores, sort descending, take the top. Real vector DBs just do this faster.
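
Here is the same idea as a runnable sketch with toy data. The 3-number vectors and chunk texts are invented for illustration (real embeddings come from a model and have hundreds of slots), and topK is a hypothetical helper name.

```javascript
function cosine(a, b) {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const ma = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const mb = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return dot / (ma * mb);
}

// Score every chunk against the query, sort descending, keep the best k.
function topK(queryVec, chunks, k) {
  return chunks
    .map((chunk) => ({ chunk, score: cosine(queryVec, chunk.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy "embeddings": made-up numbers, not model output.
const chunks = [
  { text: "How refunds work", vec: [0.9, 0.1, 0.0] },
  { text: "Giraffe facts", vec: [0.0, 0.2, 0.9] },
  { text: "Money-back guarantee", vec: [0.8, 0.3, 0.1] },
];
const queryVec = [1.0, 0.2, 0.0];

const results = topK(queryVec, chunks, 2);
console.log(results.map((r) => r.chunk.text));
// ["How refunds work", "Money-back guarantee"]
```

This brute-force scan touches every chunk on every query, which is fine for small collections; larger ones typically rely on an index inside a vector database.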
#rag #code #cosine #ranking
Gotchas / gotcha

Why can't I take the dot product of two different-length arrays?

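Because the dot product pairs up matching positions: a[0]*b[0], a[1]*b[1], and so on. If a has 3 numbers and b has 4, b[3] has no partner, so the operation is undefined. In code, a loop over a.length would silently ignore b[3] and return a plausible-looking but meaningless number; swap the arguments and the loop reads undefined and produces NaN. The rule "shapes must line up" shows up everywhere in this math. For embeddings it means a 768-dim vector and a 1536-dim vector simply can't be compared. Matching lengths are necessary but not sufficient, though: vectors from two different models live in different spaces, so only compare vectors produced by the SAME model.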
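
JavaScript will not stop you from mixing lengths, so it can be worth checking explicitly. A sketch contrasting an unchecked one-liner with a guarded version; the names uncheckedDot and safeDot are illustrative, and throwing a RangeError is one reasonable choice among several.

```javascript
// Unchecked: fails in two different ways depending on argument order.
function uncheckedDot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

console.log(uncheckedDot([1, 2, 3], [4, 5])); // NaN (3 * undefined)
console.log(uncheckedDot([1, 2], [4, 5, 6])); // 14  (the 6 is silently ignored)

// Guarded: refuse to compare vectors whose lengths differ.
function safeDot(a, b) {
  if (a.length !== b.length) {
    throw new RangeError(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

console.log(safeDot([1, 2, 3], [4, 5, 6])); // 32
```

The silent case is the more dangerous one, because the result looks like a valid score.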
#dot-product #shape #gotcha #dimensions
Gotchas / gotcha

Why is raw dot product "biased" by length, and why does that matter?

Because the dot product grows when either vector is longer, two vectors pointing the same direction can score very differently if one is bigger. Imagine [1,0] dot [1,0] = 1, but [1,0] dot [10,0] = 10: same direction, ten times the score, purely from length. If vector magnitudes vary (magnitude of the vector, not how much text the chunk holds), raw dot product can rank a large-magnitude but less relevant chunk above a small-magnitude perfect match. Cosine fixes this by dividing out length, so you compare direction only. That bias is the main reason to reach for cosine.
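
A tiny ranking experiment shows the flip. The two candidate vectors are invented for illustration, and rankBy is a hypothetical helper: one candidate points exactly the same way as the query but is short, the other points elsewhere but is long.

```javascript
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function magnitude(v) {
  return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

function cosine(a, b) {
  return dot(a, b) / (magnitude(a) * magnitude(b));
}

const query = [1, 0];
const candidates = [
  { name: "same direction, small magnitude", vec: [1, 0] },
  { name: "off direction, large magnitude", vec: [3, 4] },
];

// Rank candidate names from best to worst under a given scoring function.
function rankBy(scoreFn) {
  return candidates
    .map((c) => ({ name: c.name, score: scoreFn(query, c.vec) }))
    .sort((x, y) => y.score - x.score)
    .map((r) => r.name);
}

console.log(rankBy(dot)[0]);    // "off direction, large magnitude" (score 3 beats 1)
console.log(rankBy(cosine)[0]); // "same direction, small magnitude" (score 1 beats 0.6)
```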
#dot-product #cosine #bias #gotcha
Gotchas / decision

Once I've normalized all my vectors, can I skip cosine and just use dot product?

Yes — and that's a common speed trick. If every vector already has length 1, then length(a) * length(b) is 1 * 1 = 1, so cosine = dot product divided by 1 = just the dot product. You get the identical similarity score with fewer operations, because you skip computing the two square roots at query time. Many vector databases normalize on insert for exactly this reason. So: normalize once up front, then run cheap dot products forever after.
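
A sketch of the normalize-on-insert pattern, using an in-memory array as a stand-in for a vector store. The store, insert, and search names are hypothetical and do not reflect any particular database's API.

```javascript
function dot(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function magnitude(v) {
  return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

function normalize(v) {
  const m = magnitude(v);
  return v.map((x) => x / m); // assumes v is not all zeros
}

// Hypothetical in-memory store: vectors are normalized once, at insert time.
const store = [];
function insert(id, vec) {
  store.push({ id, unit: normalize(vec) });
}

// At query time: normalize the query once, then use plain dot products.
function search(queryVec) {
  const q = normalize(queryVec);
  return store
    .map((item) => ({ id: item.id, score: dot(q, item.unit) }))
    .sort((x, y) => y.score - x.score);
}

insert("a", [3, 4]);
insert("b", [0, 2]);

const results = search([6, 8]);
console.log(results[0].id); // "a"
// Scores match cosine similarity up to floating-point rounding:
// "a" is about 1 (same direction), "b" is about 0.8.
console.log(results.map((r) => r.score.toFixed(3))); // ["1.000", "0.800"]
```

The query still needs one normalization per search, but the per-chunk work drops to a single dot product.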
#normalization #dot-product #cosine #optimization