Question 1

What's the difference between a scalar and a vector?

Accepted Answer

A scalar is just a single number, like 7 or -2.5. A vector is an array of numbers, like [0.2, -1.4, 3.0] — exactly the JavaScript array you write every day. That's it. When you hear "vector" in AI, picture a plain list of floats. The whole field is just doing arithmetic on these arrays, millions at a time. A scalar holds one value; a vector holds an ordered list of values you can loop over.

Question 2

What is an embedding, really?

Accepted Answer

An embedding IS a vector — an array of numbers that captures the meaning of something. You feed a word, sentence, or chunk of text into a model, and it hands back an array like [0.12, -0.87, 0.33, ...]. Similar meanings produce similar arrays. So "dog" and "puppy" land near each other; "dog" and "taxes" land far apart. Think of it as turning fuzzy human meaning into a list of numbers a computer can actually compare and search.

Question 3

When people say an embedding has 1536 "dimensions," what does that mean?

Accepted Answer

Dimensions just means the length of the array. A 1536-dimensional embedding is an array with 1536 numbers in it: [0.01, -0.4, ... ] with 1536 slots. In code terms, embedding.length === 1536. Each slot is one number the model uses to encode some aspect of meaning. More dimensions = more room to capture nuance, but also more numbers to store and compare. Common sizes you'll see are 384, 768, 1536, 3072. Don't overthink it — it's array length.

Question 4

Why is a vector sometimes called a "direction" or an arrow?

Accepted Answer

Take a 2-number vector like [3, 4]. You can draw it as an arrow from the origin (0,0) to the point (3,4). The arrow has a direction (which way it points) and a length (how far it reaches). Embeddings live in hundreds of dimensions, so you can't actually draw them, but the same intuition holds: each vector is an arrow pointing somewhere in "meaning space." Two arrows pointing the same way mean similar things — that's the key idea behind similarity.

Question 5

Is the dot product an array or a single number?

Accepted Answer

A single number — a scalar. You combine two arrays and collapse them down to one value. Beginners sometimes expect another array out, but no: dot([1,2,3],[4,5,6]) returns 32, not a list. This trips people up because matrix multiplication (which is many dot products) DOES produce arrays. But one dot product alone = one number. Remember the shape: two equal-length arrays in, one number out. That number is your similarity-ish score.

Question 6

Is a vector the same as a JavaScript array? What about a matrix?

Accepted Answer

Pretty much, yes. A vector is a 1D array of numbers: [0.2, -1.4, 3.0]. A matrix is a 2D array — an array of arrays, like a grid: [[1,2,3],[4,5,6]], which has 2 rows and 3 columns. You'd loop a vector once and a matrix with nested loops, same as any grid you've built. In AI, a batch of embeddings is naturally a matrix: each row is one vector. So the data structures are exactly the ones you already use; the math just gives them meaning.
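To make the "batch of embeddings is a matrix" point concrete, here is a minimal sketch. The data and the helper names (`batch`, `shape`, `countNumbers`) are made up for illustration; real embeddings have hundreds of columns, not three.

```javascript
// A made-up batch of 2 embeddings with 3 dimensions each.
const batch = [
  [0.2, -1.4, 3.0], // row 0: one vector
  [1.0, 0.5, -0.3], // row 1: another vector
];

// Shape as [rows, columns]. Assumes a non-empty, rectangular matrix.
function shape(matrix) {
  return [matrix.length, matrix[0].length];
}

// Nested loops: outer loop over rows (vectors), inner loop over each number.
function countNumbers(matrix) {
  let count = 0;
  for (const row of matrix) {
    for (const _x of row) {
      count += 1;
    }
  }
  return count;
}

console.log(shape(batch));        // [ 2, 3 ]
console.log(countNumbers(batch)); // 6
```

Here `batch[1]` is an ordinary vector, and `batch[1].length` is its dimension count.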

Question 7

Do I have to write the embedding numbers myself?

Accepted Answer

No — you never hand-pick them. You call an embedding model (an API or a local model) with your text, and it returns the array: const vec = await embed("refund policy") gives back something like [0.03, -0.51, ...] with hundreds of slots. The model learned those numbers during training. Your job is just to store them and compare them with dot product or cosine. Think of the embedding model as a function text -> array of floats that you treat as a black box.

Question 8

What is the magnitude (length) of a vector, in plain terms?

Accepted Answer

The magnitude is how long the arrow is — the distance from the origin to the vector's point. You compute it like the Pythagorean theorem you learned for triangles: square each number, add them, take the square root. For [3, 4]: sqrt(3*3 + 4*4) = sqrt(9 + 16) = sqrt(25) = 5. So that vector has length 5. In code: Math.sqrt(v.reduce((s,x) => s + x*x, 0)). It's just "how far does this arrow reach."
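The same recipe works for any number of dimensions, not just two. A small sketch, where `magnitude` is just a name chosen here, with the built-in `Math.hypot` shown as an alternative for short vectors:

```javascript
// Square each number, add them up, take the square root.
function magnitude(v) {
  return Math.sqrt(v.reduce((s, x) => s + x * x, 0));
}

console.log(magnitude([3, 4]));    // 5
console.log(magnitude([1, 2, 2])); // 3  (sqrt(1 + 4 + 4) = sqrt(9))

// Math.hypot does the same job when you spread the array into arguments.
console.log(Math.hypot(...[3, 4])); // 5
```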

Question 9

How do you compute a dot product?

Accepted Answer

Multiply matching positions, then add it all up: a zip-then-sum. For a = [1, 2, 3] and b = [4, 5, 6]: 1*4 + 2*5 + 3*6 = 4 + 10 + 18 = 32. In code it's a one-liner: a.reduce((sum, ai, i) => sum + ai*b[i], 0). Both arrays must be the same length. JavaScript won't throw if they aren't: when a is longer, b[i] is undefined past the end of b and the result is NaN; when a is shorter, the extra entries of b are silently ignored. The result is a single number (a scalar), not another array. This one little operation is the workhorse behind almost everything in an LLM.

Question 10

Show me the dot product as code I'd actually write.

Accepted Answer

It's a single loop:
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}
Or the functional version: a.reduce((s, x, i) => s + x*b[i], 0). Pass dot([1,2,3],[4,5,6]) and you get 32. Notice you read both arrays at the same index i — that's why they must be equal length. This exact pattern, repeated across many rows, is what a neural-network layer and attention both run.
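If you'd rather fail loudly on mismatched lengths than get a quiet wrong answer, one option is to check up front. This is a sketch, not a standard API; the name `safeDot` and the choice of `RangeError` are assumptions made here.

```javascript
// Dot product that refuses mismatched lengths instead of guessing.
function safeDot(a, b) {
  if (a.length !== b.length) {
    throw new RangeError(`length mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

console.log(safeDot([1, 2, 3], [4, 5, 6])); // 32
```

Calling `safeDot([1, 2, 3], [4, 5])` throws instead of returning a number.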

Question 11

What does the dot product actually MEAN geometrically?

Accepted Answer

It tells you how much two arrows point the same way. Big positive dot product = they point in a similar direction. Around zero = they're perpendicular (unrelated). Negative = they point in roughly opposite directions. Quick feel: [1,0] dot [1,0] = 1 (same way), [1,0] dot [0,1] = 0 (perpendicular), [1,0] dot [-1,0] = -1 (opposite). So a high dot product between two embeddings hints they mean similar things — though length sneaks in, which is why cosine often wins.
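The three cases above, plus the "length sneaks in" caveat, can be checked in a few lines. The variable names here are illustrative only.

```javascript
const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);

const east = [1, 0]; // an arrow pointing along the first axis

const results = {
  same: dot(east, [1, 0]),          // 1
  perpendicular: dot(east, [0, 1]), // 0
  opposite: dot(east, [-1, 0]),     // -1
  sameButLonger: dot(east, [5, 0]), // 5: same direction, bigger score
};

console.log(results);
```

The last entry is the catch: the direction is identical to the first case, but the score is five times larger because the second arrow is five times longer.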

Question 12

What is cosine similarity and how do you compute it?

Accepted Answer

Cosine similarity is the dot product divided by both vectors' lengths. The formula: cos = dot(a,b) / (length(a) * length(b)). Dividing out the lengths cancels magnitude, so you're left with pure direction — a score from -1 to 1. For a=[1,0], b=[1,1]: dot = 1, length(a) = 1, length(b) = sqrt(2) ≈ 1.414, so cos ≈ 1/1.414 ≈ 0.707. That 0.707 means "fairly similar direction" (a 45-degree angle). 1 = identical direction, 0 = perpendicular, -1 = opposite.
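Here is the worked example as code, as a minimal sketch. The helper names are chosen for this example, and it assumes neither vector is all zeros (a zero vector has no direction, so the division yields NaN).

```javascript
function dot(a, b) {
  return a.reduce((s, x, i) => s + x * b[i], 0);
}

function magnitude(v) {
  return Math.sqrt(v.reduce((s, x) => s + x * x, 0));
}

// Dot product divided by the product of the two lengths.
function cosine(a, b) {
  return dot(a, b) / (magnitude(a) * magnitude(b));
}

const score = cosine([1, 0], [1, 1]);
console.log(score.toFixed(3)); // "0.707"
```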

Question 13

Why does cosine similarity range from -1 to 1, and what do the ends mean?

Accepted Answer

Because you divide out the lengths, cosine only measures the ANGLE between two arrows. Same direction = angle 0 = score 1 (max similarity). Perpendicular = 90 degrees = score 0 (unrelated). Opposite = 180 degrees = score -1 (max dissimilarity). For text embeddings you usually see scores between roughly 0 and 1, since most meanings aren't true opposites. A higher number means "more alike in meaning." It's a clean, size-independent similarity score — that's why search systems love it.
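If it helps to see the angle itself, you can turn a cosine score back into degrees with `Math.acos`. A small sketch, where `angleDegrees` is a name made up here; the clamp is there because floating-point rounding can push a score slightly past 1 or -1, which would make `Math.acos` return NaN.

```javascript
// Convert a cosine similarity score into the angle between the two arrows.
function angleDegrees(cos) {
  const clamped = Math.min(1, Math.max(-1, cos)); // guard against rounding drift
  return (Math.acos(clamped) * 180) / Math.PI;
}

console.log(Math.round(angleDegrees(1)));     // 0   (same direction)
console.log(Math.round(angleDegrees(0)));     // 90  (perpendicular)
console.log(Math.round(angleDegrees(-1)));    // 180 (opposite)
console.log(Math.round(angleDegrees(0.707))); // 45
```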

Question 14

If two embeddings have cosine similarity of 0, what does that tell me?

Accepted Answer

It means the two vectors are perpendicular — they share no common direction, so the model sees their meanings as essentially unrelated. Picture arrows at a 90-degree angle: neither reinforces nor opposes the other. In search terms, a chunk scoring 0 against your query is off-topic and you'd skip it. Scores near 1 are strong matches, scores near 0 are unrelated, and negative scores (rare for text) suggest opposing meaning. So 0 is your "meh, not relevant" signal.

Question 15

Why is cosine similarity THE operation behind semantic search and RAG?

Accepted Answer

In RAG you embed every document chunk once and store the vectors. At query time you embed the user's question into a vector, then compute cosine similarity between that query vector and every chunk vector — chunks.map(c => cosine(queryVec, c.vec)) — and return the top few highest scores. Those are the chunks closest in MEANING, not just keyword match. You feed them to the LLM as context. So "find relevant docs" becomes "find the vectors pointing the same direction as my question." That one comparison powers the whole retrieval step.

Question 16

Why does "closer vectors = more similar meaning" actually hold?

Accepted Answer

Because the embedding model was trained to put related text near each other in the array space. During training it nudged vectors so that things appearing in similar contexts ended up pointing similar directions. So "refund" and "money back" land close; "refund" and "giraffe" land far. The geometry isn't magic — it's the learned result of optimization. That's why a similarity score on the raw numbers reflects real semantic closeness, and why you can search by meaning instead of exact words.

Question 17

What's the difference between dot product and cosine similarity, and when do I use each?

Accepted Answer

Dot product mixes BOTH direction and length, so a longer vector scores higher even if its direction is the same. Cosine divides out the lengths, so it only measures direction — fairer when you only care about meaning, not magnitude. Use cosine for general semantic search where vectors vary in length. Use raw dot product when your vectors are already normalized (all length 1), because then dot product and cosine give the identical answer — and dot is cheaper to compute. Many vector DBs normalize for exactly this reason.

Question 18

What does it mean to normalize a vector, and how do I do it?

Accepted Answer

Normalizing means rescaling a vector so its length becomes exactly 1, while keeping its direction. You divide every number by the vector's magnitude. For [3, 4] (length 5): [3/5, 4/5] = [0.6, 0.8], which now has length sqrt(0.36 + 0.64) = sqrt(1) = 1. In code: const m = Math.sqrt(v.reduce((s,x)=>s+x*x,0)); const unit = v.map(x => x/m);. After normalizing, the only thing left is direction — so the dot product of two normalized vectors equals their cosine similarity.
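Wrapped up as a reusable function, that might look like the sketch below. The name `normalize` and the decision to throw on a zero vector are choices made for this example; a zero vector has no direction, so there is nothing sensible to return.

```javascript
// Rescale a vector to length 1 while keeping its direction.
function normalize(v) {
  const m = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  if (m === 0) {
    throw new RangeError("cannot normalize a zero vector");
  }
  return v.map(x => x / m);
}

const unit = normalize([3, 4]);
console.log(unit); // [ 0.6, 0.8 ]
```

Normalizing an already-normalized vector changes nothing beyond floating-point rounding, so it is safe to call twice.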

Question 19

How would I find the most similar chunk to a query, in code?

Accepted Answer

Score every chunk and pick the best:
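// Cosine similarity: dot product divided by the product of the two magnitudes.
function cosine(a, b) {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const ma = Math.sqrt(a.reduce((s, x) => s + x * x, 0));
  const mb = Math.sqrt(b.reduce((s, x) => s + x * x, 0));
  return dot / (ma * mb); // NaN if either vector is all zeros
}
// Score each chunk against the query, highest score first.
const ranked = chunks
  .map(c => ({c, score: cosine(queryVec, c.vec)}))
  .sort((a, b) => b.score - a.score);
const top3 = ranked.slice(0, 3);
That's the heart of a vector search: map to scores, sort descending, take the top. Real vector DBs produce the same kind of ranking, but usually with an approximate nearest-neighbor index so they don't have to score every vector.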

Question 20

Why can't I take the dot product of two different-length arrays?

Accepted Answer

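Because the dot product pairs up matching positions: a[0]*b[0], a[1]*b[1], and so on. If a has 3 numbers and b has 4, the last entry of b (index 3) has no partner, so the operation is undefined. JavaScript won't stop you, though. A loop over the shorter array silently ignores the extra entry and returns a plausible-looking wrong number; a loop over the longer array reads the missing entry as undefined and produces NaN. The rule "shapes must line up" shows up everywhere in this math. For embeddings it means you can only compare vectors from the SAME model: a 768-dim vector and a 1536-dim vector simply can't be compared, and even two different models that happen to share a dimension count arrange their meaning spaces differently, so scores across models are meaningless.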
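To see what an unguarded dot product does with mismatched arrays in JavaScript, here is a quick sketch. `naiveDot` is the usual reduce one-liner; the variable names are illustrative.

```javascript
// The usual one-liner: iterates over the FIRST array's length only.
const naiveDot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);

const shorter = [1, 2, 3];
const longer = [4, 5, 6, 7];

// Shorter array first: the trailing 7 is never read, so you get a number
// that looks fine but ignores part of the input.
const truncated = naiveDot(shorter, longer);

// Longer array first: the last step computes 7 * undefined, which is NaN,
// and NaN poisons the running sum.
const broken = naiveDot(longer, shorter);

console.log(truncated); // 32
console.log(broken);    // NaN
```

Neither call throws, which is why a length check before comparing is worth having.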

Question 21

Why is raw dot product "biased" by length, and why does that matter?

Accepted Answer

Because dot product grows when either vector is longer, two vectors pointing the same direction can score very differently if one is bigger. For example, [1,0] dot [1,0] = 1, but [1,0] dot [10,0] = 10: same direction, ten times the score, purely from length. If the magnitudes of your stored vectors vary (the vector's magnitude, not how many words the chunk has), raw dot product can rank a larger-magnitude but less relevant chunk above a smaller perfect match. Cosine fixes this by dividing out length, so you compare direction only. Removing that bias is the main reason to reach for cosine.
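A tiny made-up example shows the two scores disagreeing about which candidate is best. The vectors and ids below are invented for illustration: "exact" points the same way as the query, while "big" points elsewhere but has magnitude 5.

```javascript
const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);
const magnitude = v => Math.sqrt(dot(v, v));
const cosine = (a, b) => dot(a, b) / (magnitude(a) * magnitude(b));

const query = [1, 0];
const candidates = [
  { id: "exact", vec: [1, 0] }, // same direction as the query, magnitude 1
  { id: "big", vec: [3, 4] },   // different direction, magnitude 5
];

// Return the id of the top-scoring candidate under the given score function.
function best(scoreFn) {
  return candidates
    .map(c => ({ id: c.id, score: scoreFn(query, c.vec) }))
    .sort((a, b) => b.score - a.score)[0].id;
}

console.log(best(dot));    // "big"   (dot scores: exact = 1, big = 3)
console.log(best(cosine)); // "exact" (cosine scores: exact = 1, big = 0.6)
```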

Question 22

Once I've normalized all my vectors, can I skip cosine and just use dot product?

Accepted Answer

Yes — and that's a common speed trick. If every vector already has length 1, then length(a) * length(b) is 1 * 1 = 1, so cosine = dot product divided by 1 = just the dot product. You get the identical similarity score with fewer operations, because you skip computing the two square roots at query time. Many vector databases normalize on insert for exactly this reason. So: normalize once up front, then run cheap dot products forever after.
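Put together, "normalize once, then dot product" could look like the sketch below. The in-memory `store`, the ids, and the `search` function are all hypothetical stand-ins for whatever your vector database does internally.

```javascript
const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);

function normalize(v) {
  const m = Math.sqrt(dot(v, v));
  return v.map(x => x / m); // assumes v is not all zeros
}

// Insert time: normalize each vector once and store the unit-length result.
const store = [
  { id: "a", vec: normalize([3, 4]) },
  { id: "b", vec: normalize([0, 2]) },
  { id: "c", vec: normalize([-1, 0]) },
];

// Query time: normalize the query once, then plain dot products stand in
// for cosine. No square roots are computed for the stored vectors here.
function search(queryVec, k) {
  const q = normalize(queryVec);
  return store
    .map(item => ({ id: item.id, score: dot(q, item.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const hits = search([10, 0], 2);
console.log(hits.map(h => h.id)); // [ 'a', 'b' ]
```

The query's magnitude of 10 has no effect on the ranking, because it is normalized before scoring.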

Vectors, Embeddings & Similarity