Exercises Notebook
Converted from exercises.ipynb for web reading.
RAG Math and Retrieval: Exercises
Ten exercises cover vector similarity, sparse scoring, contrastive loss, retrieval metrics, diversity, context packing, fusion, and trace diagnostics.
Code cell 2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False
mpl.rcParams.update({
    "figure.figsize": (10, 6),
    "figure.dpi": 120,
    "font.size": 13,
    "axes.titlesize": 15,
    "axes.labelsize": 13,
    "xtick.labelsize": 11,
    "ytick.labelsize": 11,
    "legend.fontsize": 11,
    "legend.framealpha": 0.85,
    "lines.linewidth": 2.0,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "savefig.bbox": "tight",
    "savefig.dpi": 150,
})
np.random.seed(42)
print("Plot setup complete.")
Exercise 1: Cosine similarity
Compute the cosine similarity of two vectors from their dot product and norms.
Code cell 4
# Your Solution
a = np.array([1.0, 1.0])
b = np.array([2.0, 0.0])
print("Starter: dot(a,b)/(norm(a)*norm(b)).")
Code cell 5
# Solution
a = np.array([1.0, 1.0])
b = np.array([2.0, 0.0])
cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print("cosine:", cos)
Exercise 2: Top-k retrieval
Return the indices of the top-2 scores.
Code cell 7
# Your Solution
scores = np.array([0.2, 0.9, 0.4, 0.8])
print("Starter: argsort scores and take last two.")
Code cell 8
# Solution
scores = np.array([0.2, 0.9, 0.4, 0.8])
top2 = np.argsort(scores)[-2:][::-1]
print("top2:", top2)
Exercise 3: BM25-style term score
Compute the saturated term-frequency score for a single query term.
Code cell 10
# Your Solution
tf, idf, k1 = 3, 1.2, 1.5
print("Starter: idf * tf*(k1+1)/(tf+k1).")
Code cell 11
# Solution
tf, idf, k1 = 3, 1.2, 1.5
score = idf * tf * (k1 + 1) / (tf + k1)
print("score:", score)
Exercise 4: Contrastive loss
Compute the softmax cross-entropy loss for a score vector whose positive is at index 0.
Code cell 13
# Your Solution
scores = np.array([2.0, 1.0, 0.0])
print("Starter: -log(exp(score0)/sum exp(scores)).")
Code cell 14
# Solution
scores = np.array([2.0, 1.0, 0.0])
e = np.exp(scores - scores.max())
loss = -np.log(e[0] / e.sum())
print("loss:", loss)
Exercise 5: Recall@k
Compute Recall@3 for a ranked list.
Code cell 16
# Your Solution
ranked = ["a", "b", "c"]
relevant = {"c", "d"}
print("Starter: intersection size divided by relevant size.")
Code cell 17
# Solution
ranked = ["a", "b", "c"]
relevant = {"c", "d"}
recall = len(set(ranked) & relevant) / len(relevant)
print("Recall@3:", recall)
Exercise 6: MRR
Compute the reciprocal rank of the first relevant document.
Code cell 19
# Your Solution
ranked = ["a", "b", "c"]
relevant = {"c"}
print("Starter: reciprocal of first rank containing relevant doc.")
Code cell 20
# Solution
ranked = ["a", "b", "c"]
relevant = {"c"}
rr = next((1 / i for i, d in enumerate(ranked, 1) if d in relevant), 0)
print("RR:", rr)
Exercise 7: MMR score
Compute the MMR score for a single candidate.
Code cell 22
# Your Solution
rel = 0.9
max_sim_to_selected = 0.6
lam = 0.7
print("Starter: lam*rel - (1-lam)*max_sim.")
Code cell 23
# Solution
rel = 0.9
max_sim_to_selected = 0.6
lam = 0.7
score = lam * rel - (1 - lam) * max_sim_to_selected
print("MMR score:", score)
Exercise 8: Pack chunks
Greedily pack chunks under a token budget.
Code cell 25
# Your Solution
lengths = np.array([100, 250, 200])
budget = 300
print("Starter: add chunks while used+length <= budget.")
Code cell 26
# Solution
lengths = np.array([100, 250, 200])
budget = 300
used = 0
chosen = []
for i, l in enumerate(lengths):
    if used + l <= budget:
        chosen.append(i)
        used += l
print("chosen:", chosen, "used:", used)
Exercise 9: RRF
Compute the reciprocal-rank-fusion contribution for a document at rank 3 with k=60.
Code cell 28
# Your Solution
rank = 3
k = 60
print("Starter: 1/(k+rank).")
Code cell 29
# Solution
rank = 3
k = 60
score = 1 / (k + rank)
print("RRF contribution:", score)
Exercise 10: Trace checklist
Write four items that a RAG trace should log.
Code cell 31
# Your Solution
print("Starter: include query, retrieved chunks, prompt, and answer citations.")
Code cell 32
# Solution
checks = [
    "query text and embedding norm",
    "retrieved chunk ids, scores, and text",
    "final packed prompt",
    "answer claims mapped to citations",
]
for check in checks:
print("-", check)
Closing Reflection
RAG quality is the product of retrieval, ranking, context packing, and generation. Debug each stage in isolation before changing everything at once.