Exercises Notebook
Converted from exercises.ipynb for web reading.
Fine-Tuning Math: Exercises
Ten exercises cover the practical math of fine-tuning: masked SFT loss, KL movement, LoRA shapes and counts, adapter and prefix counts, DPO loss, and evaluation checklists.
Code cell 2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False
mpl.rcParams.update({
    "figure.figsize": (10, 6),
    "figure.dpi": 120,
    "font.size": 13,
    "axes.titlesize": 15,
    "axes.labelsize": 13,
    "xtick.labelsize": 11,
    "ytick.labelsize": 11,
    "legend.fontsize": 11,
    "legend.framealpha": 0.85,
    "lines.linewidth": 2.0,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "savefig.bbox": "tight",
    "savefig.dpi": 150,
})
np.random.seed(42)
print("Plot setup complete.")
Exercise 1: Answer-only SFT loss
Use a mask to average only response-token losses.
Code cell 4
# Your Solution
losses = np.array([0.1, 0.2, 1.0, 0.8])
mask = np.array([0, 0, 1, 1])
print("Starter: sum losses * mask and divide by mask.sum().")
Code cell 5
# Solution
losses = np.array([0.1, 0.2, 1.0, 0.8])
mask = np.array([0, 0, 1, 1])
answer_loss = (losses * mask).sum() / mask.sum()
print("answer loss:", answer_loss)
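The exercise starts from precomputed per-token losses. As a slightly fuller sketch, the losses can themselves be derived from logits with a softmax cross-entropy before masking; the toy logits, labels, and sequence length below are my own illustrative values, not from the exercise.

```python
import numpy as np

# Derive per-token losses from logits via softmax cross-entropy,
# then average only over response positions (mask == 1).
rng = np.random.default_rng(0)
vocab, seq_len = 5, 4
logits = rng.normal(size=(seq_len, vocab))
labels = np.array([1, 3, 0, 2])
mask = np.array([0, 0, 1, 1])  # 1 = response token, 0 = prompt token

# Log-softmax over the vocabulary dimension.
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
token_losses = -log_probs[np.arange(seq_len), labels]
answer_loss = (token_losses * mask).sum() / mask.sum()
print("masked answer loss:", answer_loss)
```

The masked average is identical to taking the mean over only the response positions, which is a useful equivalence to assert in tests.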
Exercise 2: KL movement
Compute KL between tuned and base categorical distributions.
Code cell 7
# Your Solution
base = np.array([0.6, 0.3, 0.1])
tuned = np.array([0.4, 0.4, 0.2])
print("Starter: sum tuned * (log tuned - log base).")
Code cell 8
# Solution
base = np.array([0.6, 0.3, 0.1])
tuned = np.array([0.4, 0.4, 0.2])
kl = np.sum(tuned * (np.log(tuned) - np.log(base)))
print("KL:", kl)
assert kl >= 0
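One point worth a quick check: KL divergence is asymmetric, so KL(tuned || base) and KL(base || tuned) generally differ. A small sketch on the same two distributions:

```python
import numpy as np

base = np.array([0.6, 0.3, 0.1])
tuned = np.array([0.4, 0.4, 0.2])

# Forward KL: expectation under the tuned distribution.
kl_forward = np.sum(tuned * (np.log(tuned) - np.log(base)))
# Reverse KL: expectation under the base distribution.
kl_reverse = np.sum(base * (np.log(base) - np.log(tuned)))
print("KL(tuned||base):", kl_forward)
print("KL(base||tuned):", kl_reverse)
```

Both directions are nonnegative, but only KL(tuned || base) matches the "how far did tuning move the model" reading used in this exercise.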
Exercise 3: LoRA parameter count
Count rank-8 LoRA parameters for a 4096 by 4096 matrix.
Code cell 10
# Your Solution
d_in = d_out = 4096
r = 8
print("Starter: r * (d_in + d_out).")
Code cell 11
# Solution
d_in = d_out = 4096
r = 8
count = r * (d_in + d_out)
print("LoRA params:", count)
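To put the count in perspective, it helps to compare it against the full weight matrix it adapts; this comparison is my own addition to the exercise.

```python
d_in = d_out = 4096
r = 8

lora = r * (d_in + d_out)   # trainable LoRA parameters
full = d_in * d_out         # parameters in the full weight matrix
print("LoRA params:", lora)
print("fraction of full matrix:", lora / full)
```

At rank 8 on a 4096 x 4096 matrix, LoRA trains 1/256 of the parameters of the matrix itself.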
Exercise 4: LoRA update shape
Verify that the LoRA update ΔW = BA has the same shape as W.
Code cell 13
# Your Solution
d_in, d_out, r = 10, 12, 3
print("Starter: A shape is (r,d_in), B shape is (d_out,r).")
Code cell 14
# Solution
d_in, d_out, r = 10, 12, 3
A = np.zeros((r, d_in))
B = np.zeros((d_out, r))
delta = B @ A
print("delta shape:", delta.shape)
assert delta.shape == (d_out, d_in)
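In practice the update is usually applied as W + (alpha / r) * B @ A with a scaling factor alpha; the sketch below (alpha = 16 and the random init are assumed values, not from the exercise) confirms the scaled update preserves W's shape, and that the common zero-init of B leaves W unchanged at the start of training.

```python
import numpy as np

d_in, d_out, r, alpha = 10, 12, 3, 16
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(r, d_in)) * 0.01  # small random init for A
B = np.zeros((d_out, r))               # B starts at zero, so delta starts at zero
W_adapted = W + (alpha / r) * (B @ A)
print("adapted shape:", W_adapted.shape)
```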
Exercise 5: Rank approximation
Compute relative error for a rank-2 SVD approximation.
Code cell 16
# Your Solution
M = np.arange(25, dtype=float).reshape(5, 5)
print("Starter: take SVD, keep first two singular values.")
Code cell 17
# Solution
M = np.arange(25, dtype=float).reshape(5, 5)
U, S, Vt = np.linalg.svd(M, full_matrices=False)
approx = (U[:, :2] * S[:2]) @ Vt[:2]
err = np.linalg.norm(M - approx) / np.linalg.norm(M)
print("relative error:", err)
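It is worth seeing why the rank-2 approximation is essentially exact here: this particular M (rows are arithmetic progressions) has only two nonzero singular values. A quick sweep over ranks, my own addition, makes that visible:

```python
import numpy as np

M = np.arange(25, dtype=float).reshape(5, 5)
U, S, Vt = np.linalg.svd(M, full_matrices=False)
print("singular values:", S)  # only the first two are meaningfully nonzero

# Relative error of the best rank-k approximation, for k = 1..5.
errs = []
for k in range(1, 6):
    approx = (U[:, :k] * S[:k]) @ Vt[:k]
    errs.append(np.linalg.norm(M - approx) / np.linalg.norm(M))
print("relative errors by rank:", errs)
```

The error collapses to floating-point noise at rank 2, which is the low-rank intuition behind LoRA in miniature.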
Exercise 6: Adapter count
Count bottleneck adapter parameters for d=1024 and bottleneck=32.
Code cell 19
# Your Solution
d = 1024
b = 32
print("Starter: down projection plus up projection.")
Code cell 20
# Solution
d = 1024
b = 32
count = d * b + b * d
print("adapter params:", count)
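The solution counts only the two projection matrices. If the adapter's linear layers carry bias vectors, as they often do, the count grows slightly; this bias-inclusive variant is my own extension, not part of the exercise.

```python
d = 1024
b = 32

no_bias = d * b + b * d                      # down + up projections, weights only
with_bias = (d * b + b) + (b * d + d)        # add one bias vector per projection
print("without biases:", no_bias)
print("with biases:", with_bias)
```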
Exercise 7: Prefix count
Count KV prefix parameters.
Code cell 22
# Your Solution
layers, prefix_len, heads, head_dim = 24, 16, 16, 64
print("Starter: multiply layers, prefix_len, heads, head_dim, and 2 for K/V.")
Code cell 23
# Solution
layers, prefix_len, heads, head_dim = 24, 16, 16, 64
count = layers * prefix_len * heads * head_dim * 2
print("prefix params:", count)
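A sanity check on the formula, in my own framing: heads * head_dim is the model width, so the count can equivalently be written per layer as prefix_len * d_model * 2.

```python
layers, prefix_len, heads, head_dim = 24, 16, 16, 64
d_model = heads * head_dim  # 1024

count_a = layers * prefix_len * heads * head_dim * 2
count_b = layers * prefix_len * d_model * 2
print("prefix params:", count_a, "==", count_b)
```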
Exercise 8: DPO loss
Compute DPO loss from model/reference chosen/rejected log probabilities.
Code cell 25
# Your Solution
beta = 0.2
theta_chosen, theta_rejected = -4.0, -6.0
ref_chosen, ref_rejected = -5.0, -5.5
print("Starter: margin=(theta_c-theta_r)-(ref_c-ref_r).")
Code cell 26
# Solution
beta = 0.2
theta_chosen, theta_rejected = -4.0, -6.0
ref_chosen, ref_rejected = -5.0, -5.5
margin = (theta_chosen - theta_rejected) - (ref_chosen - ref_rejected)
loss = -np.log(1 / (1 + np.exp(-beta * margin)))
print("margin:", margin)
print("loss:", loss)
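The naive -log(1 / (1 + exp(-x))) form can overflow when beta * margin is large and negative. A numerically stable sketch uses the softplus identity -log(sigmoid(x)) = log(1 + exp(-x)) = logaddexp(0, -x):

```python
import numpy as np

beta = 0.2
theta_chosen, theta_rejected = -4.0, -6.0
ref_chosen, ref_rejected = -5.0, -5.5

margin = (theta_chosen - theta_rejected) - (ref_chosen - ref_rejected)
# Stable -log(sigmoid(beta * margin)) via logaddexp.
loss_stable = np.logaddexp(0.0, -beta * margin)
print("stable DPO loss:", loss_stable)
```

For moderate margins like this one the two forms agree to machine precision; the logaddexp form only matters at extreme margins.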
Exercise 9: Scorecard
Combine task quality and retention into a weighted score.
Code cell 28
# Your Solution
task = 0.80
retention = 0.72
print("Starter: choose weights and compute weighted sum.")
Code cell 29
# Solution
task = 0.80
retention = 0.72
score = 0.7 * task + 0.3 * retention
print("combined score:", score)
Exercise 10: Checklist
Write four checks before trusting a fine-tune.
Code cell 31
# Your Solution
print("Starter: include masks, trainable params, base comparison, and retention.")
Code cell 32
# Solution
checks = [
    "answer mask excludes prompt and padding tokens",
    "trainable parameter count matches the intended method",
    "base and prompt-only baselines are evaluated",
    "retention tasks are checked after adaptation",
]
for check in checks:
    print("-", check)
assert len(checks) == 4
Closing Reflection
Fine-tuning is not just "run training." It is a choice of update space, objective, data mask, optimizer budget, and evaluation contract.