Exercises Notebook
Converted from exercises.ipynb for web reading.
Fine-Tuning Math: Exercises
Ten exercises cover the practical math of fine-tuning: masked SFT loss, KL movement, LoRA shapes and counts, adapter and prefix counts, DPO loss, and evaluation checklists.
Code cell 2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False
mpl.rcParams.update({
    "figure.figsize": (10, 6),
    "figure.dpi": 120,
    "font.size": 13,
    "axes.titlesize": 15,
    "axes.labelsize": 13,
    "xtick.labelsize": 11,
    "ytick.labelsize": 11,
    "legend.fontsize": 11,
    "legend.framealpha": 0.85,
    "lines.linewidth": 2.0,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "savefig.bbox": "tight",
    "savefig.dpi": 150,
})
np.random.seed(42)
print("Plot setup complete.")
Exercise 1: Answer-only SFT loss
Use a mask to average only response-token losses.
Code cell 4
# Your Solution
losses = np.array([0.1, 0.2, 1.0, 0.8])
mask = np.array([0, 0, 1, 1])
print("Starter: sum losses * mask and divide by mask.sum().")
Code cell 5
# Solution
losses = np.array([0.1, 0.2, 1.0, 0.8])
mask = np.array([0, 0, 1, 1])
answer_loss = (losses * mask).sum() / mask.sum()
print("answer loss:", answer_loss)
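The exercise starts from precomputed per-token losses. As a slightly fuller sketch, the losses can themselves be derived from logits with a softmax cross-entropy before masking; the toy logits, labels, and sequence length below are my own illustrative values, not from the exercise.

```python
import numpy as np

# Derive per-token losses from logits via softmax cross-entropy,
# then average only over response positions (mask == 1).
rng = np.random.default_rng(0)
vocab, seq_len = 5, 4
logits = rng.normal(size=(seq_len, vocab))
labels = np.array([1, 3, 0, 2])
mask = np.array([0, 0, 1, 1])  # 1 = response token, 0 = prompt token

# Log-softmax over the vocabulary dimension.
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
token_losses = -log_probs[np.arange(seq_len), labels]
answer_loss = (token_losses * mask).sum() / mask.sum()
print("masked answer loss:", answer_loss)
```

The masked average is identical to taking the mean over only the response positions, which is a useful equivalence to assert in tests.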
Exercise 2: KL movement
Compute KL between tuned and base categorical distributions.
Code cell 7
# Your Solution
base = np.array([0.6, 0.3, 0.1])
tuned = np.array([0.4, 0.4, 0.2])
print("Starter: sum tuned * (log tuned - log base).")
Code cell 8
# Solution
base = np.array([0.6, 0.3, 0.1])
tuned = np.array([0.4, 0.4, 0.2])
kl = np.sum(tuned * (np.log(tuned) - np.log(base)))
print("KL:", kl)
assert kl >= 0
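One point worth a quick check: KL divergence is asymmetric, so KL(tuned || base) and KL(base || tuned) generally differ. A small sketch on the same two distributions:

```python
import numpy as np

base = np.array([0.6, 0.3, 0.1])
tuned = np.array([0.4, 0.4, 0.2])

# Forward KL: expectation under the tuned distribution.
kl_forward = np.sum(tuned * (np.log(tuned) - np.log(base)))
# Reverse KL: expectation under the base distribution.
kl_reverse = np.sum(base * (np.log(base) - np.log(tuned)))
print("KL(tuned||base):", kl_forward)
print("KL(base||tuned):", kl_reverse)
```

Both directions are nonnegative, but only KL(tuned || base) matches the "how far did tuning move the model" reading used in this exercise.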
Exercise 3: LoRA parameter count
Count rank-8 LoRA parameters for a 4096 by 4096 matrix.
Code cell 10
# Your Solution
d_in = d_out = 4096
r = 8
print("Starter: r * (d_in + d_out).")
Code cell 11
# Solution
d_in = d_out = 4096
r = 8
count = r * (d_in + d_out)
print("LoRA params:", count)
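To put the count in perspective, it helps to compare it against the full weight matrix it adapts; this comparison is my own addition to the exercise.

```python
d_in = d_out = 4096
r = 8

lora = r * (d_in + d_out)   # trainable LoRA parameters
full = d_in * d_out         # parameters in the full weight matrix
print("LoRA params:", lora)
print("fraction of full matrix:", lora / full)
```

At rank 8 on a 4096 x 4096 matrix, LoRA trains 1/256 of the parameters of the matrix itself.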
Exercise 4: LoRA update shape
Verify that the LoRA update ΔW = BA has the same shape as W.
Code cell 13
# Your Solution
d_in, d_out, r = 10, 12, 3
print("Starter: A shape is (r,d_in), B shape is (d_out,r).")
Code cell 14
# Solution
d_in, d_out, r = 10, 12, 3
A = np.zeros((r, d_in))
B = np.zeros((d_out, r))
delta = B @ A
print("delta shape:", delta.shape)
assert delta.shape == (d_out, d_in)
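In practice the update is usually applied as W + (alpha / r) * B @ A with a scaling factor alpha; the sketch below (alpha = 16 and the random init are assumed values, not from the exercise) confirms the scaled update preserves W's shape, and that the common zero-init of B leaves W unchanged at the start of training.

```python
import numpy as np

d_in, d_out, r, alpha = 10, 12, 3, 16
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(r, d_in)) * 0.01  # small random init for A
B = np.zeros((d_out, r))               # B starts at zero, so delta starts at zero
W_adapted = W + (alpha / r) * (B @ A)
print("adapted shape:", W_adapted.shape)
```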
Exercise 5: Rank approximation
Compute relative error for a rank-2 SVD approximation.
Code cell 16
# Your Solution
M = np.arange(25, dtype=float).reshape(5, 5)
print("Starter: take SVD, keep first two singular values.")
Code cell 17
# Solution
M = np.arange(25, dtype=float).reshape(5, 5)
U, S, Vt = np.linalg.svd(M, full_matrices=False)
approx = (U[:, :2] * S[:2]) @ Vt[:2]
err = np.linalg.norm(M - approx) / np.linalg.norm(M)
print("relative error:", err)
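It is worth seeing why the rank-2 approximation is essentially exact here: this particular M (rows are arithmetic progressions) has only two nonzero singular values. A quick sweep over ranks, my own addition, makes that visible:

```python
import numpy as np

M = np.arange(25, dtype=float).reshape(5, 5)
U, S, Vt = np.linalg.svd(M, full_matrices=False)
print("singular values:", S)  # only the first two are meaningfully nonzero

# Relative error of the best rank-k approximation, for k = 1..5.
errs = []
for k in range(1, 6):
    approx = (U[:, :k] * S[:k]) @ Vt[:k]
    errs.append(np.linalg.norm(M - approx) / np.linalg.norm(M))
print("relative errors by rank:", errs)
```

The error collapses to floating-point noise at rank 2, which is the low-rank intuition behind LoRA in miniature.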
Exercise 6: Adapter count
Count bottleneck adapter parameters for d=1024 and bottleneck=32.
Code cell 19
# Your Solution
d = 1024
b = 32
print("Starter: down projection plus up projection.")
Code cell 20
# Solution
d = 1024
b = 32
count = d * b + b * d
print("adapter params:", count)
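The solution counts only the two projection matrices. If the adapter's linear layers carry bias vectors, as they often do, the count grows slightly; this bias-inclusive variant is my own extension, not part of the exercise.

```python
d = 1024
b = 32

no_bias = d * b + b * d                      # down + up projections, weights only
with_bias = (d * b + b) + (b * d + d)        # add one bias vector per projection
print("without biases:", no_bias)
print("with biases:", with_bias)
```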
Exercise 7: Prefix count
Count KV prefix parameters.
Code cell 22
# Your Solution
layers, prefix_len, heads, head_dim = 24, 16, 16, 64
print("Starter: multiply layers, prefix_len, heads, head_dim, and 2 for K/V.")
Code cell 23
# Solution
layers, prefix_len, heads, head_dim = 24, 16, 16, 64
count = layers * prefix_len * heads * head_dim * 2
print("prefix params:", count)
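A sanity check on the formula, in my own framing: heads * head_dim is the model width, so the count can equivalently be written per layer as prefix_len * d_model * 2.

```python
layers, prefix_len, heads, head_dim = 24, 16, 16, 64
d_model = heads * head_dim  # 1024

count_a = layers * prefix_len * heads * head_dim * 2
count_b = layers * prefix_len * d_model * 2
print("prefix params:", count_a, "==", count_b)
```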
Exercise 8: DPO loss
Compute DPO loss from model/reference chosen/rejected log probabilities.
Code cell 25
# Your Solution
beta = 0.2
theta_chosen, theta_rejected = -4.0, -6.0
ref_chosen, ref_rejected = -5.0, -5.5
print("Starter: margin=(theta_c-theta_r)-(ref_c-ref_r).")
Code cell 26
# Solution
beta = 0.2
theta_chosen, theta_rejected = -4.0, -6.0
ref_chosen, ref_rejected = -5.0, -5.5
margin = (theta_chosen - theta_rejected) - (ref_chosen - ref_rejected)
loss = -np.log(1 / (1 + np.exp(-beta * margin)))
print("margin:", margin)
print("loss:", loss)
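The naive -log(1 / (1 + exp(-x))) form can overflow when beta * margin is large and negative. A numerically stable sketch uses the softplus identity -log(sigmoid(x)) = log(1 + exp(-x)) = logaddexp(0, -x):

```python
import numpy as np

beta = 0.2
theta_chosen, theta_rejected = -4.0, -6.0
ref_chosen, ref_rejected = -5.0, -5.5

margin = (theta_chosen - theta_rejected) - (ref_chosen - ref_rejected)
# Stable -log(sigmoid(beta * margin)) via logaddexp.
loss_stable = np.logaddexp(0.0, -beta * margin)
print("stable DPO loss:", loss_stable)
```

For moderate margins like this one the two forms agree to machine precision; the logaddexp form only matters at extreme margins.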
Exercise 9: Scorecard
Combine task quality and retention into a weighted score.
Code cell 28
# Your Solution
task = 0.80
retention = 0.72
print("Starter: choose weights and compute weighted sum.")
Code cell 29
# Solution
task = 0.80
retention = 0.72
score = 0.7 * task + 0.3 * retention
print("combined score:", score)
Exercise 10: Checklist
Write four checks before trusting a fine-tune.
Code cell 31
# Your Solution
print("Starter: include masks, trainable params, base comparison, and retention.")
Code cell 32
# Solution
checks = [
    "answer mask excludes prompt and padding tokens",
    "trainable parameter count matches the intended method",
    "base and prompt-only baselines are evaluated",
    "retention tasks are checked after adaptation",
]
for check in checks:
    print("-", check)
assert len(checks) == 4
Closing Reflection
Fine-tuning is not just "run training." It is a choice of update space, objective, data mask, optimizer budget, and evaluation contract.