Linear Models: Exercises

Converted from exercises.ipynb for web reading.

Ten exercises cover affine prediction, least squares, gradients, ridge regularization, logistic and softmax probabilities, standardization, and diagnostics.

Code cell 2

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False

mpl.rcParams.update({
    "figure.figsize":    (10, 6),
    "figure.dpi":         120,
    "font.size":           13,
    "axes.titlesize":      15,
    "axes.labelsize":      13,
    "xtick.labelsize":     11,
    "ytick.labelsize":     11,
    "legend.fontsize":     11,
    "legend.framealpha":   0.85,
    "lines.linewidth":      2.0,
    "axes.spines.top":     False,
    "axes.spines.right":   False,
    "savefig.bbox":       "tight",
    "savefig.dpi":         150,
})
np.random.seed(42)
print("Plot setup complete.")

Exercise 1: Affine prediction

Compute one affine prediction.

Code cell 4

# Your Solution
x = np.array([2.0, 3.0])
w = np.array([0.5, -1.0])
b = 4.0
print("Starter: w@x+b.")

Code cell 5

# Solution
x = np.array([2.0, 3.0])
w = np.array([0.5, -1.0])
b = 4.0
print("prediction:", w @ x + b)

Exercise 2: Least squares

Solve a tiny least-squares problem.

Code cell 7

# Your Solution
X = np.array([[1., 1.], [1., 2.], [1., 3.]])
y = np.array([1., 2., 2.])
print("Starter: solve X.T@X w = X.T@y.")

Code cell 8

# Solution
X = np.array([[1., 1.], [1., 2.], [1., 3.]])
y = np.array([1., 2., 2.])
w = np.linalg.solve(X.T @ X, X.T @ y)
print("w:", w)

Exercise 3: Squared-loss gradient

Compute the squared-loss gradient X.T @ (X @ w - y).

Code cell 10

# Your Solution
X = np.eye(2)
y = np.array([1., -1.])
w = np.array([0.2, 0.3])
print("Starter: X.T@(X@w-y).")

Code cell 11

# Solution
X = np.eye(2)
y = np.array([1., -1.])
w = np.array([0.2, 0.3])
grad = X.T @ (X @ w - y)
print("grad:", grad)

Exercise 4: Gradient step

Take one gradient descent step.

Code cell 13

# Your Solution
w = np.array([1.0, -1.0])
grad = np.array([0.5, -0.25])
lr = 0.1
print("Starter: w_new=w-lr*grad.")

Code cell 14

# Solution
w = np.array([1.0, -1.0])
grad = np.array([0.5, -0.25])
lr = 0.1
w_new = w - lr * grad
print("w_new:", w_new)

Exercise 5: Ridge solution

Compute the ridge solution for an identity design matrix X.

Code cell 16

# Your Solution
X = np.eye(2)
y = np.array([2.0, 4.0])
lam = 1.0
print("Starter: solve (X.T@X+lam*I)w=X.T@y.")

Code cell 17

# Solution
X = np.eye(2)
y = np.array([2.0, 4.0])
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)
print("w:", w)

Exercise 6: Shrinkage path

Compare ridge coefficients for two lambdas.

Code cell 19

# Your Solution
XtX = np.eye(2)
Xty = np.array([2.0, 4.0])
print("Starter: solve for lambda 0 and 3.")

Code cell 20

# Solution
XtX = np.eye(2)
Xty = np.array([2.0, 4.0])
for lam in [0.0, 3.0]:
    w = np.linalg.solve(XtX + lam * np.eye(2), Xty)
    print(lam, w)
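Sweeping lambda over a range traces the whole shrinkage path; since XtX = I here, each coefficient follows Xty_i / (1 + lambda). A plotting sketch using the setup above:

lams = np.linspace(0.0, 10.0, 50)
path = np.array([np.linalg.solve(XtX + lam * np.eye(2), Xty) for lam in lams])
plt.plot(lams, path[:, 0], label="w[0]")
plt.plot(lams, path[:, 1], label="w[1]")
plt.xlabel("lambda")
plt.ylabel("coefficient")
plt.title("Ridge shrinkage path")
plt.legend()
plt.show()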

Exercise 7: Logistic probability

Compute the sigmoid probability and the binary cross-entropy (BCE) loss.

Code cell 22

# Your Solution
z = 1.5
y = 1
print("Starter: p=1/(1+exp(-z)).")

Code cell 23

# Solution
z = 1.5
y = 1
p = 1 / (1 + np.exp(-z))
loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
print("p:", p, "loss:", loss)

Exercise 8: Softmax

Compute softmax probabilities.

Code cell 25

# Your Solution
logits = np.array([1.0, 2.0, 0.0])
print("Starter: subtract max, exp, normalize.")

Code cell 26

# Solution
logits = np.array([1.0, 2.0, 0.0])
e = np.exp(logits - logits.max())
p = e / e.sum()
print("p:", p)

Exercise 9: Standardization

Standardize both splits using the training mean and standard deviation.

Code cell 28

# Your Solution
train = np.array([1.0, 2.0, 3.0])
test = np.array([4.0])
print("Starter: use train mean/std on both train and test.")

Code cell 29

# Solution
train = np.array([1.0, 2.0, 3.0])
test = np.array([4.0])
mu, sigma = train.mean(), train.std()
print("train standardized:", (train - mu) / sigma)
print("test standardized:", (test - mu) / sigma)

Exercise 10: Checklist

Write four linear-model diagnostics.

Code cell 31

# Your Solution
print("Starter: include shapes, scaling, rank, residuals.")

Code cell 32

# Solution
checks = [
    "X, y, and w shapes match",
    "scaling uses train-only statistics",
    "rank and condition number are inspected",
    "residuals are plotted on validation data",
]
for check in checks:
    print("-", check)

Closing Reflection

Linear models are the best place to learn the full supervised-learning loop because every shape, gradient, regularizer, and diagnostic is visible.