Vector Spaces & Subspaces - Math for LLMs


Linear Algebra Basics


Exercises Notebook

Converted from exercises.ipynb for web reading.

Vector Spaces & Subspaces - Exercises

10 graded exercises covering the full linear algebra basics arc, from computation to ML-facing matrix workflows.

Format          Description
--------------  -----------------------------------
Problem         Markdown cell with task description
Your Solution   Code cell for learner work
Solution        Reference solution with checks
Difficulty: straightforward -> moderate -> challenging.

Code cell 2

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False

mpl.rcParams.update({
    "figure.figsize":    (10, 6),
    "figure.dpi":         120,
    "font.size":           13,
    "axes.titlesize":      15,
    "axes.labelsize":      13,
    "xtick.labelsize":     11,
    "ytick.labelsize":     11,
    "legend.fontsize":     11,
    "legend.framealpha":   0.85,
    "lines.linewidth":      2.0,
    "axes.spines.top":     False,
    "axes.spines.right":   False,
    "savefig.bbox":       "tight",
    "savefig.dpi":         150,
})
np.random.seed(42)
print("Plot setup complete.")

Code cell 3

import numpy as np
import numpy.linalg as la
import scipy.linalg as sla
from scipy import stats

COLORS = {
    "primary": "#0077BB",
    "secondary": "#EE7733",
    "tertiary": "#009988",
    "error": "#CC3311",
    "neutral": "#555555",
    "highlight": "#EE3377",
}
HAS_MPL = True
np.set_printoptions(precision=8, suppress=True)
np.random.seed(42)

def header(title):
    print("\n" + "=" * len(title))
    print(title)
    print("=" * len(title))

def check_true(name, cond):
    ok = bool(cond)
    print(f"{'PASS' if ok else 'FAIL'} - {name}")
    return ok

def check_close(name, got, expected, tol=1e-8):
    ok = np.allclose(got, expected, atol=tol, rtol=tol)
    print(f"{'PASS' if ok else 'FAIL'} - {name}: got {got}, expected {expected}")
    return ok

def check(name, got, expected, tol=1e-8):
    return check_close(name, got, expected, tol=tol)

def softmax(z, axis=-1, tau=1.0):
    z = np.asarray(z, dtype=float) / float(tau)
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def cosine_similarity(a, b):
    a = np.asarray(a, dtype=float); b = np.asarray(b, dtype=float)
    return float(a @ b / (la.norm(a) * la.norm(b) + 1e-12))

def numerical_rank(A, tol=1e-10):
    return int(np.sum(la.svd(A, compute_uv=False) > tol))

def orthonormal_basis(A, tol=1e-10):
    Q, R = la.qr(A)
    keep = np.abs(np.diag(R)) > tol
    return Q[:, keep]


# Compatibility helpers used by the Chapter 02 theory and exercise cells.
def null_space(A, tol=1e-10):
    A = np.asarray(A, dtype=float)
    U, S, Vt = la.svd(A, full_matrices=True)
    rank = int(np.sum(S > tol))
    return Vt[rank:].T

svd_null_space = null_space

def gram_schmidt(vectors, tol=1e-10):
    A = np.asarray(vectors, dtype=float)
    if A.ndim == 1:
        A = A.reshape(1, -1)
    basis = []
    for v in A:
        w = v.astype(float).copy()
        for q in basis:
            w = w - np.dot(w, q) * q
        norm = la.norm(w)
        if norm > tol:
            basis.append(w / norm)
    return np.array(basis)

def projection_matrix_from_columns(A, tol=1e-10):
    Q = orthonormal_basis(np.asarray(A, dtype=float), tol=tol)
    return Q @ Q.T


def random_unit_vectors(n, d):
    X = np.random.randn(n, d)
    return X / np.maximum(la.norm(X, axis=1, keepdims=True), 1e-12)

def pairwise_distances(X):
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return la.norm(diff, axis=-1)


def normalize(x, axis=None, tol=1e-12):
    x = np.asarray(x, dtype=float)
    norm = la.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, tol)

def frobenius_inner(A, B):
    return float(np.sum(np.asarray(A, dtype=float) * np.asarray(B, dtype=float)))

def outer_sum_product(A, B):
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return sum(np.outer(A[:, k], B[k, :]) for k in range(A.shape[1]))

def softmax_rows(X):
    return softmax(X, axis=1)

def col_space(A, tol=1e-10):
    return orthonormal_basis(np.asarray(A, dtype=float), tol=tol)

def row_space(A, tol=1e-10):
    return orthonormal_basis(np.asarray(A, dtype=float).T, tol=tol).T

def rref(A, tol=1e-10):
    R = np.array(A, dtype=float, copy=True)
    m, n = R.shape
    pivots = []
    row = 0
    for col in range(n):
        pivot = row + int(np.argmax(np.abs(R[row:, col]))) if row < m else row
        if row >= m or abs(R[pivot, col]) <= tol:
            continue
        if pivot != row:
            R[[row, pivot]] = R[[pivot, row]]
        R[row] = R[row] / R[row, col]
        for r in range(m):
            if r != row:
                R[r] = R[r] - R[r, col] * R[row]
        pivots.append(col)
        row += 1
        if row == m:
            break
    R[np.abs(R) < tol] = 0.0
    return R, pivots

def nullspace_basis(A, tol=1e-10):
    A = np.asarray(A, dtype=float)
    U, S, Vt = la.svd(A, full_matrices=True)
    rank = int(np.sum(S > tol))
    return Vt[rank:].T, rank

print("Chapter helper setup complete.")
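
As a quick sanity check of the SVD route the helpers above use, rank and nullity can be read straight off the singular values. The sketch below is self-contained (it does not reuse the helper cell) and runs on the matrix from Exercise 3:

```python
import numpy as np

# Rank-2 example: the third row is the sum of the first two.
A = np.array([[1., 2., 0., 1.],
              [0., 0., 1., 3.],
              [1., 2., 1., 4.]])
U, S, Vt = np.linalg.svd(A, full_matrices=True)
rank = int(np.sum(S > 1e-10))   # singular values above tolerance
N = Vt[rank:].T                 # remaining right singular vectors span null(A)
print(rank, N.shape[1])         # rank 2, nullity 2, and rank + nullity = 4
assert np.allclose(A @ N, 0)
```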

Exercise 1: Verifying Vector Space Axioms *

The eight axioms are the complete definition of a vector space. A single axiom failure disqualifies the entire structure. Working through non-examples is equally important — it builds the instinct for which axiom is the weakest link in a given candidate.

For each of the five candidates below, determine whether it is a vector space with the given operations. If it is a vector space, identify the zero vector and one additive inverse. If it is not, identify the specific axiom(s) that fail and supply a concrete counterexample.

(a) $V = \mathbb{R}^2$ with standard addition but modified scalar multiplication:

$\alpha \odot (u_1, u_2) = (\alpha u_1,\; u_2)$

(scalar multiplication only scales the first coordinate).

(b) $V = \mathbb{R}_{>0}$ (positive reals) with operations:

$u \oplus v = uv$ (multiplication as addition), $\quad \alpha \odot u = u^\alpha$ (exponentiation as scalar multiplication).

(c) $V = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 1\}$ with the standard operations inherited from $\mathbb{R}^3$.

(d) $V = \{f : \mathbb{R} \to \mathbb{R} : f(0) = 0\}$ with standard function addition and scalar multiplication.

(e) $V = \{f : \mathbb{R} \to \mathbb{R} : f(0) = 1\}$ with standard operations.

Approach: for each case, work through the three-condition test (zero vector, closure under $+$, closure under scalar multiplication) and check the eight axioms systematically. Use concrete vectors and scalars to confirm each axiom or break it with a counterexample.

Code cell 5

# Your Solution
# Exercise 1 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 1.")

Code cell 6

# Solution
# Exercise 1 - reference solution

header('Exercise 1(a): Modified scalar multiplication')
# Axiom 7: alpha*(u+v) = alpha*u + alpha*v?
# alpha*(u1+v1, u2+v2) = (alpha*(u1+v1), u2+v2)
# alpha*u + alpha*v = (alpha*u1, u2) + (alpha*v1, v2) = (alpha*(u1+v1), u2+v2) -- OK here
# Axiom 8: (alpha+beta)*v = alpha*v + beta*v?
# (alpha+beta)*(u1,u2) = ((alpha+beta)*u1, u2)
# alpha*(u1,u2) + beta*(u1,u2) = (alpha*u1, u2) + (beta*u1, u2) = ((alpha+beta)*u1, u2+u2) = ((alpha+beta)*u1, 2*u2)
# FAIL when u2 != 0: u2 != 2*u2
u1, u2 = 3.0, 5.0
alpha, beta = 2.0, 3.0
lhs = ((alpha + beta) * u1, u2)           # (alpha+beta) scalar mult
rhs = (alpha * u1 + beta * u1, u2 + u2)   # alpha*v + beta*v
print(f'(alpha+beta)*(u1,u2) = {lhs}')
print(f'alpha*(u1,u2) + beta*(u1,u2) = {rhs}')
check_true('Axiom 8 (vector distributivity) fails', lhs[1] != rhs[1])
print('VERDICT: NOT a vector space. Axiom 8 fails because the second coordinate is unaffected')
print('by scalar multiplication, so scaling and adding disagree.')

header('Exercise 1(b): (R_>0, multiplication, exponentiation)')
# "Zero" vector: e = 1 (since u*1 = u for all u > 0)
# Additive inverse: u^{-1} = 1/u (since u*(1/u) = 1 = e)
# Scalar mult identity: 1 . u = u^1 = u  (Axiom 5 OK)
# Axiom 8: (alpha+beta) . u = u^{alpha+beta}; alpha.u (+) beta.u = u^alpha * u^beta = u^{alpha+beta}  OK
# Axiom 7: alpha.(u (+) v) = (uv)^alpha; alpha.u (+) alpha.v = u^alpha * v^alpha = (uv)^alpha  OK
# Axiom 6: alpha.(beta.u) = alpha.(u^beta) = (u^beta)^alpha = u^{alpha*beta}; (alpha*beta).u = u^{alpha*beta}  OK
u, v, al, be = 4.0, 9.0, 2.0, 3.0
e = 1.0  # zero vector
check_close('Additive identity: u (+) 1 = u', u * e, u)
check_close('Additive inverse: u (+) u^-1 = 1', u * (1.0 / u), e)
check_close('Axiom 8: (al+be).u = al.u (+) be.u', u**(al + be), (u**al) * (u**be))
check_close('Axiom 7: al.(u (+) v) = al.u (+) al.v', (u * v)**al, (u**al) * (v**al))
check_close('Axiom 6: al.(be.u) = (al*be).u', (u**be)**al, u**(al * be))
check_close('Commutativity: u (+) v = v (+) u', u * v, v * u)
check_true('Zero vector is 1 (not 0)', e == 1.0)
print('VERDICT: IS a vector space. Every axiom holds; the structure is isomorphic to (R, +, *) via log/exp.')

header('Exercise 1(c): {x+y+z=1} with standard operations')
p = np.array([1.0, 0.0, 0.0])   # p[0]+p[1]+p[2] = 1
q = np.array([0.0, 1.0, 0.0])   # q[0]+q[1]+q[2] = 1
zero = np.array([0.0, 0.0, 0.0])
check_true('Zero vector in V? (0+0+0=1?)', np.isclose(zero.sum(), 1.0))
print(f'p + q = {p + q}, sum = {(p+q).sum()}  (should be 1 to stay in V, is 2)')
check_true('p+q NOT in V (sum=2 != 1)', not np.isclose((p + q).sum(), 1.0))
print('VERDICT: NOT a vector space. Does not contain the zero vector (0+0+0=0 != 1).')
print('Also not closed under addition (sum would be 2). It is an affine subspace.')

header('Exercise 1(d): {f: f(0)=0} with standard function ops')
# Represent functions by their values on a grid
x = np.linspace(-2, 2, 9)
f = np.sin(x)           # f(0) = sin(0) = 0  OK
g = x**3 - x            # g(0) = 0  OK
alpha = 3.7
check_close('f(0) = 0', f[4], 0.0)   # x[4] = 0
check_close('g(0) = 0', g[4], 0.0)
check_close('(f+g)(0) = 0', (f + g)[4], 0.0)
check_close('(alpha*f)(0) = 0', (alpha * f)[4], 0.0)
check_close('zero function in V: 0(0)=0', (0.0 * x)[4], 0.0)
print('VERDICT: IS a vector space (subspace of all functions R->R).')
print('All three conditions hold: zero function, closure under addition, closure under scaling.')

header('Exercise 1(e): {f: f(0)=1} with standard function ops')
f = 1 + 0 * x           # f(0) = 1
g = np.cos(x)            # g(0) = 1
zero_fn = 0 * x          # zero function: 0(0) = 0
check_true('Zero function NOT in V (0(0)=0 != 1)', not np.isclose(zero_fn[4], 1.0))
print(f'(f+g)(0) = {(f+g)[4]}  (should be 1 to stay in V, is 2)')
check_true('f+g NOT in V ((f+g)(0)=2 != 1)', not np.isclose((f + g)[4], 1.0))
print('VERDICT: NOT a vector space. Does not contain the zero function. Not closed under addition.')
print('This is an affine subspace (a coset of {f: f(0)=0}) — offset by the constant 1 at 0.')

print("Exercise 1 solution complete.")
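
The isomorphism claim in 1(b) can be checked directly: the logarithm carries the exotic operations to the standard ones, turning $\oplus$ into ordinary addition and $\odot$ into ordinary scaling (a minimal sketch):

```python
import numpy as np

# log: (R_>0, (+), .) -> (R, +, scalar mult) is linear and invertible (exp):
#   log(u (+) v) = log(uv)       = log u + log v
#   log(alpha . u) = log(u**alpha) = alpha * log u
u, v, alpha = 4.0, 9.0, 2.5
assert np.isclose(np.log(u * v), np.log(u) + np.log(v))
assert np.isclose(np.log(u ** alpha), alpha * np.log(u))
print("log/exp carries (R_>0, *, ^) to the standard vector space (R, +, scalar mult).")
```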

Exercise 2: Subspace Verification — Proof and Counterexample **

Knowing the three-condition test is not enough; you must also know when the test succeeds and why one condition can fail while others hold. For each subset below, determine whether it is a subspace of the given vector space. If it is a subspace: find a basis and state the dimension. If it is not: identify which condition fails and produce a concrete counterexample.

(a) $W = \{(x, y, z) \in \mathbb{R}^3 : 2x - y + 3z = 0\} \subseteq \mathbb{R}^3$.

(b) $W = \{(x, y) \in \mathbb{R}^2 : xy = 0\}$ (the union of the two coordinate axes) $\subseteq \mathbb{R}^2$.

(c) $W = \{A \in \mathbb{R}^{2 \times 2} : \operatorname{tr}(A) = 0\}$ (traceless matrices) $\subseteq \mathbb{R}^{2 \times 2}$.

(d) $W = \{p \in P_3 : p(1) = 0\}$ (polynomials of degree $\leq 3$ that vanish at $1$) $\subseteq P_3$.

(e) $W = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = z^2\}$ (the double cone) $\subseteq \mathbb{R}^3$.
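
The three-condition test can also be probed numerically before proving anything. The sketch below uses a hypothetical helper `looks_like_subspace` (not part of the notebook's helper cell): random sampling can falsify a closure condition, while passing is only evidence, never proof.

```python
import numpy as np

np.random.seed(0)

def looks_like_subspace(contains, sample, trials=200):
    """Numeric screen for the three-condition test (hypothetical helper).
    A single failing probe disproves the subspace claim."""
    dim = sample().shape[0]
    if not contains(np.zeros(dim)):
        return False                        # zero vector missing
    for _ in range(trials):
        u, v = sample(), sample()
        if not contains(u + v):
            return False                    # not closed under addition
        if not contains(np.random.randn() * u):
            return False                    # not closed under scaling
    return True

# (a) the plane 2x - y + 3z = 0 passes every probe
on_plane = lambda p: np.isclose(2*p[0] - p[1] + 3*p[2], 0.0)
def sample_plane():
    s, t = np.random.randn(2)
    return np.array([(s - 3*t) / 2, s, t])  # x chosen so the point lies on the plane

# (b) the union of the axes xy = 0 fails closure under addition
on_axes = lambda p: np.isclose(p[0] * p[1], 0.0)
def sample_axes():
    r = np.random.randn()
    return np.array([r, 0.0]) if np.random.rand() < 0.5 else np.array([0.0, r])

print(looks_like_subspace(on_plane, sample_plane))   # expect True
print(looks_like_subspace(on_axes, sample_axes))     # expect False
```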

Code cell 8

# Your Solution
# Exercise 2 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 2.")

Code cell 9

# Solution
# Exercise 2 - reference solution

header('Exercise 2(a): W = {2x - y + 3z = 0}')
# This is the null space of A = [2, -1, 3]. Always a subspace.
A2a = np.array([[2.0, -1.0, 3.0]])
R2a, pivots2a = rref(A2a)
# 3 variables, 1 pivot => 2 free variables => dim = 2
# Free vars: y=s (col1), z=t (col2); then x = (y - 3z)/2 = (s - 3t)/2
# Basis vectors:
b1 = np.array([0.5, 1.0, 0.0])   # s=1, t=0: x=1/2
b2 = np.array([-1.5, 0.0, 1.0])  # s=0, t=1: x=-3/2
check_close('b1 in W (2x-y+3z=0)', 2*b1[0] - b1[1] + 3*b1[2], 0.0)
check_close('b2 in W (2x-y+3z=0)', 2*b2[0] - b2[1] + 3*b2[2], 0.0)
check_close('b1+b2 in W', 2*(b1+b2)[0] - (b1+b2)[1] + 3*(b1+b2)[2], 0.0)
check_close('zero in W', 2*0 - 0 + 3*0, 0.0)
print(f'Basis: {b1}, {b2}')
print('VERDICT: IS a subspace (null space of [2,-1,3]). dim = 2.')

header('Exercise 2(b): W = {xy = 0} (coordinate axes)')
u2b = np.array([1.0, 0.0])   # on x-axis: xy = 0
v2b = np.array([0.0, 1.0])   # on y-axis: xy = 0
s2b = u2b + v2b              # sum = (1,1)
print(f'u = {u2b}, u[0]*u[1] = {u2b[0]*u2b[1]}')
print(f'v = {v2b}, v[0]*v[1] = {v2b[0]*v2b[1]}')
print(f'u+v = {s2b}, (u+v)[0]*(u+v)[1] = {s2b[0]*s2b[1]}')
check_true('u+v NOT in W (product != 0)', not np.isclose(s2b[0] * s2b[1], 0.0))
print('VERDICT: NOT a subspace. Closure under addition fails: (1,0)+(0,1)=(1,1), and 1*1=1≠0.')
print('The union of two subspaces is almost never a subspace.')

header('Exercise 2(c): W = {tr(A) = 0} in R^{2x2}')
# Basis: three linearly independent traceless matrices
E12 = np.array([[0.0, 1.0], [0.0, 0.0]])  # tr=0
E21 = np.array([[0.0, 0.0], [1.0, 0.0]])  # tr=0
D   = np.array([[1.0, 0.0], [0.0, -1.0]]) # tr=0
for name2c, M2c in [('E12', E12), ('E21', E21), ('D', D)]:
    check_close(f'tr({name2c}) = 0', np.trace(M2c), 0.0)
check_close('tr(E12 + E21) = 0', np.trace(E12 + E21), 0.0)
check_close('tr(3.7*D) = 0', np.trace(3.7 * D), 0.0)
# Linear independence: form matrix of flattened basis vectors
basis_matrix = np.vstack([E12.ravel(), E21.ravel(), D.ravel()])
check_true('Three basis matrices are linearly independent', np.linalg.matrix_rank(basis_matrix) == 3)
print('Basis: {E12, E21, D}  (three traceless 2x2 matrices)')
print('VERDICT: IS a subspace. dim = 3.  (Full R^{2x2} has dim 4; one trace constraint reduces by 1.)')

header('Exercise 2(d): W = {p in P3 : p(1) = 0}')
# p(1) = a0 + a1 + a2 + a3 = 0  =>  a0 = -a1 - a2 - a3
# Free variables: a1, a2, a3  => dim = 3
# Basis: set one free variable to 1, others to 0:
# a1=1: p(t) = -1 + t = t - 1      => coefficients [-1, 1, 0, 0]; p(1) = -1+1 = 0 OK
# a2=1: p(t) = -1 + t^2 = t^2 - 1  => coefficients [-1, 0, 1, 0]; p(1) = -1+1 = 0 OK
# a3=1: p(t) = -1 + t^3 = t^3 - 1  => coefficients [-1, 0, 0, 1]; p(1) = -1+1 = 0 OK
def poly_eval(coeffs, t):
    """Evaluate polynomial with coefficients [a0,a1,...] at t."""
    return sum(c * t**i for i, c in enumerate(coeffs))

b2d_1 = [-1.0,  1.0,  0.0,  0.0]   # t - 1
b2d_2 = [-1.0,  0.0,  1.0,  0.0]   # t^2 - 1
b2d_3 = [-1.0,  0.0,  0.0,  1.0]   # t^3 - 1
for name2d, b2d in [('t-1', b2d_1), ('t^2-1', b2d_2), ('t^3-1', b2d_3)]:
    check_close(f'p(1) = 0 for {name2d}', poly_eval(b2d, 1.0), 0.0)
# Closure: sum of two basis elements
s2d = [b2d_1[i] + b2d_2[i] for i in range(4)]
check_close('(b1+b2)(1) = 0', poly_eval(s2d, 1.0), 0.0)
# Independence: coefficient matrix has full row rank
basis_mat2d = np.array([b2d_1, b2d_2, b2d_3])
check_true('Three basis polynomials are independent', np.linalg.matrix_rank(basis_mat2d) == 3)
print('Basis: {t-1, t^2-1, t^3-1}  (three polynomials vanishing at t=1)')
print('VERDICT: IS a subspace. dim = 3.  (P3 has dim 4; one point-evaluation constraint reduces by 1.)')

header('Exercise 2(e): W = {x^2 + y^2 = z^2} (double cone)')
# On the cone: (1,0,1) has 1+0=1=1^2 OK; (0,1,1) has 0+1=1=1^2 OK
u2e = np.array([1.0, 0.0, 1.0])
v2e = np.array([0.0, 1.0, 1.0])
s2e = u2e + v2e  # = (1, 1, 2)
lhs = s2e[0]**2 + s2e[1]**2   # = 1 + 1 = 2
rhs = s2e[2]**2               # = 4
print(f'u = {u2e}: x^2+y^2 = {u2e[0]**2+u2e[1]**2}, z^2 = {u2e[2]**2}')
print(f'v = {v2e}: x^2+y^2 = {v2e[0]**2+v2e[1]**2}, z^2 = {v2e[2]**2}')
print(f'u+v = {s2e}: x^2+y^2 = {lhs}, z^2 = {rhs}')
check_true('u+v NOT on cone', not np.isclose(lhs, rhs))
print('VERDICT: NOT a subspace. Closure under addition fails: (1,0,1)+(0,1,1)=(1,1,2),')
print('and 1^2+1^2=2 != 4=2^2. The cone is a nonlinear surface, not a flat subspace.')

print("Exercise 2 solution complete.")

Exercise 3: The Four Fundamental Subspaces **

For every matrix $A \in \mathbb{R}^{m \times n}$ there are exactly four canonical subspaces. They come in two orthogonal pairs, and their dimensions satisfy the rank-nullity theorem.

Let

$A = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & 3 \\ 1 & 2 & 1 & 4 \end{pmatrix} \in \mathbb{R}^{3 \times 4}.$

(a) Row-reduce $A$ to RREF. Identify pivot columns and free columns.

(b) Find a basis for $\operatorname{null}(A)$. Verify via the rank-nullity theorem: $\operatorname{rank}(A) + \operatorname{nullity}(A) = 4$.

(c) Find a basis for $\operatorname{col}(A)$. (Use pivot columns of the original $A$, not RREF.)

(d) Find a basis for $\operatorname{row}(A)$. (Use the non-zero rows of RREF.)

(e) Find a basis for $\operatorname{null}(A^\top)$.

(f) Verify: $\dim(\operatorname{col}(A)) + \dim(\operatorname{null}(A^\top)) = 3$ and $\dim(\operatorname{row}(A)) + \dim(\operatorname{null}(A)) = 4$.

(g) Verify the orthogonality relation $\operatorname{row}(A) \perp \operatorname{null}(A)$: check that every basis vector of $\operatorname{row}(A)$ is orthogonal to every basis vector of $\operatorname{null}(A)$.
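
All four subspaces also fall out of a single SVD, which makes a good cross-check against the RREF route (a self-contained sketch; orthonormal bases rather than the pivot-column bases the exercise asks for):

```python
import numpy as np

A = np.array([[1., 2., 0., 1.],
              [0., 0., 1., 3.],
              [1., 2., 1., 4.]])
U, S, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(S > 1e-10))
col_space   = U[:, :r]    # orthonormal basis for col(A)
left_null   = U[:, r:]    # orthonormal basis for null(A^T)
row_space   = Vt[:r].T    # orthonormal basis for row(A)
null_space_ = Vt[r:].T    # orthonormal basis for null(A)
assert np.allclose(A @ null_space_, 0)
assert np.allclose(A.T @ left_null, 0)
print(r, null_space_.shape[1], left_null.shape[1])   # 2 2 1
```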

Code cell 11

# Your Solution
# Exercise 3 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 3.")

Code cell 12

# Solution
# Exercise 3 - reference solution

A3 = np.array([
    [1.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
    [1.0, 2.0, 1.0, 4.0]
])
m3, n3 = A3.shape

# (a) RREF
R3, pivots3 = rref(A3)
header('Exercise 3(a): RREF')
print('RREF(A) =\n', R3)
print('Pivot columns:', pivots3)
print('Free columns:', [c for c in range(n3) if c not in pivots3])
rank3 = len(pivots3)

# (b) Null space: free variables are columns 1 and 3 (0-indexed)
# From RREF: x1 = -2*x2 - x4, x3 = -3*x4
# Free variable x2=1, x4=0: null vector n1 = (-2, 1, 0, 0)
# Free variable x2=0, x4=1: null vector n2 = (-1, 0, -3, 1)
n3_1 = np.array([-2.0, 1.0,  0.0, 0.0])
n3_2 = np.array([-1.0, 0.0, -3.0, 1.0])
null_basis3 = np.column_stack([n3_1, n3_2])

header('Exercise 3(b): Null space of A')
print('Null space basis (columns):\n', null_basis3)
check_close('A @ n1 = 0', A3 @ n3_1, np.zeros(m3))
check_close('A @ n2 = 0', A3 @ n3_2, np.zeros(m3))
nullity3 = null_basis3.shape[1]
check_true('Rank-Nullity: rank + nullity = 4', rank3 + nullity3 == n3)

# (c) Column space: pivot columns of ORIGINAL A
col_basis3 = A3[:, pivots3]
header('Exercise 3(c): Column space of A')
print('Column space basis (pivot columns of A):\n', col_basis3)
check_true('col(A) basis has rank = rank(A)', np.linalg.matrix_rank(col_basis3) == rank3)

# (d) Row space: non-zero rows of RREF
row_basis3 = R3[~np.all(np.isclose(R3, 0.0), axis=1)]
header('Exercise 3(d): Row space of A')
print('Row space basis (non-zero rows of RREF):\n', row_basis3)
check_true('row(A) has correct dimension', row_basis3.shape[0] == rank3)

# (e) Left null space: null space of A^T
# A^T is 4x3; rank = 2; left null dim = 3 - 2 = 1
# Solve A^T y = 0 via RREF of A^T
Rt, pivots_t = rref(A3.T)
# From RREF of A^T: y3 is free; y1 = -y3, y2 = -y3, i.e. y = (-1, -1, 1) up to scale
# More directly via SVD:
left_null3, _ = nullspace_basis(A3.T)
header('Exercise 3(e): Left null space (null of A^T)')
print('Left null space basis:\n', left_null3)
check_close('A^T @ left_null = 0', A3.T @ left_null3, np.zeros((n3, left_null3.shape[1])))
left_nullity3 = left_null3.shape[1]

# (f) Dimension checks
header('Exercise 3(f): Dimension verification')
print(f'dim(col(A)) = {rank3},  dim(null(A^T)) = {left_nullity3},  sum = {rank3 + left_nullity3}')
print(f'dim(row(A)) = {rank3},  dim(null(A))   = {nullity3},         sum = {rank3 + nullity3}')
check_true('dim(col(A)) + dim(null(A^T)) = m = 3', rank3 + left_nullity3 == m3)
check_true('dim(row(A)) + dim(null(A))   = n = 4', rank3 + nullity3 == n3)

# (g) Orthogonality: row(A) perp null(A)
header('Exercise 3(g): Orthogonality row(A) perp null(A)')
inner_products = row_basis3 @ null_basis3
print('Row-space basis @ Null-space basis =\n', inner_products)
check_close('All inner products row_i . null_j = 0', inner_products, np.zeros_like(inner_products))
print('\nStrang\'s picture:')
print(f'  R^4 = row(A) [dim {rank3}] ⊕ null(A) [dim {nullity3}]  (orthogonal direct sum)')
print(f'  R^3 = col(A) [dim {rank3}] ⊕ null(A^T) [dim {left_nullity3}]  (orthogonal direct sum)')

print("Exercise 3 solution complete.")
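
The orthogonal direct sum in (g) means any $\mathbf{x} \in \mathbb{R}^4$ splits uniquely into a row-space part plus a null-space part; $A$ acts only on the former. A short self-contained sketch:

```python
import numpy as np

A = np.array([[1., 2., 0., 1.],
              [0., 0., 1., 3.],
              [1., 2., 1., 4.]])
U, S, Vt = np.linalg.svd(A)
r = int(np.sum(S > 1e-10))
P_row = Vt[:r].T @ Vt[:r]        # orthogonal projector onto row(A)
x = np.array([1., -2., 3., 0.5])
x_row, x_null = P_row @ x, x - P_row @ x
assert np.allclose(A @ x_null, 0)        # null-space component is killed by A
assert np.allclose(A @ x, A @ x_row)     # A sees only the row-space part
assert np.isclose(x_row @ x_null, 0)     # the two parts are orthogonal
print("x = x_row + x_null (orthogonal split of R^4)")
```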

Exercise 4: Span, Linear Independence, and Basis **

Span, independence, and basis are three complementary views of the same underlying geometry. A spanning set may be redundant; an independent set may not span; a basis is both, with no waste.

In $\mathbb{R}^4$, consider the vectors

$\mathbf{v}_1 = (1,2,0,1)^\top, \quad \mathbf{v}_2 = (0,1,1,2)^\top, \quad \mathbf{v}_3 = (1,3,1,3)^\top, \quad \mathbf{v}_4 = (2,1,-1,-1)^\top.$

(a) Determine whether $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ are linearly independent. If not, which vectors are dependent?

(b) Find the dimension of $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$.

(c) Identify a maximal subset that forms a basis for $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$.

(d) Express each dependent vector as a linear combination of the basis vectors.

(e) Is $\mathbf{w} = (3, 5, 1, 4)^\top$ in $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$? If yes, find its coordinates in the basis from (c). If no, explain why not.
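
Span membership in (e) can also be phrased as a least-squares problem: $\mathbf{w}$ lies in the span exactly when the least-squares residual is zero. A minimal sketch, equivalent to the rank-augmentation test:

```python
import numpy as np

V = np.column_stack([[1., 2., 0., 1.],
                     [0., 1., 1., 2.],
                     [1., 3., 1., 3.],
                     [2., 1., -1., -1.]])
w = np.array([3., 5., 1., 4.])
coef, res, rank, sv = np.linalg.lstsq(V, w, rcond=None)
residual = np.linalg.norm(V @ coef - w)   # zero iff w is in col(V) = span{v1..v4}
print(rank, np.isclose(residual, 0.0))    # rank 3; zero residual => w is in the span
```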

Code cell 14

# Your Solution
# Exercise 4 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 4.")

Code cell 15

# Solution
# Exercise 4 - reference solution

v1 = np.array([1.0, 2.0,  0.0,  1.0])
v2 = np.array([0.0, 1.0,  1.0,  2.0])
v3 = np.array([1.0, 3.0,  1.0,  3.0])
v4 = np.array([2.0, 1.0, -1.0, -1.0])
w  = np.array([3.0, 5.0,  1.0,  4.0])
V4 = np.column_stack([v1, v2, v3, v4])

# (a) and (b): row reduce [v1 | v2 | v3 | v4]
R4, pivots4 = rref(V4)
rank4 = len(pivots4)
free4 = [c for c in range(4) if c not in pivots4]

header('Exercise 4(a)(b): Independence and dimension')
print('RREF([v1|v2|v3|v4]) =\n', R4)
print('Pivot columns (0-indexed):', pivots4)
print('Free columns (0-indexed):', free4)
check_true('Vectors are linearly DEPENDENT (rank < 4)', rank4 < 4)
print(f'dim(span{{v1,v2,v3,v4}}) = {rank4}')

# (c) Basis = pivot columns of original V4
basis4 = V4[:, pivots4]
header('Exercise 4(c): Basis')
print('Basis (pivot columns of V):' )
for i, p in enumerate(pivots4):
    print(f'  b{i+1} = v{p+1} = {V4[:, p]}')
check_true('Basis has correct rank', np.linalg.matrix_rank(basis4) == rank4)

# (d) Express dependent vectors in terms of the pivot columns
# From RREF: column j is expressed in terms of pivot columns via the RREF entries
# For free column c: the c-th column of RREF gives the coefficients
header('Exercise 4(d): Dependent vectors as linear combinations')
for fc in free4:
    # RREF col fc gives pivot_coords such that v_{fc} = sum(pivot_coords[i] * v_{pivot[i]})
    pivot_coords = R4[range(rank4), fc]   # coefficients in front of pivot columns
    reconstruction = sum(pivot_coords[i] * V4[:, pivots4[i]] for i in range(rank4))
    print(f'v{fc+1} = ', end='')
    for i, p in enumerate(pivots4):
        print(f'{pivot_coords[i]:+.4g}*v{p+1} ', end='')
    print()
    check_close(f'v{fc+1} reconstructed correctly', reconstruction, V4[:, fc])

# (e) Is w in span?
header('Exercise 4(e): Is w = (3,5,1,4) in the span?')
# Augment basis with w and check if rank increases
Vaug = np.column_stack([basis4, w])
rank_aug = np.linalg.matrix_rank(Vaug)
is_in_span = (rank_aug == rank4)
check_true('w is in span{v1,...,v4}', is_in_span)
if is_in_span:
    # Solve basis4 @ coords = w using least squares (exact since w is in col(basis4))
    coords4, _, _, _ = np.linalg.lstsq(basis4, w, rcond=None)
    check_close('basis @ coords = w', basis4 @ coords4, w)
    print('Coordinates in basis {v1, v2}:', coords4)
    print(f'w = {coords4[0]:.4g}*v{pivots4[0]+1} + {coords4[1]:.4g}*v{pivots4[1]+1}')

print("Exercise 4 solution complete.")

Exercise 5: Gram-Schmidt Orthogonalisation and Projections **

Gram-Schmidt converts any independent set into an orthonormal basis for the same span. The resulting orthonormal basis makes projections trivial and is the foundation of QR factorisation.

Start from $\mathbf{v}_1 = (1,1,0)^\top$, $\mathbf{v}_2 = (1,0,1)^\top$, $\mathbf{v}_3 = (0,1,1)^\top$.

(a) Verify that $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ are linearly independent.

(b) Apply Gram-Schmidt to produce an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\}$.

(c) Verify: $\|\mathbf{q}_i\| = 1$ for each $i$, and $\langle \mathbf{q}_i, \mathbf{q}_j \rangle = 0$ for $i \neq j$.

(d) Express $\mathbf{w} = (1, 2, 3)^\top$ in the orthonormal basis; verify by reconstruction:

$\mathbf{w} = \langle \mathbf{w}, \mathbf{q}_1 \rangle \mathbf{q}_1 + \langle \mathbf{w}, \mathbf{q}_2 \rangle \mathbf{q}_2 + \langle \mathbf{w}, \mathbf{q}_3 \rangle \mathbf{q}_3.$

(e) Construct the orthogonal projection matrix $P$ onto $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\} = \operatorname{span}\{\mathbf{q}_1, \mathbf{q}_2\}$. Verify: $P^2 = P$ (idempotent) and $P^\top = P$ (symmetric). Compute $P\mathbf{w}$ and the residual $\mathbf{r} = \mathbf{w} - P\mathbf{w}$; verify that $\mathbf{r} \perp \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\}$.
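
NumPy's built-in QR factorisation produces the same orthonormal basis as Gram-Schmidt up to column signs, which makes it a convenient cross-check for (b). A sketch, fixing signs so the diagonal of $R$ is positive (the convention classical Gram-Schmidt yields):

```python
import numpy as np

V = np.column_stack([[1., 1., 0.], [1., 0., 1.], [0., 1., 1.]])
Q, R = np.linalg.qr(V)           # Q: orthonormal columns, R: upper triangular
signs = np.sign(np.diag(R))      # flip columns of Q (rows of R) where diag(R) < 0
Q, R = Q * signs, signs[:, None] * R
assert np.allclose(Q.T @ Q, np.eye(3))   # orthonormal basis for the same span
assert np.allclose(Q @ R, V)             # factorisation still reproduces V
print(np.round(Q, 4))
```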

Code cell 17

# Your Solution
# Exercise 5 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 5.")

Code cell 18

# Solution
# Exercise 5 - reference solution

v5_1 = np.array([1.0, 1.0, 0.0])
v5_2 = np.array([1.0, 0.0, 1.0])
v5_3 = np.array([0.0, 1.0, 1.0])
w5   = np.array([1.0, 2.0, 3.0])
V5   = np.column_stack([v5_1, v5_2, v5_3])

# (a) Linear independence: non-zero determinant
det5 = np.linalg.det(V5)
header('Exercise 5(a): Linear independence')
print(f'det([v1|v2|v3]) = {det5:.6f}')
check_true('det != 0 => linearly independent', abs(det5) > 1e-10)

# (b) Gram-Schmidt
def gram_schmidt(vecs):
    qs = []
    for v in vecs:
        u = v.copy()
        for q in qs:
            u = u - np.dot(u, q) * q   # subtract projection onto previously found q
        norm_u = np.linalg.norm(u)
        if norm_u < 1e-12:
            raise ValueError('Input vectors are linearly dependent.')
        qs.append(u / norm_u)
    return np.column_stack(qs)

Q5 = gram_schmidt([v5_1, v5_2, v5_3])

header('Exercise 5(b): Gram-Schmidt result')
print('Orthonormal basis Q (columns = q1, q2, q3):\n', Q5)

# (c) Orthonormality
header('Exercise 5(c): Orthonormality verification')
QtQ = Q5.T @ Q5
print('Q^T Q =\n', QtQ)
check_close('Q^T Q = I (orthonormal)', QtQ, np.eye(3))
for i in range(3):
    check_close(f'||q{i+1}|| = 1', np.linalg.norm(Q5[:, i]), 1.0)
for i in range(3):
    for j in range(i+1, 3):
        check_close(f'<q{i+1}, q{j+1}> = 0', np.dot(Q5[:, i], Q5[:, j]), 0.0)

# (d) Coordinates of w in orthonormal basis
header('Exercise 5(d): Coordinates of w = (1,2,3) in orthonormal basis')
coords5 = Q5.T @ w5   # coords[i] = <w, q_i>
print('Coordinates (= <w, q_i>):', coords5)
w5_reconstructed = Q5 @ coords5
check_close('Reconstruction Q @ coords = w', w5_reconstructed, w5)
print(f'w = {coords5[0]:.4f}*q1 + {coords5[1]:.4f}*q2 + {coords5[2]:.4f}*q3')

# (e) Projection onto span{v1, v2} = span{q1, q2}
header('Exercise 5(e): Projection onto span{v1, v2}')
Q2 = Q5[:, :2]           # first two orthonormal vectors
P5 = Q2 @ Q2.T           # projection matrix onto span{q1, q2}
print('Projection matrix P =\n', P5)
check_close('P^2 = P (idempotent)', P5 @ P5, P5)
check_close('P^T = P (symmetric)', P5.T, P5)
Pw5  = P5 @ w5
res5 = w5 - Pw5
print('P @ w =', Pw5)
print('residual r = w - P@w =', res5)
# Residual must be orthogonal to span{v1, v2}
check_close('r perp v1', np.dot(res5, v5_1), 0.0)
check_close('r perp v2', np.dot(res5, v5_2), 0.0)
check_close('r perp q1', np.dot(res5, Q5[:, 0]), 0.0)
check_close('r perp q2', np.dot(res5, Q5[:, 1]), 0.0)
print('\nConnection to QR: Q5 IS the Q factor; V5 = Q5 @ (Q5^T @ V5) = Q5 @ R.')
R_factor = Q5.T @ V5
check_close('V5 = Q5 @ R', Q5 @ R_factor, V5)
check_true('R is upper triangular', np.allclose(R_factor, np.triu(R_factor)))

print("Exercise 5 solution complete.")

Exercise 6: Subspace Operations — Sum, Intersection, and Direct Sum ***

Subspaces can be combined. The sum $W_1 + W_2$ is the smallest subspace containing both; the intersection $W_1 \cap W_2$ is the largest subspace contained in both. A direct sum $W_1 \oplus W_2$ additionally requires that the intersection is trivial.

Part A. In $\mathbb{R}^3$, let

$W_1 = \operatorname{span}\{\mathbf{e}_1, \mathbf{e}_2\}$ (the $xy$-plane), $\qquad W_2 = \operatorname{span}\{(1,1,0)^\top,\,(0,0,1)^\top\}.$

(a) Find a basis for $W_1 + W_2$. Is $W_1 + W_2 = \mathbb{R}^3$?

(b) Find $W_1 \cap W_2$. Give a basis or show it equals $\{\mathbf{0}\}$.

(c) Verify the dimension formula: $\dim(W_1 + W_2) = \dim(W_1) + \dim(W_2) - \dim(W_1 \cap W_2)$.

(d) Is $\mathbb{R}^3 = W_1 \oplus W_2$? If not, find a subspace $W_3$ with $\mathbb{R}^3 = W_1 \oplus W_3$.

Part B. In $\mathbb{R}^4$, let

$W_1 = \{\mathbf{x} : x_1 + x_2 = 0\}, \qquad W_2 = \{\mathbf{x} : x_3 + x_4 = 0\}.$

(e) Find $\dim(W_1)$, $\dim(W_2)$, $\dim(W_1 \cap W_2)$, and $\dim(W_1 + W_2)$. Verify the dimension formula.
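
When each subspace is the null space of a constraint row, as in Part B, the intersection is simply the null space of the stacked constraints, and every dimension in (e) can be read off numerically. A self-contained sketch using a small `nullity` helper (an assumption for this snippet, not one of the chapter helpers):

```python
import numpy as np

def nullity(A, tol=1e-10):
    """dim of null(A) = number of columns minus numerical rank."""
    A = np.atleast_2d(A)
    S = np.linalg.svd(A, compute_uv=False)
    return A.shape[1] - int(np.sum(S > tol))

a1 = np.array([[1., 1., 0., 0.]])   # W1 = null(a1): x1 + x2 = 0
a2 = np.array([[0., 0., 1., 1.]])   # W2 = null(a2): x3 + x4 = 0
dim_W1, dim_W2 = nullity(a1), nullity(a2)      # one constraint each => dim 3
dim_inter = nullity(np.vstack([a1, a2]))       # both constraints => dim 2
dim_sum = dim_W1 + dim_W2 - dim_inter          # dimension formula => 4 (all of R^4)
print(dim_W1, dim_W2, dim_inter, dim_sum)      # 3 3 2 4
```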

Code cell 20

# Your Solution
# Exercise 6 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 6.")

Code cell 21

# Solution
# Exercise 6 - reference solution

e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
e3 = np.array([0.0, 0.0, 1.0])
w6_a = np.array([1.0, 1.0, 0.0])
w6_b = np.array([0.0, 0.0, 1.0])
W1_basis = np.column_stack([e1, e2])
W2_basis = np.column_stack([w6_a, w6_b])

# (a) W1 + W2: pool all 4 generators; rank of the pool = dim(W1 + W2)
all_gens = np.column_stack([W1_basis, W2_basis])  # 3x4
R6, pivots6 = rref(all_gens)
sum_basis6 = all_gens[:, pivots6]
dim_sum6 = len(pivots6)

header('Exercise 6(a): W1 + W2')
print('Generators of W1+W2 (columns):\n', all_gens)
print('Pivot columns:', pivots6)
print('Basis for W1+W2 (pivot columns of generators):\n', sum_basis6)
check_true('W1 + W2 = R^3 (dim = 3)', dim_sum6 == 3)

# (b) W1 ∩ W2: solve alpha*e1 + beta*e2 = gamma*(1,1,0) + delta*(0,0,1)
# alpha = gamma, beta = gamma, 0 = delta => delta=0, gamma=any, alpha=beta=gamma
# Intersection = span{(1,1,0)} which is in W1 (it's a linear combo of e1,e2) AND in W2 (it's w6_a)
inter_vec6 = np.array([1.0, 1.0, 0.0])  # (1,1,0) = 1*e1 + 1*e2 = 1*w6_a + 0*w6_b

header('Exercise 6(b): W1 ∩ W2')
check_true('(1,1,0) in W1 (is linear combo of e1,e2)', np.allclose(1*e1 + 1*e2, inter_vec6))
check_true('(1,1,0) in W2 (is w6_a itself)', np.allclose(inter_vec6, w6_a))
inter_basis6 = inter_vec6.reshape(-1, 1)
dim_inter6 = 1
print(f'W1 ∩ W2 = span{{(1,1,0)}}, dim = {dim_inter6}')

# (c) Dimension formula
dim_W1 = W1_basis.shape[1]   # = 2
dim_W2 = W2_basis.shape[1]   # = 2
header('Exercise 6(c): Dimension formula')
print(f'dim(W1) = {dim_W1},  dim(W2) = {dim_W2},  dim(W1∩W2) = {dim_inter6},  dim(W1+W2) = {dim_sum6}')
check_true('dim formula: dim(W1+W2) = dim(W1)+dim(W2)-dim(W1∩W2)',
           dim_sum6 == dim_W1 + dim_W2 - dim_inter6)

# (d) Direct sum?  W1 ∩ W2 = span{(1,1,0)} ≠ {0}, so NOT a direct sum.
# Find W3 with R^3 = W1 ⊕ W3: take W3 = span{e3} = z-axis
W3_basis = e3.reshape(-1, 1)
header('Exercise 6(d): Direct sum?')
check_true('W1 ∩ W2 ≠ {0} => NOT a direct sum', dim_inter6 > 0)
print('W1 ⊕ W3 with W3 = span{e3}:')
W1_plus_W3 = np.column_stack([W1_basis, W3_basis])
check_true('W1 + W3 = R^3 (rank 3)', np.linalg.matrix_rank(W1_plus_W3) == 3)
# W1 ∩ W3: vectors in xy-plane AND on z-axis => must be zero
print('W1 ∩ W3 = {0} (xy-plane meets z-axis only at origin)')
print('=> R^3 = W1 ⊕ W3 = span{e1,e2} ⊕ span{e3}  (standard decomposition)')

# Part B
header('Exercise 6(e): Part B in R^4')
A6_w1 = np.array([[1.0, 1.0, 0.0, 0.0]])
A6_w2 = np.array([[0.0, 0.0, 1.0, 1.0]])
# W1 = null([1,1,0,0]): x1+x2=0, 3 free vars (x2,x3,x4) => dim 3
# W2 = null([0,0,1,1]): x3+x4=0, 3 free vars (x1,x2,x4) => dim 3
_, rank_w1 = nullspace_basis(A6_w1)
_, rank_w2 = nullspace_basis(A6_w2)
dim_w1_B = 4 - rank_w1   # = 3
dim_w2_B = 4 - rank_w2   # = 3
# W1 ∩ W2 = null([1,1,0,0; 0,0,1,1]): two constraints on R^4 => dim = 4-2 = 2
A6_both = np.vstack([A6_w1, A6_w2])
_, rank_constraint = nullspace_basis(A6_both)  # rank of the combined constraint matrix
dim_inter_B = 4 - rank_constraint   # = 2
# W1 + W2: by dimension formula
dim_sum_B = dim_w1_B + dim_w2_B - dim_inter_B
print(f'dim(W1) = {dim_w1_B}')
print(f'dim(W2) = {dim_w2_B}')
print(f'dim(W1∩W2) = {dim_inter_B}')
print(f'dim(W1+W2) = {dim_sum_B} (by formula)')
# Verify: W1+W2 = R^4 iff dim = 4
check_true('Dimension formula holds', dim_sum_B == dim_w1_B + dim_w2_B - dim_inter_B)
check_true('W1 + W2 = R^4', dim_sum_B == 4)
print('Basis for W1∩W2: {(-1,1,0,0)^T, (0,0,-1,1)^T} (both satisfy x1+x2=0 and x3+x4=0). Verify:')
inter_b1 = np.array([-1.0, 1.0, 0.0,  0.0])  # x1+x2=0 and x3+x4=0
inter_b2 = np.array([ 0.0, 0.0,-1.0,  1.0])
check_close('inter_b1 in W1', A6_w1 @ inter_b1, np.zeros(1))
check_close('inter_b1 in W2', A6_w2 @ inter_b1, np.zeros(1))
check_close('inter_b2 in W1', A6_w1 @ inter_b2, np.zeros(1))
check_close('inter_b2 in W2', A6_w2 @ inter_b2, np.zeros(1))

print("Exercise 6 solution complete.")

Exercise 7 (★★★): Change of Basis and Coordinates

A change of basis is a relabelling of the coordinate system. The same vector looks different in different bases; the same linear map has different matrices in different bases. Mastering this is prerequisite to understanding PCA, attention heads, and representation geometry.
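The whole translation pipeline fits in a few lines. A hedged sketch with made-up bases (not the $B$ and $C$ of this exercise); note that `np.linalg.solve` is preferred over forming an explicit inverse:

```python
import numpy as np

P_B = np.array([[1.0, 0.0], [1.0, 1.0]])   # columns: basis B (made up)
P_C = np.array([[2.0, 1.0], [0.0, 1.0]])   # columns: basis C (made up)
v_B = np.array([3.0, -1.0])                # coordinates of v in basis B

v_std = P_B @ v_B                          # standard coordinates of v
v_C = np.linalg.solve(P_C, v_std)          # coordinates of v in basis C
assert np.allclose(P_C @ v_C, v_std)       # same vector, relabelled
```

Composing the two steps gives the change-of-basis matrix $P_{B \to C} = P_C^{-1} P_B$ directly.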

In $\mathbb{R}^2$, let

$$B = \{\mathbf{b}_1 = (1,2)^\top, \; \mathbf{b}_2 = (1,-1)^\top\}, \qquad C = \{\mathbf{c}_1 = (2,1)^\top, \; \mathbf{c}_2 = (-1,1)^\top\}.$$

(a) Verify that both $B$ and $C$ are bases for $\mathbb{R}^2$ (show that each is a linearly independent spanning set).

(b) Find the change-of-basis matrix $P_{B \to C}$ (its $j$-th column is the coordinate vector of $\mathbf{b}_j$ in basis $C$).

(c) A vector has coordinates $[\mathbf{v}]_B = (3, -1)^\top$ in basis $B$. Find $[\mathbf{v}]_C$.

(d) Compute $\mathbf{v}$ explicitly (its standard-basis coordinates). Verify that your answer to (c) gives the correct representation.

(e) A linear map $T : \mathbb{R}^2 \to \mathbb{R}^2$ has matrix

$$M_{\text{std}} = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$$

in the standard basis. Find its matrix $M_B$ in basis $B$. Verify that $M_{\text{std}}$ and $M_B$ have the same eigenvalues (they are similar matrices).

Code cell 23

# Your Solution
# Exercise 7 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 7.")

Code cell 24

# Solution
# Exercise 7 - reference solution

b1 = np.array([1.0,  2.0])
b2 = np.array([1.0, -1.0])
c1 = np.array([2.0,  1.0])
c2 = np.array([-1.0, 1.0])
P_B = np.column_stack([b1, b2])
P_C = np.column_stack([c1, c2])

# (a) Both bases: non-zero determinant <=> invertible <=> spans R^2 and is independent
det_B = np.linalg.det(P_B)
det_C = np.linalg.det(P_C)
header('Exercise 7(a): Verify bases')
print(f'det(P_B) = {det_B:.4f}')
print(f'det(P_C) = {det_C:.4f}')
check_true('B is a basis (det != 0)', abs(det_B) > 1e-10)
check_true('C is a basis (det != 0)', abs(det_C) > 1e-10)

# (b) P_{B->C}: j-th column = coords of b_j in basis C
# b_j (standard) = P_C @ [b_j]_C  =>  [b_j]_C = P_C^{-1} @ b_j
# P_{B->C} = P_C^{-1} @ P_B
P_BtoC = np.linalg.inv(P_C) @ P_B
header('Exercise 7(b): Change-of-basis matrix P_{B->C}')
print('P_{B->C} =\n', P_BtoC)
# Verify: P_C @ P_{B->C} @ [v]_B = v (standard), which means P_{B->C} correctly translates
# Check: P_C @ P_{B->C} = P_B
check_close('P_C @ P_{B->C} = P_B (consistency check)', P_C @ P_BtoC, P_B)

# (c) [v]_C = P_{B->C} @ [v]_B
v_in_B = np.array([3.0, -1.0])
v_in_C = P_BtoC @ v_in_B
header('Exercise 7(c): Coordinates of v in basis C')
print(f'[v]_B = {v_in_B}')
print(f'[v]_C = P_{{B->C}} @ [v]_B = {v_in_C}')

# (d) v in standard coords: v = P_B @ [v]_B
v_std = P_B @ v_in_B
header('Exercise 7(d): v in standard coordinates')
print(f'v (standard) = P_B @ [v]_B = {v_std}')
# Verify: P_C @ [v]_C should also give v_std
check_close('P_C @ [v]_C = v_std', P_C @ v_in_C, v_std)
# Verify [v]_C directly: inverse of C applied to v_std
check_close('[v]_C = P_C^{-1} @ v_std', np.linalg.inv(P_C) @ v_std, v_in_C)

# (e) Matrix of T in basis B: similarity transformation
# M_B = P_B^{-1} @ M_std @ P_B
M_std = np.array([[2.0, 1.0], [0.0, 3.0]])
P_B_inv = np.linalg.inv(P_B)
M_B = P_B_inv @ M_std @ P_B
header('Exercise 7(e): T in basis B')
print('M_std =\n', M_std)
print('M_B = P_B^{-1} M_std P_B =\n', M_B)
eigvals_std = np.sort(np.linalg.eigvals(M_std).real)
eigvals_B   = np.sort(np.linalg.eigvals(M_B).real)
print(f'Eigenvalues of M_std: {eigvals_std}')
print(f'Eigenvalues of M_B:   {eigvals_B}')
check_close('Similar matrices have the same eigenvalues', eigvals_std, eigvals_B)
check_close('M_std and M_B have the same trace', np.trace(M_std), np.trace(M_B))
check_close('M_std and M_B have the same determinant', np.linalg.det(M_std), np.linalg.det(M_B))
print('\nKey insight: The linear map T is the same object regardless of basis.')
print('Only its matrix representation changes; eigenvalues and determinant are intrinsic.')

print("Exercise 7 solution complete.")

Exercise 8 (★★★): AI Application — Subspace Analysis of a Linear Layer

Every linear layer $W \in \mathbb{R}^{m \times n}$ induces four fundamental subspaces. Understanding these subspaces tells you exactly what information a layer can and cannot transmit. LoRA exploits this structure by restricting weight updates to a low-dimensional subspace of $\mathbb{R}^{m \times n}$.
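All four subspaces drop out of one SVD. A sketch under the usual tolerance-based rank convention (`four_subspaces` is a hypothetical helper, not part of the exercise):

```python
import numpy as np

def four_subspaces(A, tol=1e-10):
    """Orthonormal bases for the four fundamental subspaces of A."""
    U, s, Vt = np.linalg.svd(A)              # full_matrices=True by default
    r = int(np.sum(s > tol))                 # numerical rank
    return {"col": U[:, :r],                 # column space
            "left_null": U[:, r:],           # null(A^T)
            "row": Vt[:r].T,                 # row space
            "null": Vt[r:].T}                # null space

A = np.array([[1.0, 2.0], [2.0, 4.0]])       # rank 1
S = four_subspaces(A)
assert np.allclose(A @ S["null"], 0)         # A kills its null space
assert np.allclose(A.T @ S["left_null"], 0)  # A^T kills the left null space
```

The reference solution below uses the same left-singular-vector basis for $\operatorname{col}(W)$.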

Let

$$W = \begin{pmatrix} 3 & 1 & 2 \\ 1 & 2 & 1 \\ 2 & 1 & 3 \end{pmatrix}.$$

(a) Compute the SVD $W = U \Sigma V^\top$. What is $\operatorname{rank}(W)$? Report the singular values.

(b) Identify $\operatorname{col}(W)$ — the reachable output subspace. What is its dimension? If a downstream vector $\mathbf{b} = (1, 0, 0)^\top$ is the target, is it reachable? What about $\mathbf{b}' = (1, -1, 0)^\top$? (Verify computationally.)

(c) Identify $\operatorname{null}(W)$ — the input directions the layer ignores. What is its dimension? If the null space is non-trivial, give an explicit null vector and verify $W\mathbf{n} = \mathbf{0}$. If the null space is trivial, explain why.

(d) LoRA adapter: let $\Delta W = \mathbf{b}\mathbf{a}^\top$ with $\mathbf{b} = (1, 0, 1)^\top$ and $\mathbf{a} = (1, 1, 0)^\top$. What is $\operatorname{col}(\Delta W)$? What is $\operatorname{rank}(\Delta W)$? Verify $\Delta W = \mathbf{b}\mathbf{a}^\top$ numerically.

(e) Updated weight $W' = W + \Delta W$. Without computing the full matrix, what is the upper bound on $\operatorname{rank}(W')$? Under what condition would $\operatorname{rank}(W') = \operatorname{rank}(W)$? Under what condition would $\operatorname{rank}(W') = \operatorname{rank}(W) + 1$? Verify your answer by computing $\operatorname{rank}(W')$ directly.

Code cell 26

# Your Solution
# Exercise 8 - learner workspace
# Write your solution here, then run the reference solution below to compare.
print("Learner workspace ready for Exercise 8.")

Code cell 27

# Solution
# Exercise 8 - reference solution

W8 = np.array([
    [3.0, 1.0, 2.0],
    [1.0, 2.0, 1.0],
    [2.0, 1.0, 3.0]
])

# (a) SVD and rank
U8, s8, Vt8 = np.linalg.svd(W8)
rank8 = np.linalg.matrix_rank(W8)

header('Exercise 8(a): SVD and rank')
print(f'rank(W) = {rank8}')
print(f'Singular values: {s8}')
print(f'U (left singular vectors):\n{U8}')
print(f'V^T (right singular vectors):\n{Vt8}')
check_close('SVD reconstruction: U @ diag(s) @ Vt = W', U8 @ np.diag(s8) @ Vt8, W8)

# (b) Column space
b_target  = np.array([1.0,  0.0, 0.0])
b_target2 = np.array([1.0, -1.0, 0.0])

def in_col_space(A, b, tol=1e-8):
    """True if b is in the column space of A (check via rank augmentation)."""
    r_A = np.linalg.matrix_rank(A)
    r_aug = np.linalg.matrix_rank(np.column_stack([A, b]))
    return r_A == r_aug

header('Exercise 8(b): Column space and reachability')
print(f'col(W) has dimension {rank8} (= rank(W))')
print(f'Basis for col(W): left singular vectors u1,...,u_{rank8} of W')
col_basis8 = U8[:, :rank8]
print('col(W) basis (columns):\n', col_basis8)

b1_reachable = in_col_space(W8, b_target)
b2_reachable = in_col_space(W8, b_target2)
print(f'b = {b_target}: reachable = {b1_reachable}')
print(f'b\' = {b_target2}: reachable = {b2_reachable}')
check_true('b = (1,0,0) is in col(W)', b1_reachable)
check_true('b\' = (1,-1,0) is in col(W)', b2_reachable)
print('W is square with full rank (rank 3 = m = n), so col(W) = R^3: every target b is reachable.')

# (c) Null space
null8, rank8_check = nullspace_basis(W8)
nullity8 = null8.shape[1] if null8.size > 0 else 0

header('Exercise 8(c): Null space')
print(f'rank(W) = {rank8_check},  nullity = n - rank = {W8.shape[1] - rank8_check}')
print(f'null(W) has dimension {nullity8}')
if nullity8 == 0:
    print('null(W) = {0}: W is injective (full column rank).')
    print('Every distinct input produces a distinct output. No information is lost.')
    print('rank(W) = n = 3 => W is injective => null(W) is trivial.')
else:
    print('null(W) basis:\n', null8)
    check_close('W @ null_vec = 0', W8 @ null8[:, 0], np.zeros(W8.shape[0]))

# (d) LoRA adapter
b8 = np.array([1.0, 0.0, 1.0])
a8 = np.array([1.0, 1.0, 0.0])
DeltaW8 = np.outer(b8, a8)  # b @ a^T (outer product)
rank_DW8 = np.linalg.matrix_rank(DeltaW8)

header('Exercise 8(d): LoRA adapter DeltaW = b @ a^T')
print('b =', b8)
print('a =', a8)
print('DeltaW = b @ a^T =\n', DeltaW8)
print(f'rank(DeltaW) = {rank_DW8}')
check_true('DeltaW has rank 1 (outer product of nonzero vectors)', rank_DW8 == 1)
# col(DeltaW) = span{b}
# Any vector in col(DeltaW) is of the form DeltaW @ x = (a^T x) b, which is a scalar multiple of b
x_test = np.array([2.0, -1.0, 3.0])
Dw_x = DeltaW8 @ x_test
scalar = np.dot(a8, x_test)
check_close('DeltaW @ x = (a^T x) * b', Dw_x, scalar * b8)
print(f'col(DeltaW) = span{{b}} = span{{(1,0,1)}}; dimension = 1')
print(f'null(DeltaW) = null(a^T): any x with a^T x = 0; dimension = 2')

# (e) Rank of W' = W + DeltaW
Wprime8 = W8 + DeltaW8
rank_Wprime8 = np.linalg.matrix_rank(Wprime8)

header('Exercise 8(e): rank(W\' = W + DeltaW)')
print(f'rank(W) = {rank8}')
print(f'rank(DeltaW) = {rank_DW8}')
print(f'rank(W\') = {rank_Wprime8}')
print(f'Upper bound: rank(W + DeltaW) <= rank(W) + rank(DeltaW) = {rank8 + rank_DW8}')
check_true('rank(W\') <= rank(W) + rank(DeltaW)', rank_Wprime8 <= rank8 + rank_DW8)
print()
print('Interpretation:')
print(f'  rank(W\') = rank(W) = {rank8}: the LoRA update does NOT increase rank.')
print('  (W already has full rank 3 = m = n; rank cannot exceed 3 in R^{3x3}.)')
print()
print('General rule for LoRA (when rank(W) < n):')
print('  rank(W\') = rank(W):     if col(DeltaW) = span{b} is already inside col(W)')
print('  rank(W\') = rank(W) + 1: if b is NOT in col(W) (LoRA adds a genuinely new direction)')
print('  Here rank(W)=3=m, so col(W)=R^3 contains everything; the adapter cannot increase rank.')

print()
print('AI Takeaway:')
print('  col(W) = the reachable output subspace; rank-deficient W has dead output dimensions.')
print('  null(W) = the ignored input subspace; inputs in null(W) are invisible to this layer.')
print('  LoRA constrains updates to a rank-r subspace of R^{mxn}; the LoRA rank r is the')
print('  number of new directions the adapter can contribute to the column space of W.')

print("Exercise 8 solution complete.")

Exercise 9 (★★★): Tangent Subspace of the Probability Simplex

The probability simplex constraint $\sum_i p_i = 1$ has tangent subspace

$$T = \{v \in \mathbb{R}^n : \mathbf{1}^\top v = 0\}.$$

Find a basis for this tangent subspace and verify every basis vector sums to zero.
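One practical consequence worth knowing: because $T$ is the orthogonal complement of $\operatorname{span}\{\mathbf{1}\}$, orthogonal projection onto the tangent subspace is just mean-centering. A small sketch (the gradient values are made up):

```python
import numpy as np

g = np.array([0.2, -0.5, 1.0, 0.3])           # e.g. a raw gradient
g_tan = g - g.mean()                          # projection onto {v : 1^T v = 0}
assert abs(g_tan.sum()) < 1e-12               # now lies in the tangent subspace
assert np.isclose((g - g_tan) @ g_tan, 0.0)   # residual ⊥ projection
```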

Code cell 29

# Your Solution
# Exercise 9 - learner workspace
# Build a basis for vectors whose coordinates sum to zero.
print("Learner workspace ready for Exercise 9.")

Code cell 30

# Solution
# Exercise 9 - simplex tangent subspace
header("Exercise 9: simplex tangent subspace")

n = 4
B = np.vstack([np.eye(n-1), -np.ones(n-1)])
print("basis columns:\n", B)
check_close("each basis vector sums to zero", np.ones(n) @ B, np.zeros(n-1), tol=1e-12)
check_true("dimension is n-1", la.matrix_rank(B) == n-1)
print("Takeaway: constrained probability updates live in a lower-dimensional linear subspace.")

Exercise 10 (★★★): Affine Solution Set of an Underdetermined System

For $Ax = b$ with infinitely many solutions, write

$$x = x_0 + z, \qquad z \in \operatorname{null}(A).$$

Find one solution, a nullspace direction, and verify several affine solutions.

Code cell 32

# Your Solution
# Exercise 10 - learner workspace
# Find a particular solution plus nullspace direction.
print("Learner workspace ready for Exercise 10.")

Code cell 33

# Solution
# Exercise 10 - affine solution set
header("Exercise 10: affine solution set")

A = np.array([[1.0, 2.0, -1.0], [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])
x0 = la.lstsq(A, b, rcond=None)[0]  # minimum-norm particular solution
U, S, Vt = la.svd(A)
z = Vt[-1]
print("particular solution:", x0)
print("null direction:", z)
check_close("A x0 = b", A @ x0, b, tol=1e-10)
check_close("A z = 0", A @ z, np.zeros(A.shape[0]), tol=1e-10)
for t in [-2.0, 0.0, 3.5]:
    check_close(f"solution at t={t}", A @ (x0 + t*z), b, tol=1e-10)
print("Takeaway: underdetermined systems are affine spaces: one solution plus all null directions.")

What to Review After Finishing

Axioms and structure

  • Can you identify which axiom fails in a non-example, and produce a concrete counterexample?
  • Do you know why the three-condition test (zero vector, closure under $+$, closure under scalar multiplication) is sufficient to check the full eight axioms?
  • Can you recognise affine subspaces — sets that look like subspaces but are offset from the origin?
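A numeric spot-check (not a proof) of the three-condition test for the candidate set $\{x \in \mathbb{R}^3 : x_1 + x_2 = 0\}$ can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)
in_S = lambda x: abs(x[0] + x[1]) < 1e-9   # membership in {x : x1 + x2 = 0}

def sample():
    x = rng.normal(size=3)
    x[1] = -x[0]                           # force membership
    return x

assert in_S(np.zeros(3))                   # contains the zero vector
u, v, c = sample(), sample(), rng.normal()
assert in_S(u + v) and in_S(c * u)         # closed under + and scalar mult
```

Random checks can only falsify; the eight axioms themselves need the algebraic argument.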

Span, independence, and basis

  • Can you compute the dimension of a span via row reduction?
  • Can you extract a basis from a spanning set by identifying pivot columns?
  • Can you express dependent vectors as linear combinations of the basis and verify the reconstruction?
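In floating point, column-pivoted QR is a numerically stabler way than RREF to pick an independent spanning subset (the chosen columns may differ from the RREF pivot columns, but they still form a basis of the column space). A sketch:

```python
import numpy as np
import scipy.linalg as sla

V = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 4.0]])        # col3 = col1 + col2, so rank 2
r = np.linalg.matrix_rank(V)
_, _, piv = sla.qr(V, pivoting=True)   # permutation orders columns by pivot size
basis = V[:, np.sort(piv[:r])]         # independent subset spanning col(V)
assert basis.shape[1] == r == np.linalg.matrix_rank(basis)
```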

Four fundamental subspaces

  • Can you compute all four subspaces (col, null, row, left null) of a matrix from RREF?
  • Can you verify the orthogonality $\operatorname{row}(A) \perp \operatorname{null}(A)$ and $\operatorname{col}(A) \perp \operatorname{null}(A^\top)$?
  • Do you understand Strang's picture: $A$ is a bijection from $\operatorname{row}(A)$ to $\operatorname{col}(A)$, and $A$ kills $\operatorname{null}(A)$?
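Strang's picture can be spot-checked numerically: split the input space into row space and null space via SVD, then confirm $A$ is injective on the former and zero on the latter. A sketch:

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])                   # 2x3, rank 2
U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 1e-10))
row_basis, null_basis = Vt[:r].T, Vt[r:].T
assert np.allclose(row_basis.T @ null_basis, 0)   # row(A) ⊥ null(A)
assert np.linalg.matrix_rank(A @ row_basis) == r  # injective on row(A)
assert np.allclose(A @ null_basis, 0)             # kills null(A)
```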

Gram-Schmidt and projections

  • Can you execute Gram-Schmidt step by step and verify orthonormality?
  • Can you construct a projection matrix $P = QQ^\top$ and verify $P^2 = P$ and $P^\top = P$?
  • Can you explain why the residual $\mathbf{w} - P\mathbf{w}$ is orthogonal to the subspace?
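The projection checklist above can be sketched in a few lines (random subspace; a generic Gaussian sample has full column rank with probability 1):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 2))        # generic => full column rank
Q, _ = np.linalg.qr(A)             # orthonormal basis for col(A)
P = Q @ Q.T                        # projection onto col(A)
assert np.allclose(P @ P, P)       # idempotent: P^2 = P
assert np.allclose(P, P.T)         # symmetric: P^T = P
w = rng.normal(size=5)
resid = w - P @ w
assert np.allclose(Q.T @ resid, 0) # residual ⊥ subspace
```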

Subspace operations and basis change

  • Can you compute $\dim(W_1 + W_2)$ and $\dim(W_1 \cap W_2)$ using the dimension formula?
  • Do you know when $V = W_1 \oplus W_2$ (direct sum) vs when it fails?
  • Can you find the change-of-basis matrix $P_{B \to C}$ and convert coordinates between bases?
  • Can you find the matrix of a linear map in a new basis via the similarity transform $M_B = P_B^{-1} M_{\text{std}} P_B$?

AI connections

  • Can you interpret $\operatorname{col}(W)$, $\operatorname{null}(W)$, $\operatorname{row}(W)$, $\operatorname{null}(W^\top)$ for a neural network weight matrix?
  • Can you explain LoRA in terms of restricting the weight update to a rank-$r$ subspace of $\mathbb{R}^{m \times n}$?
  • Do you understand why the rank of a LoRA update $\Delta W = BA$ (with $B \in \mathbb{R}^{m \times r}$, $A \in \mathbb{R}^{r \times n}$) is at most $r$, and when it is strictly less?
