
Vector Spaces and Subspaces: Part 15: Exercises to Conceptual Bridge

15. Exercises

Exercise 1: Verifying Vector Space Axioms

For each of the following, determine whether it is a vector space with the given operations. If not, identify which axiom(s) fail. For each valid vector space, identify the zero vector and describe the additive inverses.

(a) $V = \mathbb{R}^2$, addition: $(u_1, u_2) + (v_1, v_2) = (u_1 + v_1,\ u_2 + v_2)$, scalar multiplication: $\alpha(u_1, u_2) = (\alpha u_1,\ u_2)$ - scalar multiplication only affects the first coordinate.

(b) $V = \mathbb{R}_{+} = \{x \in \mathbb{R} : x > 0\}$, addition defined as $u \oplus v = uv$ (multiplication of positive reals), scalar multiplication $\alpha \odot u = u^\alpha$.

(c) $V = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 1\}$ with the standard addition and scalar multiplication inherited from $\mathbb{R}^3$.

(d) $V = \{f : \mathbb{R} \to \mathbb{R} : f(0) = 0\}$ with standard pointwise addition $(f+g)(t) = f(t) + g(t)$ and scalar multiplication $(\alpha f)(t) = \alpha f(t)$.

(e) $V = \{f : \mathbb{R} \to \mathbb{R} : f(0) = 1\}$ with the same standard operations.

Hints: For (b), verify all eight axioms carefully with the non-standard operations; the zero element of the vector space (if it exists) is the identity for $\oplus$, not the number 0. For (d) and (e), think about which evaluation constraint is compatible with the zero function. A numerical sanity check for (b) appears below.
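
As a numerical sanity check for (b) (not a substitute for the axiom-by-axiom proof), the sketch below samples positive reals and scalars and spot-checks several axioms for $\oplus$ and $\odot$, with $1$ playing the role of the zero vector and $1/u$ the additive inverse. Plain Python with NumPy; the helper names are illustrative, not from the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)

def oplus(u, v):      # vector "addition" on R_+ : u ⊕ v = uv
    return u * v

def odot(a, u):       # scalar "multiplication": a ⊙ u = u**a
    return u ** a

u, v = rng.uniform(0.1, 10, size=2)
a, b = rng.normal(size=2)

checks = {
    "commutativity":        np.isclose(oplus(u, v), oplus(v, u)),
    "zero element is 1":    np.isclose(oplus(u, 1.0), u),
    "inverse is 1/u":       np.isclose(oplus(u, 1.0 / u), 1.0),
    "distributivity (a+b)": np.isclose(odot(a + b, u), oplus(odot(a, u), odot(b, u))),
    "compatibility (ab)":   np.isclose(odot(a * b, u), odot(a, odot(b, u))),
    "unit scalar":          np.isclose(odot(1.0, u), u),
}
print(checks)   # every check holds for these operations
```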


Exercise 2: Subspace Verification

For each subset, determine whether it is a subspace of the given vector space. Prove it is a subspace (using the three-condition test) or find an explicit counterexample showing it is not.

(a) $W = \{(x, y, z) \in \mathbb{R}^3 : 2x - y + 3z = 0\}$ inside $\mathbb{R}^3$

(b) $W = \{(x, y) \in \mathbb{R}^2 : xy = 0\}$ (the two coordinate axes together) inside $\mathbb{R}^2$

(c) $W = \{A \in \mathbb{R}^{2 \times 2} : \text{tr}(A) = 0\}$ (traceless $2 \times 2$ matrices) inside $\mathbb{R}^{2 \times 2}$

(d) $W = \{p \in \mathcal{P}_3 : p(1) = 0\}$ (polynomials of degree $\leq 3$ that vanish at $t = 1$) inside $\mathcal{P}_3$

(e) $W = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 = z^2\}$ (double cone) inside $\mathbb{R}^3$

For each valid subspace: find a basis and state the dimension.
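
For (a), a basis can be read off numerically as the null space of the constraint row, and for (b) the failure of closure is exhibited by a single pair of vectors. A minimal sketch, assuming NumPy and SciPy are available (this checks, but does not replace, the written proof):

```python
import numpy as np
from scipy.linalg import null_space

# (a) W is the null space of the 1x3 constraint matrix [2, -1, 3]
basis_a = null_space(np.array([[2.0, -1.0, 3.0]]))
print(basis_a)            # two orthonormal columns, so dim(W) = 2

# (b) closure under addition fails: both axis vectors lie in W, their sum does not
u, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])
s = u + v
print(s, s[0] * s[1])     # (1, 1) has xy = 1 != 0, so W is not a subspace
```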


Exercise 3: The Four Fundamental Subspaces

Let $A = \begin{pmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & 3 \\ 1 & 2 & 1 & 4 \end{pmatrix}$.

(a) Row-reduce $A$ to RREF. Identify the pivot columns and free columns.

(b) Find a basis for $\text{null}(A)$. Verify the dimension via the Rank-Nullity Theorem.

(c) Find a basis for $\text{col}(A)$ using the pivot columns of the original $A$ (not the RREF). State the dimension.

(d) Find a basis for $\text{row}(A)$ using the non-zero rows of the RREF. State the dimension.

(e) Find a basis for $\text{null}(A^\top)$ by row-reducing $A^\top$ and finding its null space. Alternatively, argue from the dimension formula.

(f) Verify the dimension counts: $\dim(\text{col}(A)) + \dim(\text{null}(A^\top)) = 3$ and $\dim(\text{row}(A)) + \dim(\text{null}(A)) = 4$.

(g) Verify the orthogonality $\text{row}(A) \perp \text{null}(A)$: compute the dot product of every basis vector of $\text{row}(A)$ with every basis vector of $\text{null}(A)$ and confirm it equals zero.
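
To check your hand computation exactly, SymPy's rational arithmetic gives the RREF, pivot columns, and null-space bases directly. A minimal sketch (assuming sympy is installed; variable names are arbitrary):

```python
import sympy as sp

A = sp.Matrix([[1, 2, 0, 1],
               [0, 0, 1, 3],
               [1, 2, 1, 4]])

R, pivots = A.rref()             # RREF and the pivot column indices
null_A  = A.nullspace()          # basis of null(A)
null_AT = A.T.nullspace()        # basis of null(A^T), the left null space

print(R, pivots)
print(A.rank(), len(null_A))     # Rank-Nullity: rank + nullity = 4
# row(A) ⟂ null(A): every product of an RREF row with a null-space vector is zero
print([(R.row(i) * v)[0, 0] for i in range(A.rank()) for v in null_A])
```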


Exercise 4: Span, Independence, and Basis in R4\mathbb{R}^4

Let $\mathbf{v}_1 = (1,2,0,1)^\top$, $\mathbf{v}_2 = (0,1,1,2)^\top$, $\mathbf{v}_3 = (1,3,1,3)^\top$, $\mathbf{v}_4 = (2,1,-1,-1)^\top$.

(a) Form the matrix $A = [\mathbf{v}_1 \mid \mathbf{v}_2 \mid \mathbf{v}_3 \mid \mathbf{v}_4]$ and row-reduce. Are $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ linearly independent?

(b) What is $\dim(\text{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\})$?

(c) Identify a subset $\{\mathbf{v}_{i_1}, \mathbf{v}_{i_2}, \ldots\}$ of the given vectors that forms a basis for $\text{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$.

(d) Express any dependent vectors as explicit linear combinations of your basis vectors.

(e) Is $\mathbf{w} = (3, 5, 1, 4)^\top$ in $\text{span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$? If yes, find coordinates $(\alpha_1, \alpha_2, \ldots)$ such that $\mathbf{w} = \alpha_1 \mathbf{v}_{i_1} + \alpha_2 \mathbf{v}_{i_2} + \cdots$. If no, explain why.
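
A NumPy check for (a), (b), and (e): the rank of the column matrix gives the dimension of the span, and a consistent least-squares solve certifies that $\mathbf{w}$ lies in it. A sketch to compare against your hand computation:

```python
import numpy as np

v1, v2, v3, v4 = (np.array(v, dtype=float) for v in
                  [(1, 2, 0, 1), (0, 1, 1, 2), (1, 3, 1, 3), (2, 1, -1, -1)])
A = np.column_stack([v1, v2, v3, v4])

print(np.linalg.matrix_rank(A))            # dim of the span; < 4 means the set is dependent

w = np.array([3, 5, 1, 4], dtype=float)
alpha, *_ = np.linalg.lstsq(A, w, rcond=None)   # least-squares coefficients
print(alpha, np.allclose(A @ alpha, w))         # exact reconstruction <=> w is in the span
```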


Exercise 5: Gram-Schmidt and Orthogonal Projection

Work in $\mathbb{R}^3$ with the standard inner product.

(a) Starting from $\mathbf{v}_1 = (1,1,0)^\top$, $\mathbf{v}_2 = (1,0,1)^\top$, $\mathbf{v}_3 = (0,1,1)^\top$, verify that these three vectors are linearly independent (compute the determinant of the matrix they form).

(b) Apply Gram-Schmidt to produce an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3\}$ for $\mathbb{R}^3$.

(c) Verify your result: check that $\|\mathbf{q}_i\| = 1$ for each $i$ and $\langle \mathbf{q}_i, \mathbf{q}_j \rangle = 0$ for $i \neq j$.

(d) Express $\mathbf{w} = (1, 2, 3)^\top$ in your orthonormal basis: find $\alpha_i = \langle \mathbf{w}, \mathbf{q}_i \rangle$ for each $i$. Verify that $\mathbf{w} = \sum_i \alpha_i \mathbf{q}_i$.

(e) Construct the orthogonal projection matrix $P$ onto $W = \text{span}\{\mathbf{v}_1, \mathbf{v}_2\} = \text{span}\{\mathbf{q}_1, \mathbf{q}_2\}$ using $P = Q_2 Q_2^\top$ where $Q_2 = [\mathbf{q}_1 \mid \mathbf{q}_2]$. Verify: (i) $P^2 = P$, (ii) $P^\top = P$, (iii) $\text{rank}(P) = 2$. Compute $P\mathbf{w}$ and the residual $\mathbf{r} = \mathbf{w} - P\mathbf{w}$. Verify that $\mathbf{r} \perp \mathbf{v}_1$ and $\mathbf{r} \perp \mathbf{v}_2$.
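
A compact NumPy check for (a)-(e): a classical Gram-Schmidt loop written out explicitly (rather than calling np.linalg.qr) so the intermediate projections are visible, followed by the projector tests. A sketch, not a reference implementation:

```python
import numpy as np

def gram_schmidt(V):
    """Columns of V -> orthonormal columns spanning the same subspace."""
    Q = np.zeros_like(V, dtype=float)
    for j in range(V.shape[1]):
        q = V[:, j].astype(float)
        for i in range(j):
            q = q - (Q[:, i] @ V[:, j]) * Q[:, i]   # remove the component along q_i
        Q[:, j] = q / np.linalg.norm(q)
    return Q

V = np.column_stack([(1, 1, 0), (1, 0, 1), (0, 1, 1)]).astype(float)
print(np.linalg.det(V))                      # non-zero => linearly independent

Q = gram_schmidt(V)
print(np.round(Q.T @ Q, 10))                 # identity matrix => orthonormal basis

w = np.array([1.0, 2.0, 3.0])
print(np.allclose(Q @ (Q.T @ w), w))         # w reconstructed from its coordinates

Q2 = Q[:, :2]
P  = Q2 @ Q2.T                               # orthogonal projector onto span{v1, v2}
r  = w - P @ w
print(np.allclose(P @ P, P), np.allclose(P.T, P), np.linalg.matrix_rank(P))
print(np.isclose(r @ V[:, 0], 0), np.isclose(r @ V[:, 1], 0))   # residual ⟂ v1, v2
```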


Exercise 6: Subspace Operations and Dimension Formula

(a) In $\mathbb{R}^3$, let $W_1 = \text{span}\{(1,0,0)^\top, (0,1,0)^\top\}$ (the $xy$-plane) and $W_2 = \text{span}\{(1,1,0)^\top, (0,0,1)^\top\}$. Find a basis for $W_1 + W_2$. Is $W_1 + W_2 = \mathbb{R}^3$? Find $W_1 \cap W_2$ and verify the Grassmann formula: $\dim(W_1 + W_2) = \dim(W_1) + \dim(W_2) - \dim(W_1 \cap W_2)$.

(b) Is $\mathbb{R}^3 = W_1 \oplus W_2$ (direct sum)? If not, find a subspace $W_3 \subseteq \mathbb{R}^3$ such that $\mathbb{R}^3 = W_1 \oplus W_3$.

(c) In $\mathbb{R}^4$, let $W_1 = \{\mathbf{x} : x_1 + x_2 = 0\}$ and $W_2 = \{\mathbf{x} : x_3 + x_4 = 0\}$. Find $\dim(W_1)$, $\dim(W_2)$, $\dim(W_1 \cap W_2)$, and $\dim(W_1 + W_2)$. Verify the Grassmann formula.

(d) For $W_1 \cap W_2$ from part (c): find an explicit basis.
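
Dimensions of sums and intersections can be checked numerically: $\dim(W_1 + W_2)$ is the rank of the two bases stacked side by side, and the Grassmann formula then yields $\dim(W_1 \cap W_2)$. The sketch below covers part (a) this way and finds the intersection basis for (c)-(d) as a null space (assuming NumPy and SciPy are available):

```python
import numpy as np
from scipy.linalg import null_space

# Part (a): basis vectors as columns
W1 = np.column_stack([(1, 0, 0), (0, 1, 0)]).astype(float)
W2 = np.column_stack([(1, 1, 0), (0, 0, 1)]).astype(float)

dim_sum = np.linalg.matrix_rank(np.hstack([W1, W2]))
dim_int = W1.shape[1] + W2.shape[1] - dim_sum        # Grassmann formula
print(dim_sum, dim_int)                              # dim(W1 + W2) and dim(W1 ∩ W2)

# Parts (c)-(d): W1 ∩ W2 in R^4 is the null space of the stacked constraints
C = np.array([[1.0, 1.0, 0.0, 0.0],    # x1 + x2 = 0
              [0.0, 0.0, 1.0, 1.0]])   # x3 + x4 = 0
print(null_space(C))                   # columns form a basis of the intersection
```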


Exercise 7: Change of Basis and Coordinates

In $\mathbb{R}^2$, let $\mathcal{B} = \{\mathbf{b}_1, \mathbf{b}_2\}$ with $\mathbf{b}_1 = (1,2)^\top$ and $\mathbf{b}_2 = (1,-1)^\top$, and let $\mathcal{C} = \{\mathbf{c}_1, \mathbf{c}_2\}$ with $\mathbf{c}_1 = (2,1)^\top$ and $\mathbf{c}_2 = (-1,1)^\top$.

(a) Verify that both $\mathcal{B}$ and $\mathcal{C}$ are bases for $\mathbb{R}^2$ (compute the determinants of the corresponding matrices).

(b) Find the change-of-basis matrix $P_{\mathcal{B} \to \mathcal{C}}$: express each $\mathbf{b}_i$ as a linear combination of $\mathbf{c}_1, \mathbf{c}_2$, and use these as the columns of $P$.

(c) For the vector $\mathbf{v}$ with $[\mathbf{v}]_{\mathcal{B}} = (3, -1)^\top$ (coordinates $3\mathbf{b}_1 - \mathbf{b}_2$ in basis $\mathcal{B}$), compute $[\mathbf{v}]_{\mathcal{C}} = P_{\mathcal{B} \to \mathcal{C}} [\mathbf{v}]_{\mathcal{B}}$.

(d) Compute $\mathbf{v}$ explicitly in the standard basis. Then verify your answer to (c) by directly expressing $\mathbf{v}$ as a linear combination of $\mathbf{c}_1$ and $\mathbf{c}_2$.

(e) If a linear map $T: \mathbb{R}^2 \to \mathbb{R}^2$ has matrix $M = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$ in the standard basis, find its matrix representation in basis $\mathcal{B}$.
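
To check (b)-(e) numerically: with the basis vectors as the columns of matrices $B$ and $C$, the change-of-basis matrix is $C^{-1}B$, and the matrix of $T$ in basis $\mathcal{B}$ is $B^{-1}MB$. A small NumPy sketch:

```python
import numpy as np

B = np.array([[1, 1], [2, -1]], dtype=float)   # columns b1, b2
C = np.array([[2, -1], [1, 1]], dtype=float)   # columns c1, c2

P  = np.linalg.solve(C, B)          # change-of-basis matrix: B-coordinates -> C-coordinates
vB = np.array([3, -1], dtype=float)
vC = P @ vB
v  = B @ vB                         # the vector in standard coordinates
print(vC, np.allclose(C @ vC, v))   # consistency check against (d)

M   = np.array([[2, 1], [0, 3]], dtype=float)
M_B = np.linalg.solve(B, M @ B)     # representation of T in basis B: B^{-1} M B
print(M_B)
```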


Exercise 8: AI Application - Subspace Analysis of a Weight Matrix

Let $W = \begin{pmatrix} 3 & 1 & 2 \\ 1 & 2 & 1 \\ 2 & 1 & 3 \end{pmatrix}$.

(a) Compute $\det(W)$ and $\text{rank}(W)$. Is $W$ full rank?

(b) Find a basis for $\text{null}(W)$ and $\text{col}(W)$. Verify $\dim(\text{null}(W)) + \dim(\text{col}(W)) = 3$.

(c) A LoRA adapter has rank $r = 1$ with update $\Delta W = \mathbf{b}\mathbf{a}^\top$ where $\mathbf{b} = (1, 0, 1)^\top$ and $\mathbf{a} = (1, 1, 0)^\top$. Explicitly compute $\Delta W$. What is $\text{col}(\Delta W)$? What is $\text{rank}(\Delta W)$?

(d) Consider the updated weight $W' = W + \Delta W$. Without computing $W'$ explicitly, state the upper and lower bounds on $\text{rank}(W')$. Under what condition on the relationship between $\text{col}(\Delta W)$ and $\text{null}(W^\top)$ (i.e., the left null space of $W$) would $\text{rank}(W') = \text{rank}(W) + 1$? Under what condition would $\text{rank}(W') = \text{rank}(W)$?

(e) Now compute $W'$ explicitly. Verify your predictions from (d) by computing $\text{rank}(W')$.
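
To verify the ranks and the rank-1 update numerically, a short NumPy sketch (it mirrors the matrices above; nothing here is specific to any LoRA library):

```python
import numpy as np

W = np.array([[3, 1, 2],
              [1, 2, 1],
              [2, 1, 3]], dtype=float)
print(np.linalg.det(W), np.linalg.matrix_rank(W))

b = np.array([1.0, 0.0, 1.0])
a = np.array([1.0, 1.0, 0.0])
dW = np.outer(b, a)                      # rank-1 LoRA-style update b a^T
print(dW, np.linalg.matrix_rank(dW))

W_new = W + dW
print(np.linalg.matrix_rank(W_new))      # compare with your bounds from (d)
```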


Exercise 9 (Challenge): The Superposition Geometry

This exercise develops the geometry of the superposition hypothesis.

(a) In $\mathbb{R}^2$ (a 2-dimensional embedding space), suppose we want to store $F = 3$ features as unit vectors with maximum pairwise orthogonality. Show that it is impossible for all three feature vectors to be pairwise orthogonal. (Hint: pairwise orthogonal non-zero vectors are linearly independent, and $\mathbb{R}^2$ contains at most two linearly independent vectors.)

(b) Instead, place three unit vectors at angles $0^\circ$, $120^\circ$, and $240^\circ$ from the positive $x$-axis: $\mathbf{f}_1 = (1,0)^\top$, $\mathbf{f}_2 = (-1/2, \sqrt{3}/2)^\top$, $\mathbf{f}_3 = (-1/2, -\sqrt{3}/2)^\top$. Compute all pairwise inner products $\langle \mathbf{f}_i, \mathbf{f}_j \rangle$ for $i \neq j$. What is the interference level?

(c) The "reconstruction loss" for feature $i$ when all features have activation $x_i \in \{0,1\}$ is the residual after reading off the $i$-th feature and subtracting the contribution of the feature direction. Specifically, if the residual stream stores $\mathbf{h} = \sum_j x_j \mathbf{f}_j$, and we estimate $\hat{x}_i = \langle \mathbf{h}, \mathbf{f}_i \rangle$, show that $\hat{x}_i = x_i + \sum_{j \neq i} x_j \langle \mathbf{f}_j, \mathbf{f}_i \rangle$. The error $\hat{x}_i - x_i$ is the interference from other features.

(d) Compute the expected interference for the configuration in (b) assuming each $x_j \in \{0, 1\}$ independently with probability $p$ of being active. What does the interference approach as $p \to 0$?
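
A Monte Carlo sketch of part (d): sample sparse 0/1 activations with probability $p$, store them in the 2-dimensional space using the three feature directions from (b), read them back with inner products, and measure the average interference. The sample size and seed are arbitrary choices, and this only estimates what the hand calculation gives exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
angles = np.deg2rad([0, 120, 240])
F = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # rows are f1, f2, f3 in R^2

p = 0.1
n_trials = 100_000
x = (rng.random((n_trials, 3)) < p).astype(float)   # sparse 0/1 activations
h = x @ F                                           # stored residual-stream vectors, shape (n, 2)
x_hat = h @ F.T                                     # read-off estimates <h, f_i>
print(np.abs(x_hat - x).mean())                     # average interference; each <f_i, f_j> = -1/2
```

Lowering p toward 0 in this sketch shows the interference vanishing, matching the limit asked for in (d).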


Exercise 10 (Challenge): Krylov Subspaces

Let $A = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$ and $\mathbf{b} = (1, 1)^\top$.

(a) Compute $\mathcal{K}_1(A, \mathbf{b}) = \text{span}\{\mathbf{b}\}$ and $\mathcal{K}_2(A, \mathbf{b}) = \text{span}\{\mathbf{b}, A\mathbf{b}\}$.

(b) Does $\mathcal{K}_2(A, \mathbf{b}) = \mathbb{R}^2$? (Check whether $\mathbf{b}$ and $A\mathbf{b}$ are linearly independent.)

(c) If you found in (b) that $\mathbf{b}$ and $A\mathbf{b}$ are dependent (so that $\mathcal{K}_2(A, \mathbf{b})$ is only 1-dimensional), restart parts (c) and (d) with the starting vector $\mathbf{b} = (0, 1)^\top$. Then apply Gram-Schmidt to $\{\mathbf{b}, A\mathbf{b}\}$ - one step of the Arnoldi/Lanczos construction - to produce an orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2\}$ for $\mathcal{K}_2(A, \mathbf{b})$.

(d) In this orthonormal basis $\{\mathbf{q}_1, \mathbf{q}_2\}$, compute $T = Q^\top A Q$ where $Q = [\mathbf{q}_1 \mid \mathbf{q}_2]$. Because this $A$ is not symmetric, the construction is strictly Arnoldi rather than Lanczos, so $T$ is upper Hessenberg rather than symmetric tridiagonal. Since $\mathcal{K}_2(A, \mathbf{b}) = \mathbb{R}^2$, the matrix $Q$ is a full orthonormal basis and $T$ is similar to $A$ - verify this by comparing the eigenvalues of $T$ with the actual eigenvalues of $A$.
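
A NumPy sketch for (c)-(d) that also exhibits the breakdown case: if $A\mathbf{b}$ is parallel to $\mathbf{b}$, the orthogonalisation step returns the zero vector and the Krylov space stops at dimension 1; with a non-eigenvector start, $T = Q^\top A Q$ is similar to $A$ and has the same eigenvalues.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

for b in (np.array([1.0, 1.0]), np.array([0.0, 1.0])):
    q1 = b / np.linalg.norm(b)
    w  = A @ b - (q1 @ (A @ b)) * q1       # component of Ab orthogonal to q1
    if np.linalg.norm(w) < 1e-12:
        print(b, "is an eigenvector: K_2 is 1-dimensional, Gram-Schmidt breaks down")
        continue
    q2 = w / np.linalg.norm(w)
    Q  = np.column_stack([q1, q2])
    T  = Q.T @ A @ Q
    print(T, np.linalg.eigvals(T), np.linalg.eigvals(A))   # same spectrum: T is similar to A
```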


16. Why This Matters for AI (2026 Perspective)

Residual stream as shared vector space: The residual stream $\mathbf{x} \in \mathbb{R}^d$ in a transformer is the central shared vector space. Every attention head, MLP layer, and positional encoding adds vectors to this space via the residual connection. All components communicate through this one $d$-dimensional space - nothing else is shared. Understanding the subspace decomposition of $\mathbb{R}^d$ (which subspaces are written by which components, which are read by which) is literally the same as understanding how information flows through a transformer. There is no higher-level description.

LoRA rank = subspace dimension: LoRA's entire design is a subspace constraint. The update $\Delta W = BA^\top$ is a rank-$r$ matrix, living in an $r(m+n)/(mn)$-fraction of the full parameter space. Choosing $r$ is choosing the dimension of the subspace to search in. Too small: the subspace doesn't contain the optimal update direction, and performance suffers. Too large: you are searching a subspace larger than necessary, wasting parameters and potentially overfitting. The right $r$ is the intrinsic dimension of the fine-tuning task - a subspace dimension.

Superposition and polysemanticity: The superposition hypothesis says LLMs represent more features than their embedding dimension allows. Since more than $d$ linearly independent directions cannot exist in $\mathbb{R}^d$, features beyond $d$ must share dimensions through non-orthogonal superposition. Each neuron becomes polysemantic - it responds to multiple features. This limits interpretability (you cannot read off features from individual dimensions) and causes interference between features. Solving superposition is one of the central unsolved problems in mechanistic interpretability, and the solution must be phrased in the language of subspace geometry.

MLA and KV subspace compression: DeepSeek-V3's Multi-head Latent Attention compresses KV projections to a rank-$r$ subspace with $r \ll d$. The KV cache stores only the $r$-dimensional compressed representation; at inference, it is decompressed back to $\mathbb{R}^d$. The rank $r$ is the architectural bottleneck that enables a $5.75\times$ reduction in KV cache memory. The subspace dimension $r$ is the design variable that trades memory against expressiveness. This is subspace thinking at the architecture level: the architecture is designed around a low-dimensional subspace constraint.

Mechanistic interpretability circuits: Every circuit found in mechanistic interpretability is a composition of subspace operations. A "previous token head" reads from a specific subspace of the residual stream (via its $W_Q$, $W_K$ row spaces) and writes to a specific subspace (via its $W_O$ column space). An "induction head" communicates with the previous token head through a shared subspace. Superposition of features happens in specific subspaces. The "language" of mechanistic interpretability is entirely the language of subspaces: read, write, rotate, project, in $\mathbb{R}^d$.

Representation collapse prevention: In self-supervised learning, representation collapse means all embeddings converge to a low-dimensional (or 0-dimensional) subspace - the network learns to output the same or similar vectors for all inputs. VICReg and Barlow Twins prevent collapse with explicit variance and decorrelation losses that penalise low-dimensional representations; BYOL relies on an asymmetric predictor and stop-gradient. VICReg's variance loss keeps the per-dimension variance of the batch embeddings above a threshold, which prevents the embeddings from collapsing to a subspace. Collapse = subspace dimension $\to 0$; the loss explicitly works against any shrinkage of the embedding subspace.

Implicit bias and minimum-norm solutions: Gradient descent on overparameterised linear models (and, empirically, on large neural networks) converges to minimum-norm solutions. The minimum-norm solution lies in the row space of the data matrix - it is the unique solution contained in $\text{row}(X)$, the orthogonal complement of the null space. The implicit bias of gradient descent is a subspace selection: it selects the solution in $\text{row}(X)$ rather than any other coset representative, and this selection is part of what enables generalisation in overparameterised models. (A numerical check of the row-space claim appears just after this table.)

Orthogonality and head diversity: If attention heads write to mutually orthogonal subspaces of the residual stream, they do not interfere with each other - each head has an independent information channel. Head redundancy is equivalent to subspace overlap: if head $A$'s column space is a subspace of head $B$'s column space, head $A$ is redundant. Pruning heads that write to subspaces already covered by remaining heads preserves expressiveness while reducing computation. Orthogonality between head subspaces is the precise mathematical criterion for head independence.

Function space universality: The universal approximation theorem says that neural networks with nonlinear activations can approximate any continuous function to arbitrary accuracy. In subspace language: the set of functions representable by a sufficiently large network is dense in $C([0,1]^n)$ - it is not a subspace (the family of networks of a fixed size is not closed under addition and scaling), but its closure is the entire function space. The depth and nonlinearity of the architecture determine which function-space "directions" are accessible. Width and depth determine the "dimensionality" of the accessible function subspace.
Second-order optimisation: The Hessian $H = \nabla^2 \mathcal{L}(\theta)$ of the training loss is a $p \times p$ matrix with most of its spectral mass concentrated in a low-dimensional subspace of some dimension $k \ll p$ (the high-curvature subspace spanned by the top eigenvalue directions). The remaining $p - k$ directions have near-zero curvature (flat directions). K-FAC, Shampoo, and SOAP all identify and exploit this curvature subspace by approximating the inverse Hessian in the curved subspace and using the identity in the flat directions. Natural gradient preconditions updates by the inverse Fisher (a positive semidefinite matrix): it aligns updates with the geometry of the curved subspace and suppresses movement in the flat directions.
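
The implicit-bias row above makes a concrete claim that is easy to check numerically: for an underdetermined least-squares problem, the minimum-norm solution lies in $\text{row}(X)$ and is orthogonal to $\text{null}(X)$. A sketch with random data (this illustrates the linear case only, not full neural-network training):

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))        # 5 samples, 20 parameters: heavily overparameterised
y = rng.normal(size=5)

w_min = np.linalg.pinv(X) @ y       # minimum-norm solution (what GD from zero converges to)

alpha, *_ = np.linalg.lstsq(X.T, w_min, rcond=None)
print(np.allclose(X.T @ alpha, w_min))           # w_min can be written as X^T alpha: it lies in row(X)
print(np.allclose(null_space(X).T @ w_min, 0))   # and it is orthogonal to null(X)
```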

Conceptual Bridge

Vector spaces and subspaces are the geometry underlying all of linear algebra. Every concept - linear independence, span, basis, dimension, rank, orthogonality, projections, eigenvalues - is ultimately a statement about the structure of subspaces. The eight axioms that define a vector space are simple; the richness emerges from the subspace hierarchy they generate.

The abstract framework is what makes the theory universal. The same theorems that govern arrows in a plane govern polynomials, matrices, functions, and probability distributions. Verifying the eight axioms once grants access to a century of results - for free, without re-proving anything. This universality is not just mathematical elegance; it is practical power. When you recognise that the gradient vectors generated during training are vectors in $\mathbb{R}^p$, and that $\mathbb{R}^p$ is a vector space, you immediately know that all of linear algebra applies: span, independence, subspaces, projections, and dimension all become tools for understanding training dynamics.

For AI in 2026, subspaces are not abstract. They are the architectural primitives of transformers (the residual stream, the attention subspace, the MLP subspace), the design variables of efficient fine-tuning (LoRA rank = subspace dimension), the diagnostic language of interpretability (which subspace does this head write to?), and the theoretical foundation of generalisation (gradient updates live in a low-dimensional subspace; the implicit bias selects the minimum-norm solution in the row space).

WHERE THIS MODULE SITS IN THE CURRICULUM


  Sets and Logic -> Functions -> Summation
          |
          v
  Proof Techniques
          |
          v
  Vectors and Spaces (geometry: what vectors are)
          |
          v
  Matrix Operations (computation: how to multiply matrices)
          |
          v
  Systems of Equations (solving: Gaussian elimination)
          |
          v
  Determinants (volume: the signed volume of a parallelepiped)
          |
          v
  Matrix Rank (dimensionality: the rank-nullity connection)
          |
          v
  Vector Spaces and Subspaces (structure) <- THIS MODULE

  The payoff: everything from earlier modules
  is now understood geometrically as a statement
  about subspaces. Rank = dimension of col space.
  Null space = kernel. Solutions of Ax = b live in
  a coset of null(A). Determinant = 0 iff the
  column space is a proper subspace of R^n.
          |
          v
  Eigenvalues and Decompositions (spectrum of a linear map)
  - invariant subspaces of T: revealed through its eigenvalues
  - eigenvector: 1D invariant subspace
  - spectral theorem: orthogonal direct sum of eigenspaces
  - SVD: complete subspace decomposition of any matrix
          |
          v
  Probability and Information Theory
  - probability distributions as vectors in function space
  - KL divergence as a geometry on the simplex
          |
          v
  Calculus and Optimisation
  - gradient lives in R^p, a vector space
  - Hessian's eigenspaces = directions of curvature
  - natural gradient uses curved subspace geometry
          |
          v
          LLM Mathematics

What comes next: Eigenvalues, Eigenvectors, and Matrix Decompositions.

The next module examines the internal subspace structure of a linear map - the invariant subspaces that a matrix preserves, revealed through its eigenvalues and eigenvectors. An eigenvector spans a 1-dimensional invariant subspace; the spectral theorem decomposes any symmetric matrix into orthogonal 1-dimensional invariant subspaces; the SVD extends this to the complete subspace structure of any rectangular matrix. The four fundamental subspaces are decomposed even further - into singular subspaces aligned with singular values. All of this is the continuation of the subspace story begun in this module.


