
Linear Transformations: Appendix N (Notation Summary for This Section) to Appendix P (Quick Reference: Common Linear Maps in $\mathbb{R}^2$ and $\mathbb{R}^3$)

Appendix N: Notation Summary for This Section

| Symbol | Meaning |
| --- | --- |
| $T: V \to W$ | Linear transformation from $V$ to $W$ |
| $\ker(T)$ | Kernel (null space) of $T$ |
| $\operatorname{im}(T)$ | Image (range) of $T$ |
| $\operatorname{rank}(T)$ | Dimension of $\operatorname{im}(T)$ |
| $\operatorname{nullity}(T)$ | Dimension of $\ker(T)$ |
| $[T]_{\mathcal{B}}^{\mathcal{C}}$ | Matrix of $T$ from basis $\mathcal{B}$ to basis $\mathcal{C}$ |
| $P$ | Change-of-basis matrix |
| $T^\top$ | Dual (transpose) map |
| $V^*$ | Dual space of $V$ |
| $Df_{\mathbf{x}}$ | Total derivative (Fréchet derivative) of $f$ at $\mathbf{x}$ |
| $J_f(\mathbf{x})$ | Jacobian matrix of $f$ at $\mathbf{x}$ |
| $\mathcal{L}(V, W)$ | Space of all linear maps from $V$ to $W$ |
| $V \cong W$ | $V$ and $W$ are isomorphic |
| $V/K$ | Quotient space of $V$ modulo subspace $K$ |
| $A \sim B$ | $A$ and $B$ are similar matrices ($B = P^{-1}AP$) |
| $P^2 = P$ | Projection (idempotent) |
| $A^\top A = I$ | Orthogonal matrix |
| $f(\mathbf{x}) = A\mathbf{x} + \mathbf{b}$ | Affine map |
| $\Delta W = BA$ | LoRA low-rank update, $\operatorname{rank}(\Delta W) \leq r$ |
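A quick sanity check on the last row: if $B$ is $d \times r$ and $A$ is $r \times d$, then $BA$ factors through an $r$-dimensional space, so its rank is at most $r$. A minimal sketch (NumPy; the dimensions $d = 64$, $r = 4$ are arbitrary illustrative choices):

```python
# rank(BA) <= r: the product factors through an r-dimensional space.
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4
B = rng.normal(size=(d, r))    # d x r
A = rng.normal(size=(r, d))    # r x d
delta_W = B @ A                # d x d, but rank at most r
print(np.linalg.matrix_rank(delta_W))  # -> 4 (= r, almost surely)
```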

Appendix O: Linear Maps and Symmetry

O.1 Equivariant Maps

A linear map $T: V \to W$ is equivariant with respect to a group $G$ if, for every $g \in G$ and every $\mathbf{v} \in V$:

$$T(\rho_V(g)\mathbf{v}) = \rho_W(g)\,T(\mathbf{v})$$

where $\rho_V: G \to GL(V)$ and $\rho_W: G \to GL(W)$ are representations of $G$ on $V$ and $W$.

Intuitively: "applying the group action then the map = applying the map then the group action." The map commutes with the symmetry.

Examples:

  • Translation equivariance: $T(v + c) = T(v) + T(c)$ - but this is just additivity, so every linear map is equivariant to the translation group on vector spaces in this sense.
  • Rotation equivariance: $T(R\mathbf{v}) = R\,T(\mathbf{v})$ for all rotations $R$. In 3D this forces $T = \lambda I$ for some scalar $\lambda$ (Schur's lemma applied to the rotation representation).
  • Permutation equivariance: $T(P\mathbf{v}) = P\,T(\mathbf{v})$ for all permutation matrices $P$. This forces $T$ to be a sum of a "same-position" term and a "mean-field" term ($T = aI + b\,\mathbf{1}\mathbf{1}^\top$) - this is why mean pooling and attention with tied weights are permutation equivariant.

For AI: CNNs achieve translation equivariance by using convolutional (shared-weight) linear maps. Equivariant graph neural networks use permutation-equivariant maps. Geometric deep learning is the systematic study of building neural networks as compositions of equivariant linear maps. Transformer attention (without positional encoding) is permutation equivariant - adding positional encodings explicitly breaks this symmetry.
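To make the permutation case concrete, here is a minimal numerical sketch (NumPy; the map $T(\mathbf{v}) = a\mathbf{v} + b\,\bar{v}\,\mathbf{1}$ and the constants $a, b$ are illustrative choices, not from the lesson) confirming $T(P\mathbf{v}) = P\,T(\mathbf{v})$ for random permutation matrices:

```python
# Checking permutation equivariance of T(v) = a*v + b*mean(v)*1,
# an instance of the "same-position + mean-field" form above.
import numpy as np

rng = np.random.default_rng(0)
n, a, b = 5, 2.0, -0.7   # arbitrary illustrative constants

def T(v):
    return a * v + b * v.mean() * np.ones_like(v)

v = rng.normal(size=n)
for _ in range(3):
    P = np.eye(n)[rng.permutation(n)]       # random permutation matrix
    assert np.allclose(T(P @ v), P @ T(v))  # T(Pv) = P T(v) exactly
print("permutation equivariance holds for all tested P")
```

The check passes for any $a, b$: permuting the entries leaves the mean unchanged, so the mean-field term commutes with $P$ automatically.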

O.2 Schur's Lemma and Irreducible Representations

Schur's Lemma. Let $T: V \to V$ be a linear map that commutes with all maps in an irreducible representation of a group $G$ (i.e., $T\rho(g) = \rho(g)T$ for all $g \in G$). Then $T = \lambda I$ for some scalar $\lambda$.

This powerful result says: the only linear maps that commute with all symmetries of an irreducible representation are scalar multiples of the identity. This constrains the form of equivariant maps.

Application: If attention weights must be equivariant to the representation of a certain symmetry group acting on the heads, Schur's lemma constrains the possible attention patterns.
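One way to see the lemma numerically is by "twirling": the group average $\overline{T} = \mathbb{E}_R[R\,T\,R^\top]$ commutes with every rotation, so by Schur's lemma it must equal $\lambda I$ with $\lambda = \operatorname{tr}(T)/3$. A Monte Carlo sketch (NumPy; the QR-based random-rotation sampler is a standard construction, not part of the lesson):

```python
# Monte Carlo "twirl": averaging R T R^T over random 3D rotations projects
# T onto the maps that commute with every rotation; by Schur's lemma this
# projection is (tr(T)/3) * I.
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(rng):
    # QR of a Gaussian matrix gives a Haar-random orthogonal matrix once
    # the signs are fixed; flipping one column forces det = +1 (SO(3)).
    A = rng.normal(size=(3, 3))
    Q, R = np.linalg.qr(A)
    Q = Q * np.sign(np.diag(R))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

T = rng.normal(size=(3, 3))
avg = np.zeros((3, 3))
N = 10000
for _ in range(N):
    R = random_rotation(rng)
    avg += R @ T @ R.T
avg /= N

print(np.round(avg, 2))           # approx. (tr(T)/3) * I
print(round(np.trace(T) / 3, 2))  # the scalar Schur's lemma predicts
```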

O.3 Representation Theory Preview

Representation theory studies how groups act on vector spaces via linear maps. Every group representation $\rho: G \to GL(V)$ is a group homomorphism - a map that takes group elements to invertible linear maps, preserving the group structure:

$$\rho(gh) = \rho(g)\rho(h) \quad \text{(composition respects group multiplication)}$$

This is the language in which equivariant neural networks (E(3)-equivariant networks for molecular property prediction, SE(3)-equivariant networks for robotics, permutation-equivariant networks for sets) are designed. The "weights" of an equivariant linear layer are constrained to be equivariant - and representation theory tells you exactly what form these weights can take.
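As a concrete toy case, the following sketch (Python; the helper names `rho` and `compose` are illustrative, not from the lesson) verifies the homomorphism property for $S_3$ acting on $\mathbb{R}^3$ by permutation matrices:

```python
# S_3 acting on R^3 by permutation matrices: rho(p) sends e_i to e_{p(i)}.
# We verify rho(p o q) = rho(p) rho(q) for all 36 pairs of permutations.
import itertools
import numpy as np

def rho(p):
    M = np.zeros((len(p), len(p)))
    for i, pi in enumerate(p):
        M[pi, i] = 1.0    # column i has a single 1 in row p(i)
    return M

def compose(p, q):
    # (p o q)(i) = p(q(i)): apply q first, then p
    return tuple(p[q[i]] for i in range(len(q)))

perms = list(itertools.permutations(range(3)))
for p, q in itertools.product(perms, perms):
    assert np.allclose(rho(compose(p, q)), rho(p) @ rho(q))
print(f"rho(gh) = rho(g)rho(h) verified on all {len(perms)**2} pairs")
```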


Appendix P: Quick Reference - Common Linear Maps in $\mathbb{R}^2$ and $\mathbb{R}^3$

Common $2 \times 2$ Linear Maps

| Transformation | Matrix | Properties |
| --- | --- | --- |
| Rotation by $\theta$ | $\begin{pmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{pmatrix}$ | Orthogonal, $\det=1$, $\lambda = e^{\pm i\theta}$ |
| Reflection across $x$-axis | $\begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}$ | Symmetric, $\det=-1$, $\lambda = \pm 1$ |
| Reflection across $y=x$ | $\begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}$ | Symmetric, $\det=-1$, $\lambda = \pm 1$ |
| Horizontal shear by $k$ | $\begin{pmatrix}1 & k \\ 0 & 1\end{pmatrix}$ | $\det=1$, $\lambda = 1$ (double) |
| Scaling by $(a, b)$ | $\begin{pmatrix}a & 0 \\ 0 & b\end{pmatrix}$ | Symmetric, $\det=ab$, $\lambda = a, b$ |
| Projection onto $x$-axis | $\begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix}$ | Symmetric, idempotent, $\det=0$, $\lambda = 0, 1$ |
| Projection onto $y=x$ | $\frac{1}{2}\begin{pmatrix}1 & 1 \\ 1 & 1\end{pmatrix}$ | Symmetric, idempotent, $\lambda = 0, 1$ |
| Zero map | $\begin{pmatrix}0 & 0 \\ 0 & 0\end{pmatrix}$ | Rank 0, $\det=0$, $\lambda = 0$ |
| Identity | $\begin{pmatrix}1 & 0 \\ 0 & 1\end{pmatrix}$ | All eigenvalues 1, $\det=1$ |

All these are linear maps. To make them affine (include translation), append a row and column in homogeneous form.
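A brief sketch of that homogeneous-form trick (NumPy; the `homogeneous` helper and the 90-degree rotation example are illustrative choices):

```python
# Lifting a 2x2 linear map A plus a translation t to one 3x3 matrix:
# points are written as (x, y, 1), and H acts by x -> A x + t.
import numpy as np

def homogeneous(A, t):
    H = np.eye(3)
    H[:2, :2] = A     # linear part in the top-left block
    H[:2, 2] = t      # translation in the last column
    return H

theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
H = homogeneous(R, t=np.array([3.0, 0.0]))  # rotate 90 degrees, then shift +3 in x

p = np.array([1.0, 0.0, 1.0])  # the point (1, 0) in homogeneous coordinates
print(H @ p)                   # -> [3. 1. 1.], i.e. (1,0) -> (0,1) -> (3,1)
```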

Common $3 \times 3$ Linear Maps

| Transformation | Description | Key Properties |
| --- | --- | --- |
| Rotation around $z$-axis | $R_z(\theta)$: rotates $xy$-plane, fixes $z$ | Orthogonal, $\det=1$ |
| Reflection across $xy$-plane | $\operatorname{diag}(1,1,-1)$ | Symmetric, $\det=-1$ |
| Projection onto $xy$-plane | $\operatorname{diag}(1,1,0)$ | Symmetric, idempotent, rank 2 |
| Householder reflection | $I - 2\mathbf{n}\mathbf{n}^\top$ ($\mathbf{n}$ a unit vector) | Symmetric, $\det=-1$, $\lambda = 1$ (mult. 2) and $-1$ |
| Scaling | $\operatorname{diag}(a,b,c)$ | Diagonal; eigenvalues are $a,b,c$ |
| Shear | $I + s\,\mathbf{e}_i\mathbf{e}_j^\top$ ($i \neq j$) | $\det=1$; all eigenvalues 1 |
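A quick numerical check of the Householder row above (NumPy; the random unit normal is an arbitrary choice):

```python
# Checking the Householder row: H = I - 2 n n^T with ||n|| = 1 is
# symmetric and orthogonal, with det = -1 and eigenvalues {1, 1, -1}.
import numpy as np

rng = np.random.default_rng(0)
n = rng.normal(size=3)
n /= np.linalg.norm(n)          # the formula assumes a unit normal

H = np.eye(3) - 2.0 * np.outer(n, n)

assert np.allclose(H, H.T)              # symmetric
assert np.allclose(H @ H.T, np.eye(3))  # orthogonal
print(round(np.linalg.det(H), 6))       # -> -1.0
print(np.sort(np.linalg.eigvalsh(H)))   # -> [-1.  1.  1.]
```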

End of Linear Transformations section. Continue to 05: Orthogonality and Orthonormality.
