Linear Transformations: Appendix N: Notation Summary for This Section to Appendix P: Quick Reference - Common Linear Maps in $\mathbb{R}^2$ and $\mathbb{R}^3$
Appendix N: Notation Summary for This Section
| Symbol | Meaning |
|---|---|
| $T: V \to W$ | Linear transformation from $V$ to $W$ |
| $\ker T$ | Kernel (null space) of $T$ |
| $\operatorname{im} T$ | Image (range) of $T$ |
| $\dim \ker T$ | Dimension of $\ker T$ (the nullity) |
| $\dim \operatorname{im} T$ | Dimension of $\operatorname{im} T$ (the rank) |
| $[T]_{\mathcal{B}}^{\mathcal{C}}$ | Matrix of $T$ from basis $\mathcal{B}$ to basis $\mathcal{C}$ |
| $P_{\mathcal{B}\to\mathcal{C}}$ | Change-of-basis matrix |
| $T^*$ | Dual (transpose) map |
| $V^*$ | Dual space of $V$ |
| $Df(a)$ | Total derivative (Fréchet derivative) of $f$ at $a$ |
| $J_f(a)$ | Jacobian matrix of $f$ at $a$ |
| $\mathcal{L}(V, W)$ | Space of all linear maps from $V$ to $W$ |
| $V \cong W$ | $V$ and $W$ are isomorphic |
| $V/U$ | Quotient space of $V$ modulo subspace $U$ |
| $A \sim B$ | $A$ and $B$ are similar matrices ($B = P^{-1}AP$) |
| $P^2 = P$ | Projection (idempotent) |
| $Q^\top Q = I$ | Orthogonal matrix |
| $x \mapsto Ax + b$ | Affine map |
| $W + BA$ | LoRA low-rank update, $\operatorname{rank}(BA) \le r$ |
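The last row is worth making concrete. A minimal NumPy sketch of the LoRA update (the sizes $d$, $k$, and rank $r$ here are illustrative, not from the lesson; in practice one factor is initialized to zero so training starts exactly at $W$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a (d x k) frozen weight adapted with rank-r factors.
d, k, r = 64, 32, 4
W = rng.normal(size=(d, k))   # frozen pretrained weight
B = rng.normal(size=(d, r))   # trainable LoRA factor
A = rng.normal(size=(r, k))   # trainable LoRA factor

delta = B @ A                 # the low-rank update W + BA from the table
W_adapted = W + delta

# The update can never exceed rank r, whatever A and B are trained to.
print(np.linalg.matrix_rank(delta))  # 4 here (and always <= r)
```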
Appendix O: Linear Maps and Symmetry
O.1 Equivariant Maps
A linear map $T: V \to W$ is equivariant with respect to a group $G$ if, for every $g \in G$ and every $v \in V$:
$$T(\rho_V(g)\,v) = \rho_W(g)\,T(v),$$
where $\rho_V$ and $\rho_W$ are representations of $G$ on $V$ and $W$.
Intuitively: "applying the group action then the map = applying the map then the group action." The map commutes with the symmetry.
Examples:
- Translation equivariance: $T(v + a) = T(v) + T(a)$... but this is just additivity - every linear map is equivariant to the translation group acting on vector spaces.
- Rotation equivariance: $TR = RT$ for all rotations $R$. In 3D: $TR = RT$ for all $R \in SO(3)$ implies $T = \lambda I$ for some scalar $\lambda$ (Schur's lemma for the rotation representation).
- Permutation equivariance: $TP = PT$ for all permutation matrices $P$. Implies $T = \alpha I + \beta \mathbf{1}\mathbf{1}^\top$ - a sum of a "same-position" term and a "mean-field" term - this is why mean pooling and attention with tied weights are permutation equivariant.
For AI: CNNs achieve translation equivariance by using convolutional (shared-weight) linear maps. Equivariant graph neural networks use permutation-equivariant maps. Geometric deep learning is the systematic study of building neural networks as compositions of equivariant linear maps. Transformer attention (without positional encoding) is permutation equivariant - adding positional encodings explicitly breaks this symmetry.
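As a sanity check on the permutation case, here is a short NumPy sketch (the choices of $\alpha$, $\beta$, and dimension $n$ are ours) verifying that a map of the form $\alpha I + \beta \mathbf{1}\mathbf{1}^\top$ commutes with permutations, and that the mean-field part is exactly mean pooling broadcast back to every position:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Maps commuting with all permutations have the form alpha*I + beta*(1 1^T).
alpha, beta = 2.0, 0.5
T = alpha * np.eye(n) + beta * np.ones((n, n))

# A random permutation matrix P (rows of the identity, permuted).
P = np.eye(n)[rng.permutation(n)]

# Equivariance: permuting then mapping equals mapping then permuting.
assert np.allclose(T @ P, P @ T)

# The "mean-field" part alone is mean pooling broadcast to every position,
# which is itself permutation equivariant.
mean_pool = np.ones((n, n)) / n
x = rng.normal(size=n)
assert np.allclose(mean_pool @ (P @ x), P @ (mean_pool @ x))
```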
O.2 Schur's Lemma and Irreducible Representations
Schur's Lemma. Let $\rho$ be an irreducible complex representation of a group $G$ on $V$, and let $T: V \to V$ be a linear map that commutes with all maps in the representation (i.e., $T\rho(g) = \rho(g)T$ for all $g \in G$). Then $T = \lambda I$ for some scalar $\lambda$.
This powerful result says: the only linear maps that commute with all symmetries of an irreducible representation are scalar multiples of the identity. This constrains the form of equivariant maps.
Application: If attention weights must be equivariant with respect to a symmetry group acting on the heads, Schur's lemma constrains the possible attention patterns.
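A numerical way to see the lemma in action: group-averaging $RTR^\top$ over random rotations projects $T$ onto the commutant of $SO(3)$, which Schur's lemma says consists of scalar matrices. This sketch (the sampling scheme and sample count are our own choices) should converge to $(\operatorname{tr} T / 3)\, I$:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation():
    """Haar-random element of SO(3) via QR of a Gaussian matrix."""
    Q, R = np.linalg.qr(rng.normal(size=(3, 3)))
    Q = Q @ np.diag(np.sign(np.diag(R)))  # sign fix for uniformity on O(3)
    if np.linalg.det(Q) < 0:              # flip into SO(3) if needed
        Q[:, 0] *= -1
    return Q

T = rng.normal(size=(3, 3))

# Project T onto the commutant of SO(3) by group averaging.
n_samples = 20000
avg = np.zeros((3, 3))
for _ in range(n_samples):
    R = random_rotation()
    avg += R @ T @ R.T
avg /= n_samples

print(np.round(avg, 2))      # approximately a scalar multiple of I
print(np.trace(T) / 3)       # the scalar Schur's lemma predicts
```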
O.3 Representation Theory Preview
Representation theory studies how groups act on vector spaces via linear maps. Every group representation is a group homomorphism - a map $\rho: G \to GL(V)$ that takes group elements to invertible linear maps, preserving the group structure:
$$\rho(gh) = \rho(g)\,\rho(h) \quad \text{for all } g, h \in G.$$
This is the language in which equivariant neural networks (E(3)-equivariant networks for molecular property prediction, SE(3)-equivariant networks for robotics, permutation-equivariant networks for sets) are designed. The "weights" of an equivariant linear layer are constrained to be equivariant - and representation theory tells you exactly what form these weights can take.
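As a concrete instance, here is a short check (our own toy example) that the permutation representation of $S_3$ on $\mathbb{R}^3$ really is a homomorphism into invertible matrices:

```python
import numpy as np
from itertools import permutations

def rho(sigma):
    """Permutation representation: rho(sigma) sends e_i to e_{sigma(i)}."""
    n = len(sigma)
    M = np.zeros((n, n))
    for i, j in enumerate(sigma):
        M[j, i] = 1.0
    return M

def compose(sigma, tau):
    """Group operation: apply tau first, then sigma."""
    return tuple(sigma[tau[i]] for i in range(len(tau)))

group = list(permutations(range(3)))  # all 6 elements of S_3
for s in group:
    for t in group:
        # Homomorphism property: rho(st) = rho(s) rho(t).
        assert np.allclose(rho(compose(s, t)), rho(s) @ rho(t))
print("rho(st) = rho(s) rho(t) for all s, t in S_3")
```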
Appendix P: Quick Reference - Common Linear Maps in $\mathbb{R}^2$ and $\mathbb{R}^3$
Common Linear Maps in $\mathbb{R}^2$
| Transformation | Matrix | Properties |
|---|---|---|
| Rotation by $\theta$ | $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ | Orthogonal, $\det = 1$, eigenvalues $e^{\pm i\theta}$ |
| Reflection across $x$-axis | $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ | Symmetric, $\det = -1$, eigenvalues $1, -1$ |
| Reflection across $y = x$ | $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ | Symmetric, $\det = -1$, eigenvalues $1, -1$ |
| Horizontal shear by $k$ | $\begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$ | $\det = 1$, eigenvalue $1$ (double) |
| Scaling by $s_x, s_y$ | $\begin{pmatrix} s_x & 0 \\ 0 & s_y \end{pmatrix}$ | Symmetric, $\det = s_x s_y$, eigenvalues $s_x, s_y$ |
| Projection onto $x$-axis | $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ | Symmetric, idempotent, $\det = 0$, eigenvalues $1, 0$ |
| Projection onto $y = x$ | $\frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ | Symmetric, idempotent, eigenvalues $1, 0$ |
| Zero map | $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ | Rank 0, $\ker = \mathbb{R}^2$, $\operatorname{im} = \{0\}$ |
| Identity | $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ | All eigenvalues 1, $\det = 1$ |
All these are linear maps. To make them affine (include translation), work in homogeneous coordinates: embed the $2 \times 2$ matrix in a $3 \times 3$ matrix whose extra column holds the translation vector and whose extra row is $(0\ 0\ 1)$.
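A minimal sketch of the homogeneous-coordinate trick, using the rotation from the table plus a translation (the specific angle and offset are arbitrary):

```python
import numpy as np

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([3.0, 1.0])  # translation vector

# Homogeneous form: 2x2 linear part plus translation in one 3x3 matrix
# acting on (x, y, 1).
M = np.eye(3)
M[:2, :2] = R
M[:2, 2] = t

x = np.array([1.0, 0.0])
x_h = np.append(x, 1.0)                      # homogeneous coordinates
assert np.allclose((M @ x_h)[:2], R @ x + t) # matrix product = rotate-then-shift

# Sanity checks from the table: rotations are orthogonal with det = 1.
assert np.allclose(R.T @ R, np.eye(2)) and np.isclose(np.linalg.det(R), 1.0)
```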
Common Linear Maps in $\mathbb{R}^3$
| Transformation | Description | Key Properties |
|---|---|---|
| Rotation around $z$-axis | $R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$: rotates the $xy$-plane, fixes the $z$-axis | Orthogonal, $\det = 1$ |
| Reflection across $xy$-plane | $\operatorname{diag}(1, 1, -1)$ | Symmetric, $\det = -1$ |
| Projection onto $xy$-plane | $\operatorname{diag}(1, 1, 0)$ | Symmetric, idempotent, rank 2 |
| Householder reflection | $H = I - 2uu^\top$ with $\lVert u \rVert = 1$ | Symmetric, $H^2 = I$, eigenvalues $1$ (mult. 2) and $-1$ |
| Scaling | $\operatorname{diag}(s_x, s_y, s_z)$ | Diagonal; eigenvalues are $s_x, s_y, s_z$ |
| Shear | $\begin{pmatrix} 1 & k & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ ($k \neq 0$) | $\det = 1$; all eigenvalues 1 |
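A quick numerical check of the Householder row of this table (the unit vector $u$ is random; NumPy only):

```python
import numpy as np

rng = np.random.default_rng(0)

# Householder reflection H = I - 2 u u^T for a unit vector u.
u = rng.normal(size=3)
u /= np.linalg.norm(u)
H = np.eye(3) - 2.0 * np.outer(u, u)

assert np.allclose(H, H.T)                  # symmetric
assert np.allclose(H @ H, np.eye(3))        # H^2 = I, hence orthogonal
assert np.isclose(np.linalg.det(H), -1.0)   # det = -1

print(np.linalg.eigvalsh(H))                # [-1., 1., 1.]: 1 has mult. 2
```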
End of Linear Transformations section. Continue to 05: Orthogonality and Orthonormality.