Spectral Graph Theory: Parts 14 (Exercises) through 16 (Conceptual Bridge)
14. Exercises
Exercise 1 * - Laplacian Construction and Properties
For the following unweighted, undirected graph $G = (V, E)$:
(a) Write out $A$, $D$, and $L = D - A$. (b) Verify $x^\top L x = \sum_{(i,j) \in E} (x_i - x_j)^2$ for a test vector $x$. (c) Compute all eigenvalues of $L$. How many connected components does $G$ have? (d) Compute the normalized Laplacian $\mathcal{L} = D^{-1/2} L D^{-1/2}$ and verify its eigenvalues lie in $[0, 2]$. (e) For AI: Which eigenvector of $L$ would be used for spectral bisection? What partition does it suggest?
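Below is a minimal NumPy/SciPy sketch of parts (a)-(d). The 6-vertex graph used here is purely illustrative (the exercise's own vertex and edge lists are not reproduced above), so treat it as a stand-in:

```python
import numpy as np
from scipy.linalg import eigh

# Hypothetical example graph (not the one specified in the exercise):
# vertices 0-5, edges forming a path 0-1-2-3 plus a separate edge 4-5.
edges = [(0, 1), (1, 2), (2, 3), (4, 5)]
n = 6

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0          # unweighted, undirected

D = np.diag(A.sum(axis=1))           # degree matrix
L = D - A                            # combinatorial Laplacian

# (b) quadratic form check: x^T L x == sum over edges of (x_i - x_j)^2
x = np.random.randn(n)
lhs = x @ L @ x
rhs = sum((x[i] - x[j]) ** 2 for i, j in edges)
assert np.isclose(lhs, rhs)

# (c) eigenvalues of L; multiplicity of 0 = number of connected components
evals, evecs = eigh(L)
n_components = np.sum(np.isclose(evals, 0.0))
print("eigenvalues:", np.round(evals, 4), "components:", n_components)

# (d) normalized Laplacian D^{-1/2} L D^{-1/2}; eigenvalues lie in [0, 2]
d_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(D)))
L_norm = d_inv_sqrt @ L @ d_inv_sqrt
norm_evals = eigh(L_norm, eigvals_only=True)
assert norm_evals.min() >= -1e-10 and norm_evals.max() <= 2 + 1e-10
```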
Exercise 2 * - Spectrum of Special Graphs
(a) Derive the eigenvalues of $L$ for the cycle graph $C_5$ (5 vertices in a cycle). Show all work. (b) Compute $\lambda_2(C_5)$. Is the cycle more or less connected (in the algebraic sense) than the path $P_5$? Use the known formula for $\lambda_2(P_n)$. (c) For the complete graph $K_n$: prove that $\lambda_2 = n$ and that all nonzero eigenvalues are equal. (d) For a star graph $K_{1,k}$ (one hub, $k$ leaves): find all eigenvalues and explain geometrically why $\lambda_2 = 1$ regardless of $k$.
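As a numerical sanity check for (a)-(b), the sketch below evaluates the standard closed forms $\lambda_k(C_n) = 2 - 2\cos(2\pi k/n)$ and $\lambda_2(P_n) = 2(1 - \cos(\pi/n))$ against Laplacians built with NetworkX:

```python
import numpy as np
import networkx as nx

# Closed-form spectra (standard results for the combinatorial Laplacian)
n = 5
cycle_evals = sorted(2 - 2 * np.cos(2 * np.pi * np.arange(n) / n))
path_lambda2 = 2 * (1 - np.cos(np.pi / n))

# Cross-check against explicit Laplacians built with NetworkX
L_cycle = nx.laplacian_matrix(nx.cycle_graph(n)).toarray().astype(float)
print("C5 eigenvalues (formula):", np.round(cycle_evals, 4))
print("C5 eigenvalues (numeric):", np.round(np.linalg.eigvalsh(L_cycle), 4))
print("lambda_2:  C5 =", round(sorted(np.linalg.eigvalsh(L_cycle))[1], 4),
      " P5 =", round(path_lambda2, 4))   # the cycle is better connected
```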
Exercise 3 * - Fiedler Vector and Graph Bisection
Given a barbell graph (two cliques connected by a single bridge edge):
(a) Describe qualitatively what the Fiedler vector looks like without computing it. Which vertices get positive values? Negative?
(b) Implement the graph in NumPy, compute $L$, and find $\lambda_2$ and the Fiedler vector $v_2$ using scipy.linalg.eigh. Plot the Fiedler vector values at each vertex.
(c) Use the Fiedler vector to perform spectral bisection. What is the conductance of the resulting cut?
(d) What is $\lambda_2$ for this graph? Is it close to 0? What does this say about the graph's connectivity?
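A possible starting point for (b)-(c), assuming two 5-vertex cliques joined by a single bridge edge (the clique size is an assumption, not part of the exercise statement):

```python
import numpy as np
import networkx as nx
from scipy.linalg import eigh

# Barbell graph: two K_5 cliques joined by one bridge edge (size is assumed)
G = nx.barbell_graph(5, 0)            # 10 vertices, bridge between 4 and 5
L = nx.laplacian_matrix(G).toarray().astype(float)

evals, evecs = eigh(L)
lambda2, fiedler = evals[1], evecs[:, 1]
print("lambda_2 =", round(lambda2, 4))          # close to 0: a bottleneck

# Spectral bisection: split on the sign of the Fiedler vector
S = {v for v in G.nodes if fiedler[v] < 0}
cut_edges = [(u, v) for u, v in G.edges if (u in S) != (v in S)]
vol_S = sum(dict(G.degree)[v] for v in S)
vol_rest = sum(dict(G.degree)[v] for v in G.nodes if v not in S)
conductance = len(cut_edges) / min(vol_S, vol_rest)
print("cut:", cut_edges, " conductance:", round(conductance, 4))
```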
Exercise 4 ** - Cheeger Inequality Verification
For the path graph $P_{10}$ (10 vertices in a line):
(a) Compute $\lambda_2$ analytically using the known formula $\lambda_2 = 2(1 - \cos(\pi/n))$. (b) Find the Cheeger constant $h(G)$ by enumerating the optimal cut. (Hint: by symmetry, the optimal cut is in the middle.) (c) Verify that Cheeger's inequality $\lambda_2/2 \le h(G) \le \sqrt{2\lambda_2}$ holds. How tight are the bounds? (d) Implement the Fiedler-vector sweep algorithm to find a cut with conductance at most $\sqrt{2\lambda_2}$. (e) For AI: If $P_{10}$ were the attention graph of a 10-token sequence, what does the Cheeger constant tell you about information flow?
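A sketch covering (a)-(d). It uses the convention that a cut's conductance is the number of cut edges divided by the smaller side's volume; adapt it if your course uses a different normalization:

```python
import numpy as np
import networkx as nx
from scipy.linalg import eigh

n = 10
G = nx.path_graph(n)
L = nx.laplacian_matrix(G).toarray().astype(float)
evals, evecs = eigh(L)

lambda2 = evals[1]
print("lambda_2 numeric:", round(lambda2, 6),
      "formula 2(1-cos(pi/n)):", round(2 * (1 - np.cos(np.pi / n)), 6))

def conductance(G, S):
    """Cut edges divided by the smaller side's volume (sum of degrees)."""
    S = set(S)
    cut = sum(1 for u, v in G.edges if (u in S) != (v in S))
    vol_S = sum(d for v, d in G.degree if v in S)
    vol_T = sum(d for v, d in G.degree if v not in S)
    return cut / min(vol_S, vol_T)

# (d) Fiedler sweep: sort vertices by Fiedler value, test every prefix cut
order = np.argsort(evecs[:, 1])
best = min(conductance(G, order[:k]) for k in range(1, n))
print("best sweep conductance:", round(best, 4))
print("Cheeger bounds:", round(lambda2 / 2, 4), "<= h <=",
      round(np.sqrt(2 * lambda2), 4))
```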
Exercise 5 ** - Graph Fourier Transform
Define a "community signal" on the karate club graph (Zachary, 1977; available in NetworkX) where $x_i = +1$ if node $i$ is in community 1 and $x_i = -1$ otherwise.
(a) Compute the GFT $\hat{x} = U^\top x$ of the community signal. (b) Plot $|\hat{x}_k|$ vs. $\lambda_k$. Is the community signal concentrated in low or high frequencies? (c) Define a "noisy" signal $y = x + \varepsilon$ where $\varepsilon$ is Gaussian noise. Apply a low-pass filter to $y$ (keep only the first 5 frequency components). (d) Compare the filtered signal to the true community signal. What fraction of nodes are correctly assigned? (e) How does this connect to label propagation in semi-supervised learning?
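One way to set up (a)-(d), assuming the ground-truth factions are read from the `club` node attribute that NetworkX stores on the karate club graph:

```python
import numpy as np
import networkx as nx
from scipy.linalg import eigh

G = nx.karate_club_graph()
n = G.number_of_nodes()
L = nx.laplacian_matrix(G).toarray().astype(float)
evals, U = eigh(L)                      # columns of U = graph Fourier basis

# Community signal from the stored ground truth: +1 / -1 per faction
x = np.array([1.0 if G.nodes[v]["club"] == "Mr. Hi" else -1.0
              for v in G.nodes])

# (a)-(b) GFT: x_hat = U^T x; energy should concentrate at small lambda_k
x_hat = U.T @ x
print("energy in first 5 frequencies:",
      round(np.sum(x_hat[:5] ** 2) / np.sum(x_hat ** 2), 3))

# (c)-(d) Noisy signal, low-pass filter keeping the first 5 components
rng = np.random.default_rng(0)
y = x + rng.normal(scale=1.0, size=n)
y_hat = U.T @ y
y_hat[5:] = 0.0                          # low-pass: zero out high frequencies
y_filtered = U @ y_hat
accuracy = np.mean(np.sign(y_filtered) == x)
print("fraction of nodes correctly recovered:", round(accuracy, 3))
```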
Exercise 6 ** - Spectral Clustering
Generate a synthetic graph with 3 communities using the Stochastic Block Model:
- 3 blocks of 50 nodes each
- Intra-block edge probability $p_{\text{in}}$, inter-block edge probability $p_{\text{out}}$
(a) Compute the unnormalized Laplacian $L$. Plot the first 6 eigenvalues. Where is the largest eigengap? (b) Implement the NJW (Ng-Jordan-Weiss) spectral clustering algorithm for $k = 3$. (c) Compute the accuracy of the spectral clustering (comparing to the known ground-truth communities, accounting for label permutations). (d) Repeat with $p_{\text{in}}$ closer to $p_{\text{out}}$ (near the detectability phase transition). How does the clustering accuracy degrade? (e) For AI: How does the eigengap heuristic perform? Plot accuracy vs. $k$ to verify the correct number of clusters is detected.
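A compact sketch of (a)-(c). The edge probabilities and the scikit-learn k-means dependency are assumptions for illustration, not values fixed by the exercise:

```python
from itertools import permutations

import numpy as np
import networkx as nx
from scipy.linalg import eigh
from sklearn.cluster import KMeans

# SBM with 3 blocks of 50 nodes; the probabilities below are assumptions.
sizes = [50, 50, 50]
p_in, p_out = 0.3, 0.05
P = [[p_in if i == j else p_out for j in range(3)] for i in range(3)]
G = nx.stochastic_block_model(sizes, P, seed=0)
truth = np.repeat([0, 1, 2], 50)

A = nx.to_numpy_array(G)
d = A.sum(axis=1)
L_sym = np.eye(len(d)) - (A / np.sqrt(d)[:, None]) / np.sqrt(d)[None, :]

# NJW: k smallest eigenvectors of the normalized Laplacian, row-normalize,
# then cluster the rows with k-means.
evals, evecs = eigh(L_sym)
print("first 6 eigenvalues:", np.round(evals[:6], 3))   # look for the eigengap
V = evecs[:, :3]
V = V / np.linalg.norm(V, axis=1, keepdims=True)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(V)

# Accuracy up to label permutation (brute force over the 3! relabelings)
acc = max(np.mean(np.array([p[l] for l in labels]) == truth)
          for p in permutations(range(3)))
print("clustering accuracy:", round(acc, 3))
```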
Exercise 7 *** - Laplacian Positional Encodings
Implement Laplacian Positional Encodings (LapPE) and test them on a simple graph classification task:
(a) For each graph in a small graph dataset (or a synthetic set with 3 classes: path, cycle, star variants), compute the top-$k$ Laplacian eigenvectors as node features. (b) Handle sign ambiguity by randomly flipping the sign of each eigenvector at each forward pass (as in Dwivedi et al., 2022). Show that a model trained this way is sign-invariant. (c) Implement RWPE as an alternative: $p_i = \big[(D^{-1}A)^k\big]_{ii}$ for $k = 1, \dots, K$. Compare LapPE and RWPE on the classification task. (d) Explain theoretically why RWPE avoids the sign-ambiguity problem while LapPE does not. (e) For AI: In GPT-style attention, can you use RWPE to give the model a "graph-aware" positional encoding? What would this enable for graph reasoning tasks?
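A sketch of the two positional encodings themselves (the classification pipeline is left to you). The encoding dimension $k = 4$ is an arbitrary choice:

```python
import numpy as np
import networkx as nx
from scipy.linalg import eigh

def lap_pe(G, k=4, flip_signs=True, rng=None):
    """Laplacian PE: first k non-trivial eigenvectors of L, with random
    sign flips per call (the training-time trick for sign ambiguity)."""
    L = nx.laplacian_matrix(G).toarray().astype(float)
    _, evecs = eigh(L)
    pe = evecs[:, 1:k + 1]                       # skip the constant eigenvector
    if flip_signs:
        rng = rng or np.random.default_rng()
        pe = pe * rng.choice([-1.0, 1.0], size=pe.shape[1])
    return pe

def rw_pe(G, K=4):
    """Random-walk PE: k-step return probabilities diag((D^{-1} A)^k), k=1..K.
    No sign ambiguity, because these are probabilities, not eigenvectors."""
    A = nx.to_numpy_array(G)
    RW = A / A.sum(axis=1, keepdims=True)        # row-stochastic walk matrix
    P, feats = np.eye(len(A)), []
    for _ in range(K):
        P = P @ RW
        feats.append(np.diag(P))
    return np.stack(feats, axis=1)               # shape (n_nodes, K)

G = nx.cycle_graph(6)
print(lap_pe(G).shape, rw_pe(G).shape)           # (6, 4) (6, 4)
```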
Exercise 8 *** - PageRank and Spectral Analysis
Construct a small directed graph representing a citation network (10 papers, edges from citing paper to cited paper):
(a) Implement power iteration to compute the PageRank vector with damping factor $\alpha$. Verify convergence.
(b) Compute the dominant eigenvalue and eigenvector of the Google matrix directly via scipy.linalg.eig. Compare to the power iteration result.
(c) Add a "dangling node" (a paper with no outgoing citations). How does this affect the Google matrix? How is it handled in practice?
(d) Compute the mixing time: how many iterations of power iteration are needed to converge within a fixed tolerance? How does this relate to the magnitude of the second eigenvalue of the Google matrix?
(e) For AI: In RLHF, model responses can be ranked using a directed preference graph. Implement PageRank-based ranking on a set of 5 responses with pairwise preference comparisons. How does it compare to simple win-count ranking?
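A minimal power-iteration sketch for (a) and (c), with an assumed damping factor of 0.85 and a hypothetical toy citation network:

```python
import numpy as np

def pagerank(adj, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on the Google matrix built from a dense adjacency
    matrix (adj[i, j] = 1 means paper i cites paper j). Dangling rows are
    replaced by uniform rows, the standard fix."""
    n = len(adj)
    out_deg = adj.sum(axis=1)
    P = np.where(out_deg[:, None] > 0,
                 adj / np.maximum(out_deg[:, None], 1),
                 1.0 / n)                          # dangling-node handling
    G = alpha * P + (1 - alpha) / n                # Google matrix (row-stochastic)
    pi = np.full(n, 1.0 / n)
    for it in range(max_iter):
        new = pi @ G
        if np.abs(new - pi).sum() < tol:
            return new, it + 1
        pi = new
    return pi, max_iter

# Toy citation network (hypothetical): edge i -> j means paper i cites paper j
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2), (4, 2), (4, 0)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = 1.0
scores, iters = pagerank(A)
print("PageRank:", np.round(scores, 3), "converged in", iters, "iterations")
```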
15. Why This Matters for AI (2026 Perspective)
| Concept | AI/ML Impact |
|---|---|
| Graph Laplacian spectrum | Foundation of Graph Convolutional Networks (Kipf & Welling, 2017); GCN layer = first-order Chebyshev filter; used in every graph-based ML system |
| Fiedler vector / spectral bisection | Graph partitioning for distributed training (model parallel + pipeline parallel); partition the computation graph of a large model across devices |
| Cheeger inequality | Quantifies over-smoothing rate in deep GNNs; expanders over-smooth fastest; guides choice of GNN depth and skip-connection design |
| Spectral clustering | Gold-standard for community detection in social networks, knowledge graphs, citation networks; used in data curation for LLM pretraining |
| Graph Fourier Transform | Spectral convolution -> polynomial approximation -> spatial GNN: the entire GNN derivation hierarchy is a spectral story; ChebNet (Defferrard et al., 2016) |
| Laplacian Positional Encodings | LapPE in GPS (Rampasek et al., 2022); RWPE in graph Transformers; enables graph Transformers to be position-aware without hardcoded sequence order |
| Random walk mixing | RWPE computation; node2vec walks; GraphSAGE neighborhood sampling; mixing time determines required walk length for meaningful embeddings |
| Heat kernel / diffusion | Graph diffusion networks (Klicpera et al., 2019); APPNP; diffusion-based denoising on knowledge graphs; personalized PageRank for neighborhood aggregation |
| Spectral sparsification | Fast GNN training on large graphs: sparsify the graph while preserving spectral properties; used in GraphSAINT, ClusterGCN |
| PageRank | RLHF preference aggregation; importance weighting in retrieval-augmented generation; entity importance in knowledge graphs |
| Matrix-Tree theorem | Spanning tree sampling for graph augmentation in self-supervised GNN training; tree-structured attention in structured state space models |
| Directed graph spectra | Attention head analysis in mechanistic interpretability (Olsson et al., 2022); causal graph structure in causal LLMs; knowledge graph relation asymmetry |
16. Conceptual Bridge
Where we came from. This section builds directly on three pillars:
- 11-01 Graph Basics provided the combinatorial vocabulary: vertices, edges, paths, connectivity, bipartiteness. These are the objects that spectral graph theory studies.
- 11-02 Graph Representations introduced the adjacency matrix and Laplacian as data structures. We now treat them as linear operators with rich algebraic structure.
- 03 Advanced Linear Algebra (eigenvalues, spectral theorem, PSD matrices) gave us the mathematical machinery. Spectral graph theory is linear algebra applied to graphs.
What this section proved. Starting from the simple definition $L = D - A$, we established:
- $L \succeq 0$ (proved via the quadratic form $x^\top L x = \sum_{(i,j) \in E} (x_i - x_j)^2$)
- The null space of $L$ encodes connected components
- $\lambda_2$ quantifies connectivity (Fiedler, 1973)
- $\lambda_2$ is tightly related to the minimum normalized cut (Cheeger, 1970; Alon-Milman, 1985)
- Eigenvectors of $L$ form a natural Fourier basis for graph signals
- Spectral clustering is the continuous relaxation of NP-hard graph partitioning
- The GCN layer is a first-order spectral filter (Kipf & Welling, 2017)
Where we are going. Two sections lie ahead:
11-05 Graph Neural Networks (immediate next): The spectral foundation developed here - GCN derivation, over-smoothing as diffusion, spectral filters - motivates the full GNN architecture zoo. The MPNN framework, GAT attention, GraphSAGE induction, and over-smoothing mitigations are all seen more clearly through the spectral lens.
12 Functional Analysis (next chapter): The spectral theory of the discrete graph Laplacian is a special case of the spectral theory of self-adjoint operators on Hilbert spaces. The Laplace-Beltrami operator on Riemannian manifolds is the continuous limit of the graph Laplacian. Kernel methods, Mercer's theorem, and Reproducing Kernel Hilbert Spaces (RKHS) generalize what we built here to infinite-dimensional settings.
```
SPECTRAL GRAPH THEORY IN THE CURRICULUM
============================================================================

 Chapter 02-03            Chapter 11                 Chapter 12
 Linear Algebra           Graph Theory               Functional Analysis
 --------------           ------------               -------------------
 Eigenvalues    ------>   11-04 Spectral   ------>   Laplace-Beltrami
 PSD matrices             Graph Theory               operator
 Spectral                       |                    Spectral measure
 theorem                        |                    RKHS
                                |
                     +----------+----------+
                     v                     v
               11-05 GNNs             22 Causal
               GCN, GAT               Inference
               GraphSAGE              Causal DAGs
               MPNN                   d-separation

 KEY RESULTS IN 11-04:
 ---------------------------------------------------------------------
 L = D - A \succeq 0                        (proved via quadratic form)
 ker(L) = span of component indicators      (connected components theorem)
 Cheeger: \lambda_2/2 \leq h \leq \sqrt{2\lambda_2}   (connectivity <-> eigenvalue)
 GFT: \hat{x} = U^T x                       (graph Fourier transform)
 GCN = 1st-order Chebyshev filter           (from ChebNet K=1, \lambda_max \approx 2)
============================================================================
```
The unifying theme. Spectral graph theory teaches a single lesson: linear algebraic structure encodes combinatorial structure. The eigenvalues of a matrix that you can compute in polynomial time reveal properties of the graph that are NP-hard to compute directly. This is the power of the spectral approach, and it is why spectral methods remain foundational even as spatial GNNs dominate in practice - the theory explains why the practice works.
<- Back to Graph Theory | <- Previous: Graph Algorithms | Next: Graph Neural Networks ->