
Spectral Graph Theory, Part 3: 14. Exercises to 16. Conceptual Bridge

14. Exercises

Exercise 1 * - Laplacian Construction and Properties

For the following graph $G$: vertices $\{1, 2, 3, 4\}$, edges $\{(1,2), (2,3), (3,4), (4,1), (1,3)\}$ (an unweighted undirected graph):

(a) Write out $A$, $D$, and $L = D - A$.
(b) Verify $\mathbf{x}^\top L \mathbf{x} = \sum_{(i,j)\in E}(x_i - x_j)^2$ for $\mathbf{x} = [1, 2, 0, -1]^\top$.
(c) Compute all eigenvalues of $L$. How many connected components does $G$ have?
(d) Compute $L_{\text{sym}}$ and verify its eigenvalues lie in $[0, 2]$.
(e) For AI: Which eigenvector of $L$ would be used for spectral bisection? What partition does it suggest?
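A minimal NumPy sketch of parts (a)-(c): build $A$, $D$, and $L$, check the quadratic-form identity for the given $\mathbf{x}$, and read connectivity off the spectrum (variable names are illustrative):

```python
import numpy as np

# Adjacency matrix for vertices {1,2,3,4} with edges (1,2),(2,3),(3,4),(4,1),(1,3)
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i - 1, j - 1] = A[j - 1, i - 1] = 1

D = np.diag(A.sum(axis=1))   # degree matrix
L = D - A                    # combinatorial Laplacian

# (b) Verify the quadratic form x^T L x = sum over edges of (x_i - x_j)^2
x = np.array([1.0, 2.0, 0.0, -1.0])
quad = x @ L @ x
edge_sum = sum((x[i - 1] - x[j - 1]) ** 2 for i, j in edges)
assert np.isclose(quad, edge_sum)

# (c) Eigenvalues: the multiplicity of 0 equals the number of connected components
eigvals = np.linalg.eigvalsh(L)
print(np.round(eigvals, 4))
```

For this graph both sides of the quadratic form evaluate to 11, and exactly one eigenvalue is 0, so $G$ is connected.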


Exercise 2 * - Spectrum of Special Graphs

(a) Derive the eigenvalues of $L$ for the cycle graph $C_5$ (5 vertices in a cycle). Show all work.
(b) Compute $\lambda_2(C_5)$. Is the cycle more or less connected (in the algebraic sense) than the path $P_5$? Use the formula for $\lambda_2(P_n)$.
(c) For the complete graph $K_n$: prove that $\lambda_2(L_{K_n}) = n$ and that $\lambda_2, \ldots, \lambda_n$ are all equal.
(d) For a star graph $S_n$ (one hub, $n-1$ leaves): find all eigenvalues and explain geometrically why $\lambda_2 = 1$ regardless of $n$.
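Parts (a)-(b) can be checked numerically against the closed form for the cycle, $\lambda_k = 2 - 2\cos(2\pi k/n)$, which follows from the circulant structure of $L(C_n)$. A sketch:

```python
import numpy as np

def cycle_laplacian(n):
    """Laplacian of the n-cycle."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
    return np.diag(A.sum(axis=1)) - A

n = 5
numeric = np.sort(np.linalg.eigvalsh(cycle_laplacian(n)))
# Closed form: eigenvalues of L(C_n) are 2 - 2cos(2*pi*k/n), k = 0..n-1
analytic = np.sort(2 - 2 * np.cos(2 * np.pi * np.arange(n) / n))
assert np.allclose(numeric, analytic)

lam2_cycle = numeric[1]                   # lambda_2(C_5) = 2 - 2cos(2pi/5)
lam2_path = 2 - 2 * np.cos(np.pi / n)     # formula lambda_2(P_n) = 2 - 2cos(pi/n)
print(lam2_cycle, lam2_path)  # the cycle is better connected: larger lambda_2
```

Numerically $\lambda_2(C_5) \approx 1.38$ versus $\lambda_2(P_5) \approx 0.38$, matching the intuition that closing the path into a cycle adds connectivity.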


Exercise 3 * - Fiedler Vector and Graph Bisection

Given a barbell graph: two cliques $K_5$ connected by a single bridge edge $(u, v)$:

(a) Describe qualitatively what the Fiedler vector looks like without computing it. Which vertices get positive values? Negative?
(b) Implement the graph in NumPy, compute $L$, and find $\mathbf{u}_2$ using scipy.linalg.eigh. Plot the Fiedler vector values at each vertex.
(c) Use the Fiedler vector to perform spectral bisection. What is $|E(S, \bar{S})|$ for the resulting cut?
(d) What is $\lambda_2$ for this graph? Is it close to 0? What does this say about the graph's connectivity?
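A sketch of parts (b)-(c), assuming vertices 0-4 and 5-9 form the two cliques and (4, 5) is the bridge:

```python
import numpy as np
from scipy.linalg import eigh

# Barbell: two K5 cliques (vertices 0-4 and 5-9) joined by bridge edge (4, 5)
n = 10
A = np.zeros((n, n))
for block in (range(0, 5), range(5, 10)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1
A[4, 5] = A[5, 4] = 1

L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = eigh(L)      # ascending eigenvalues
fiedler = eigvecs[:, 1]         # Fiedler vector u_2

# Spectral bisection: split by the sign of the Fiedler vector.
# One clique gets negative values, the other positive; the cut is the bridge.
S = np.where(fiedler < 0)[0]
print(np.round(fiedler, 3), S, eigvals[1])
```

The sign pattern recovers the two cliques exactly, the resulting cut has $|E(S, \bar{S})| = 1$, and $\lambda_2$ is small, reflecting the bottleneck at the bridge.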


Exercise 4 ** - Cheeger Inequality Verification

For the path graph $P_{10}$ (10 vertices in a line):

(a) Compute $\lambda_2(L_{\text{sym}})$ analytically using the known formula.
(b) Find the Cheeger constant $h(P_{10})$ by enumerating the optimal cut. (Hint: by symmetry, the optimal cut is in the middle.)
(c) Verify that Cheeger's inequality $\lambda_2/2 \leq h \leq \sqrt{2\lambda_2}$ holds. How tight are the bounds?
(d) Implement the Fiedler vector sweep algorithm to find a cut with conductance $\leq \sqrt{2\lambda_2}$.
(e) For AI: If $P_{10}$ were the attention graph of a 10-token sequence, what does the Cheeger constant tell you about information flow?
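Parts (b)-(d) can be combined into one sweep over prefix cuts: on a path the Fiedler vector is monotone along the line, so the sweep cuts are exactly the prefixes. A sketch:

```python
import numpy as np

# Path graph P10 and its normalized Laplacian L_sym = I - D^{-1/2} A D^{-1/2}
n = 10
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1
deg = A.sum(axis=1)
D_inv_sqrt = np.diag(1 / np.sqrt(deg))
L_sym = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt
lam2 = np.sort(np.linalg.eigvalsh(L_sym))[1]

def conductance(S):
    """phi(S) = cut(S) / min(vol(S), vol(complement))."""
    S = set(S)
    cut = sum(1 for i in range(n) for j in range(i + 1, n)
              if A[i, j] and ((i in S) != (j in S)))
    vol_S = sum(deg[i] for i in S)
    return cut / min(vol_S, deg.sum() - vol_S)

# Sweep over prefix cuts; by symmetry the optimum is the middle cut
h = min(conductance(range(k)) for k in range(1, n))
assert lam2 / 2 <= h <= np.sqrt(2 * lam2)   # Cheeger's inequality
print(lam2, h)
```

The middle cut severs one edge with volume 9 on each side, so $h(P_{10}) = 1/9$, and the inequality holds with room to spare on both sides.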


Exercise 5 ** - Graph Fourier Transform

Define a "community signal" $\mathbf{x}$ on the karate club graph (Zachary, 1977; available in NetworkX) where $x_i = +1$ if node $i$ is in community 1 and $x_i = -1$ otherwise.

(a) Compute the GFT $\hat{\mathbf{x}} = U^\top \mathbf{x}$ of the community signal.
(b) Plot $|\hat{x}_k|^2$ vs. $\lambda_k$. Is the community signal concentrated in low or high frequencies?
(c) Define a "noisy" signal $\mathbf{x}' = \mathbf{x} + 0.5\boldsymbol{\eta}$ where $\boldsymbol{\eta}$ is Gaussian noise. Apply a low-pass filter $g(\lambda) = \mathbf{1}[\lambda \leq \lambda_5]$ to $\mathbf{x}'$ (keep only the first 5 frequency components).
(d) Compare the filtered signal to the true community signal. What fraction of nodes are correctly assigned?
(e) How does this connect to label propagation in semi-supervised learning?
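The pipeline for parts (a), (c), and (d) can be sketched without NetworkX by using a small two-community random graph as a stand-in for the karate club (the 0.8/0.05 edge probabilities and the seed are illustrative choices, not part of the exercise):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the karate club: two dense 10-node communities, sparsely linked
# (swap in nx.karate_club_graph() if NetworkX is available)
n = 20
A = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        same = (i < 10) == (j < 10)
        if rng.random() < (0.8 if same else 0.05):
            A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A
eigvals, U = np.linalg.eigh(L)   # columns of U = graph Fourier basis

x = np.where(np.arange(n) < 10, 1.0, -1.0)   # community signal
x_hat = U.T @ x                               # (a) graph Fourier transform

# (c) Low-pass filter the noisy signal: keep the first 5 frequency components
noisy = x + 0.5 * rng.standard_normal(n)
k = 5
denoised = U[:, :k] @ (U.T @ noisy)[:k]

# (d) Fraction of nodes whose sign is recovered correctly
acc = np.mean(np.sign(denoised) == x)
print(acc)
```

Because the community signal is smooth across each community, its energy concentrates in the low-$\lambda$ components, which is exactly why low-pass filtering denoises it.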


Exercise 6 ** - Spectral Clustering

Generate a synthetic graph with 3 communities using the Stochastic Block Model:

  • 3 blocks of 50 nodes each
  • Intra-block edge probability $p_{\text{in}} = 0.3$, inter-block $p_{\text{out}} = 0.02$

(a) Compute the unnormalized Laplacian $L$. Plot the first 6 eigenvalues. Where is the largest eigengap?
(b) Implement the NJW (Ng-Jordan-Weiss) spectral clustering algorithm for $k=3$.
(c) Compute the accuracy of the spectral clustering (comparing to the known ground-truth communities, accounting for label permutations).
(d) Repeat with $p_{\text{out}} = 0.15$ (near the phase transition). How does the clustering accuracy degrade?
(e) For AI: How does the eigengap heuristic perform? Plot accuracy vs. $k$ to verify the correct number of clusters is detected.
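A compact NJW sketch for parts (b)-(c): eigenvectors of $L_{\text{sym}}$, row normalization, then k-means. The RNG seeds and SciPy's `kmeans2` are implementation choices, not prescribed by the exercise:

```python
import numpy as np
from itertools import permutations
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(1)

# Stochastic Block Model: 3 blocks of 50 nodes, p_in = 0.3, p_out = 0.02
labels = np.repeat(np.arange(3), 50)
n = 150
A = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        p = 0.3 if labels[i] == labels[j] else 0.02
        if rng.random() < p:
            A[i, j] = A[j, i] = 1

# NJW: smallest-k eigenvectors of L_sym, row-normalize, then k-means
deg = A.sum(axis=1)
L_sym = np.eye(n) - A / np.sqrt(np.outer(deg, deg))
_, U = np.linalg.eigh(L_sym)
V = U[:, :3]
V = V / np.linalg.norm(V, axis=1, keepdims=True)
_, pred = kmeans2(V, 3, minit='++', seed=2)

# Accuracy up to permutation of the cluster labels
acc = max(np.mean(labels == np.array(perm)[pred])
          for perm in permutations(range(3)))
print(acc)
```

At this $p_{\text{in}}/p_{\text{out}}$ separation the embedding is nearly perfectly clustered; re-running with $p_{\text{out}} = 0.15$ shows the degradation asked for in part (d).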


Exercise 7 *** - Laplacian Positional Encodings

Implement Laplacian Positional Encodings (LapPE) and test them on a simple graph classification task:

(a) For each graph in a small graph dataset (or a synthetic set with 3 classes: path, cycle, star variants), compute the top-$k$ Laplacian eigenvectors as node features.
(b) Handle sign ambiguity by randomly flipping the sign of each eigenvector at each forward pass (as in Dwivedi et al., 2022). Show that a model trained this way is sign-invariant.
(c) Implement RWPE as an alternative: $\text{RWPE}(v)_j = (P^j)_{vv}$ for $j = 1, \ldots, k$. Compare LapPE and RWPE on the classification task.
(d) Explain theoretically why RWPE avoids the sign ambiguity problem while LapPE does not.
(e) For AI: In GPT-style attention, can you use RWPE to give the model a "graph-aware" positional encoding? What would this enable for graph reasoning tasks?
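RWPE from part (c) needs only powers of the random-walk matrix, so a few lines suffice. The 4-cycle example below also illustrates part (d): $P^j$ is a uniquely determined matrix, unlike an eigenvector that is only defined up to sign.

```python
import numpy as np

def rwpe(A, k):
    """Random-walk positional encoding: column j holds (P^(j+1))_{vv},
    i.e. RWPE(v)_j = (P^j)_{vv} for j = 1..k, with P = D^{-1} A."""
    P = A / A.sum(axis=1, keepdims=True)
    feats = np.empty((A.shape[0], k))
    Pj = np.eye(A.shape[0])
    for j in range(k):
        Pj = Pj @ P
        feats[:, j] = np.diag(Pj)   # return probabilities after j+1 steps
    return feats

# Example: the 4-cycle; by symmetry every node gets identical features
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
pe = rwpe(A, 4)
print(pe)
```

Odd-step return probabilities vanish (the cycle is bipartite) while even steps are nonzero, so RWPE also encodes structural facts like bipartiteness for free.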


Exercise 8 *** - PageRank and Spectral Analysis

Construct a small directed graph representing a citation network (10 papers, edges from citing paper to cited paper):

(a) Implement power iteration to compute the PageRank vector $\boldsymbol{\pi}$ with damping factor $\alpha = 0.85$. Verify convergence.
(b) Compute the dominant eigenvalue and eigenvector of the Google matrix $G = \alpha P + (1-\alpha)\mathbf{1}\mathbf{1}^\top/n$ directly via scipy.linalg.eig. Compare to the power iteration result.
(c) Add a "dangling node" (a paper with no outgoing citations). How does this affect the Google matrix? How is it handled in practice?
(d) Compute the mixing time: how many iterations of power iteration are needed to achieve $\lVert \boldsymbol{\pi}^{(t)} - \boldsymbol{\pi}^* \rVert < 10^{-6}$? How does this relate to $\alpha$?
(e) For AI: In RLHF, model responses can be ranked using a directed preference graph. Implement PageRank-based ranking on a set of 5 responses with pairwise preference comparisons. How does it compare to simple win-count ranking?
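A sketch of parts (a) and (c): power iteration with the standard uniform-row fix for dangling nodes. The toy citation edges are illustrative, not a prescribed dataset:

```python
import numpy as np

def pagerank(A, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration on the Google matrix G = alpha*P + (1-alpha)*J/n.

    A[i, j] = 1 means an edge i -> j. Dangling rows (no out-links) are
    replaced by the uniform distribution, the standard practical fix.
    """
    n = A.shape[0]
    out = A.sum(axis=1)
    P = np.where(out[:, None] > 0,
                 A / np.maximum(out, 1)[:, None],  # normalize rows with out-links
                 1 / n)                            # dangling row -> uniform
    pi = np.full(n, 1 / n)
    for _ in range(max_iter):
        new = alpha * (pi @ P) + (1 - alpha) / n
        if np.abs(new - pi).sum() < tol:
            return new
        pi = new
    return pi

# Toy citation graph: paper i cites paper j  =>  edge i -> j.
# Paper 2 has no outgoing citations: a dangling node.
A = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 2), (4, 2), (4, 1)]:
    A[i, j] = 1
pi = pagerank(A)
print(np.round(pi, 4))   # paper 2, the most-cited, scores highest
```

The stationary vector sums to 1, and the dangling node (paper 2) poses no problem once its row is made uniform; comparing this `pi` to the dominant eigenvector of $G$ is part (b).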


15. Why This Matters for AI (2026 Perspective)

  • Graph Laplacian spectrum: Foundation of Graph Convolutional Networks (Kipf & Welling, 2017); a GCN layer is a first-order Chebyshev filter; used in every graph-based ML system
  • Fiedler vector / spectral bisection: Graph partitioning for distributed training (model parallel + pipeline parallel); partition the computation graph of a large model across devices
  • Cheeger inequality: Quantifies the over-smoothing rate in deep GNNs; expanders over-smooth fastest; guides choice of GNN depth and skip-connection design
  • Spectral clustering: Gold standard for community detection in social networks, knowledge graphs, citation networks; used in data curation for LLM pretraining
  • Graph Fourier Transform: Spectral convolution -> polynomial approximation -> spatial GNN: the entire GNN derivation hierarchy is a spectral story; ChebNet (Defferrard et al., 2016)
  • Laplacian Positional Encodings: LapPE in GPS (Rampasek et al., 2022); RWPE in graph Transformers; enables graph Transformers to be position-aware without hardcoded sequence order
  • Random walk mixing: RWPE computation; node2vec walks; GraphSAGE neighborhood sampling; mixing time determines the walk length required for meaningful embeddings
  • Heat kernel / diffusion: Graph diffusion networks (Klicpera et al., 2019); APPNP; diffusion-based denoising on knowledge graphs; personalized PageRank for neighborhood aggregation
  • Spectral sparsification: Fast GNN training on large graphs: sparsify the graph while preserving spectral properties; used in GraphSAINT, ClusterGCN
  • PageRank: RLHF preference aggregation; importance weighting in retrieval-augmented generation; entity importance in knowledge graphs
  • Matrix-Tree theorem: Spanning tree sampling for graph augmentation in self-supervised GNN training; tree-structured attention in structured state space models
  • Directed graph spectra: Attention head analysis in mechanistic interpretability (Olsson et al., 2022); causal graph structure in causal LLMs; knowledge graph relation asymmetry

16. Conceptual Bridge

Where we came from. This section builds directly on three pillars:

  • 11-01 Graph Basics provided the combinatorial vocabulary: vertices, edges, paths, connectivity, bipartiteness. These definitions are the domain of spectral graph theory.
  • 11-02 Graph Representations introduced the adjacency matrix and Laplacian as data structures. We now treat them as linear operators with rich algebraic structure.
  • 03 Advanced Linear Algebra (eigenvalues, spectral theorem, PSD matrices) gave us the mathematical machinery. Spectral graph theory is linear algebra applied to graphs.

What this section proved. Starting from the simple definition $L = D - A$, we established:

  1. $L \succeq 0$ (proved via the quadratic form $\mathbf{x}^\top L \mathbf{x} = \sum_{(i,j)\in E}(x_i - x_j)^2$)
  2. The null space of $L$ encodes connected components
  3. $\lambda_2$ quantifies connectivity (Fiedler, 1973)
  4. $\lambda_2$ is tightly related to the minimum normalized cut (Cheeger, 1970; Alon-Milman, 1985)
  5. Eigenvectors of $L$ form a natural Fourier basis for graph signals
  6. Spectral clustering is the continuous relaxation of NP-hard graph partitioning
  7. The GCN layer is a first-order spectral filter (Kipf & Welling, 2017)

Where we are going. Two sections lie ahead:

11-05 Graph Neural Networks (immediate next): The spectral foundation developed here - GCN derivation, over-smoothing as diffusion, spectral filters - motivates the full GNN architecture zoo. The MPNN framework, GAT attention, GraphSAGE induction, and over-smoothing mitigations are all seen more clearly through the spectral lens.

12 Functional Analysis (next chapter): The spectral theory of the discrete graph Laplacian is a special case of the spectral theory of self-adjoint operators on Hilbert spaces. The Laplace-Beltrami operator on Riemannian manifolds is the continuous limit of the graph Laplacian. Kernel methods, Mercer's theorem, and Reproducing Kernel Hilbert Spaces (RKHS) generalize what we built here to infinite-dimensional settings.

SPECTRAL GRAPH THEORY IN THE CURRICULUM
============================================================================

  Chapter 02-03           Chapter 11                Chapter 12
  Linear Algebra          Graph Theory              Functional Analysis
  -------------           ------------              --------------------
  Eigenvalues     ------> 04 Spectral       ------> Laplace-Beltrami
  PSD matrices            Graph Theory               operator
  Spectral                     |                    Spectral measure
  theorem                      |                    RKHS
                               |
                    +----------+----------+
                    v                     v
               11-05 GNNs          22 Causal
               GCN, GAT             Inference
               GraphSAGE            Causal DAGs
               MPNN                 d-separation

  KEY RESULTS IN 04:
  ---------------------------------------------------------------------
  L = D - A >= 0 (PSD)    (proved via quadratic form)
  ker(L) = span of        (connected components theorem)
  component indicators
  Cheeger: lambda_2/2 <= h <= sqrt(2*lambda_2)   (connectivity <-> eigenvalue)
  GFT: x^ = U^T x         (graph Fourier transform)
  GCN = 1st-order         (from ChebNet K=1, lambda_max ~= 2)
  Chebyshev filter

============================================================================

The unifying theme. Spectral graph theory teaches a single lesson: linear algebraic structure encodes combinatorial structure. The eigenvalues of a matrix you can compute in O(n3)O(n^3) reveal properties of the graph that are NP-hard to compute directly. This is the power of the spectral approach, and it is why spectral methods remain foundational even as spatial GNNs dominate in practice - the theory explains why the practice works.

