Random Graphs: Appendix L: Worked Examples to Appendix Q: Quick Reference Card
Appendix L: Worked Examples
L.1 Computing the Giant Component Fraction
Problem: For $G(n,p)$ with $c = np = 2.5$, find the theoretical fraction $\beta$ of vertices in the giant component.
Solution: We need $\beta$ satisfying $\beta = 1 - e^{-c\beta}$.
Define $f(\beta) = 1 - e^{-c\beta} - \beta$. We need $f(\beta) = 0$ with $\beta > 0$.
- $f(0) = 0$ (trivial solution - in the subcritical regime this would be the only root)
- $f'(\beta) = c e^{-c\beta} - 1$, so $f'(0) = c - 1 = 1.5 > 0$ (slope at 0 is positive)
- $f''(\beta) = -c^2 e^{-c\beta} < 0$, so $f$ is concave and there is exactly one non-trivial root
Newton's method: starting from $\beta_0 = 0.5$, the first step lands at $\beta_1 \approx 1.25 > 1$ (overshoot, since $f'(0.5)$ is small and negative), so a bracketing method is safer.
Using bisection between 0.7 and 0.95, where $f(0.7) > 0 > f(0.95)$:
$\beta \approx 0.8926$.
So $\beta \approx 0.892$ - about 89.2% of vertices are in the giant component.
Interpretation: At average degree 2.5, the network is well into the supercritical regime. The vast majority of vertices are connected in one giant component, leaving only about 10.8% in small satellite components.
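The bisection step above is easy to reproduce. A minimal sketch, assuming only the fixed-point equation $\beta = 1 - e^{-c\beta}$ from the solution:

```python
import math

def giant_component_fraction(c, tol=1e-10):
    """Solve beta = 1 - exp(-c*beta) for the nontrivial root by bisection.

    For c > 1, f(beta) = 1 - exp(-c*beta) - beta is positive just above 0
    and negative at beta = 1, so the root is bracketed in (0, 1).
    """
    if c <= 1:
        return 0.0  # subcritical/critical: only the trivial root beta = 0
    f = lambda b: 1 - math.exp(-c * b) - b
    lo, hi = 1e-6, 1.0  # lo > 0 avoids the trivial root at beta = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta = giant_component_fraction(2.5)  # about 0.8926, i.e. ~89.2% of vertices
```

The bracket can be any interval where $f$ changes sign; $(10^{-6}, 1)$ works for every $c > 1$.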
L.2 Kesten-Stigum Threshold Calculation
Problem: For the 2-block SBM with within- and between-community parameters $a$ and $b$ (edge probabilities $p = a/n$ and $q = b/n$), determine: (a) Is community detection possible? (b) Can we achieve exact recovery?
Solution:
(a) Detection (weak recovery): the Kesten-Stigum threshold requires $(a - b)^2 > 2(a + b)$.
For the given parameters this holds - detection is possible.
Equivalently, the SNR $\lambda = \frac{(a-b)^2}{2(a+b)}$ exceeds 1.
(b) Exact recovery: need $(\sqrt{a} - \sqrt{b})^2 > 2$ (scaled threshold for the logarithmic degree regime $p = a \log n / n$, $q = b \log n / n$).
This also holds for the given parameters.
OK - Exact recovery is achievable.
Interpretation: With $a$ and $b$ comfortably above both thresholds, we have a relatively easy community detection problem - both detection and exact recovery are achievable. Spectral clustering should work well here.
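Both threshold checks fit in a few lines. The parameter values below ($a = 18$, $b = 2$) are hypothetical stand-ins, since the problem's original numbers were not preserved in these notes:

```python
import math

def sbm_thresholds(a, b):
    """Check detection (Kesten-Stigum) and exact-recovery conditions for a
    symmetric 2-block SBM with within/between parameters a and b."""
    snr = (a - b) ** 2 / (2 * (a + b))          # detection possible iff SNR > 1
    exact = (math.sqrt(a) - math.sqrt(b)) ** 2  # exact recovery iff this > 2
    return snr > 1, exact > 2, snr

# Hypothetical parameters for illustration:
detect, exact, snr = sbm_thresholds(18, 2)  # SNR = 256/40 = 6.4
```

With $a = 18$, $b = 2$: SNR $= 6.4 > 1$ and $(\sqrt{18} - \sqrt{2})^2 = 8 > 2$, so both conditions hold.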
L.3 Small-World Parameter Calculation
Problem: Design a WS graph on $n = 282$ nodes that approximates the C. elegans neural connectome: $k = 14$, $C \approx 0.28$, $L \approx 2.65$.
Solution:
Step 1: The initial ring lattice ($\beta = 0$) has $C_0 = \frac{3(k-2)}{4(k-1)} = \frac{36}{52} \approx 0.69$.
Step 2: Target clustering $C = 0.28$. Using $C(\beta) \approx C_0 (1-\beta)^3$: $(1-\beta)^3 = 0.28/0.69 \approx 0.40$, so $\beta \approx 0.26$.
Step 3: Verify the path length at $\beta \approx 0.26$. Using the WS approximation $L \approx \ln n / \ln k$, valid for large $\beta$: $\ln 282 / \ln 14 \approx 2.1$.
This is too small - the approximation breaks down away from the large-$\beta$ limit. Numerically, simulated path lengths at $\beta \approx 0.26$ come out reasonably close to the target $L \approx 2.65$.
Conclusion: WS parameters produce a graph close to C. elegans in both clustering and path length, validating the small-world model.
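Steps 1-2 can be reproduced in a couple of lines; a sketch assuming the $C(\beta) \approx C_0 (1-\beta)^3$ approximation used above (function name is my own):

```python
def ws_rewiring_for_clustering(k, C_target):
    """Invert C(beta) ~ C0 * (1 - beta)^3 for the rewiring probability beta,
    where C0 = 3(k-2) / (4(k-1)) is the ring-lattice clustering coefficient."""
    C0 = 3 * (k - 2) / (4 * (k - 1))
    beta = 1 - (C_target / C0) ** (1 / 3)
    return C0, beta

C0, beta = ws_rewiring_for_clustering(k=14, C_target=0.28)  # C0 ~ 0.69, beta ~ 0.26
```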
L.4 Graphon Estimation from Data
Problem: Given a graph with 100 nodes known to be sampled from a 3-block SBM, estimate the graphon.
Procedure:
- Run spectral clustering with $k = 3$ to get estimated community labels $\hat{z}_1, \dots, \hat{z}_n$.
- Sort vertices by $\hat{z}$: reorder the adjacency matrix so community 1 nodes come first, then community 2, then community 3.
- Estimate block probabilities: for communities $r, s \in \{1, 2, 3\}$, set $\hat{B}_{rs}$ to the number of observed edges between communities $r$ and $s$, divided by the number of possible pairs.
- Construct the step-function graphon $\hat{W}(x, y) = \hat{B}_{\lceil 3x \rceil, \lceil 3y \rceil}$ (piecewise constant on $[0,1]^2$ divided into $3 \times 3$ blocks).
- Evaluate quality: compute the cut distance $\delta_\Box(\hat{W}, W)$ where $W$ is the true graphon. The cut distance converges to 0 as $n \to \infty$.
Error bound: For a $k$-block SBM, the best graphon estimator achieves error of order $\sqrt{k/n}$ in cut distance (minimax optimal up to constants). With $n = 100$ and $k = 3$, the expected cut distance is on the order of $\sqrt{3/100} \approx 0.17$ - substantial estimation uncertainty.
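The block-probability step of the procedure might look as follows; `estimate_block_graphon` and its argument names are illustrative, not from the original notes:

```python
import numpy as np

def estimate_block_graphon(A, z, k):
    """Estimate a step-function graphon from adjacency matrix A (symmetric 0/1)
    and community labels z in {0, ..., k-1}.

    B_hat[r, s] = (# observed edges between communities r and s)
                  / (# possible pairs between them)."""
    B = np.zeros((k, k))
    for r in range(k):
        for s in range(k):
            idx_r = np.where(z == r)[0]
            idx_s = np.where(z == s)[0]
            block = A[np.ix_(idx_r, idx_s)]
            if r == s:
                m = len(idx_r)
                pairs = m * (m - 1)  # diagonal block: ordered within-pairs
            else:
                pairs = len(idx_r) * len(idx_s)
            B[r, s] = block.sum() / pairs if pairs > 0 else 0.0
    return B
```

The estimated graphon is then $\hat{W}(x, y) = \hat{B}_{\lceil kx \rceil, \lceil ky \rceil}$, read off from the matrix returned here.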
Appendix M: Connections to Statistical Physics
M.1 Random Graphs and Spin Glasses
The Stochastic Block Model community detection problem maps onto an Ising-type spin model on the graph:
- Community labels correspond to spin states
- Edges within communities are "ferromagnetic" (prefer aligned spins)
- Edges between communities are "antiferromagnetic" (prefer anti-aligned)
The Kesten-Stigum threshold corresponds to the Nishimori temperature in the disordered Ising model - the temperature at which the magnetization (overlap with ground truth) first becomes nonzero.
Belief propagation = Cavity method: The belief propagation algorithm for community detection in SBM is the Bethe approximation applied to the Ising model. Near the KS threshold, BP is asymptotically optimal - no polynomial-time algorithm can do better.
M.2 Percolation and Statistical Mechanics
Bond percolation on $\mathbb{Z}^2$ is a canonical model in statistical mechanics. Its critical phenomena:
- Order parameter: $\theta(p) = \mathbb{P}(\text{origin lies in an infinite cluster})$
- Critical exponent: $\theta(p) \sim (p - p_c)^{\beta}$ with $\beta = 5/36$ (exact, 2D)
- Correlation length: $\xi(p) \sim |p - p_c|^{-\nu}$ with $\nu = 4/3$ (exact, 2D)
For ER graphs: $\beta(c) \sim 2(c - 1)$ as $c \downarrow 1$, i.e. critical exponent 1 (mean-field universality class, since ER has "infinite dimension").
Conformal invariance (2D percolation): Near criticality, 2D percolation is conformally invariant - the scaling limit is described by SLE (Schramm-Loewner Evolution). This deep connection between combinatorics and complex analysis has no direct ML application yet, but illustrates the richness of the field.
M.3 Free Energy and the Replica Method
The replica method is a non-rigorous but powerful physics technique for computing expectations over random graphs.
For the SBM community detection problem, the free energy (log-partition function) per vertex is
$$f = \lim_{n \to \infty} \frac{1}{n} \mathbb{E} \log Z, \qquad Z = \sum_{\sigma} e^{\mathcal{H}(\sigma)},$$
where the sum runs over community assignments $\sigma$ and $\mathcal{H}(\sigma)$ is the log-likelihood of the assignment $\sigma$ given the observed graph.
The replica calculation predicts the exact phase diagram of the SBM - both the KS threshold and the exact recovery threshold - and was later made rigorous using interpolation methods (Guerra, Talagrand).
ML connection: Free energy calculations in statistical physics are the rigorous foundation of variational inference in machine learning. The Bethe free energy (used in belief propagation) is the physics analog of the ELBO (evidence lower bound) in variational autoencoders.
Appendix N: Extended Exercise Solutions
N.1 Solution Notes for Exercise 1 (Phase Transition)
Key implementation details:
The fixed-point equation $\beta = 1 - e^{-c\beta}$ can be solved numerically via Newton's method:
$$\beta_{t+1} = \beta_t - \frac{f(\beta_t)}{f'(\beta_t)}, \qquad f(\beta) = 1 - e^{-c\beta} - \beta, \quad f'(\beta) = c e^{-c\beta} - 1.$$
Starting from $\beta_0 = 1 - e^{-c}$ (one step of Picard iteration), Newton's method converges in 5-10 iterations for $c > 1$.
For $c \leq 1$: the only fixed point is $\beta = 0$ (return 0). For $c > 1$: there are two fixed points (0 and the positive root); return the positive root.
Expected results table:
| $c$ | Theoretical $\beta$ | Simulated (variance) |
|---|---|---|
| 0.5 | 0 | |
| 0.8 | 0 | |
| 1.0 | 0 (but $n^{2/3}$ scaling) | |
| 1.5 | 0.583 | |
| 2.0 | 0.797 | |
| 2.5 | 0.892 | |
| 3.0 | 0.940 | |
Why $|C_2|$ peaks at criticality: The second-largest component size peaks at $c = 1$ (size $\Theta(n^{2/3})$) because this is where the giant component is just forming - there are many large components competing. For $c > 1$, the giant component absorbs everything, leaving only small satellites.
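The Picard-start Newton iteration described above can be sketched as follows, reproducing the theoretical column of the table:

```python
import math

def beta_newton(c, iters=50):
    """Newton's method for f(beta) = 1 - exp(-c*beta) - beta, started from
    one Picard step beta0 = 1 - exp(-c); returns the positive root for c > 1."""
    if c <= 1:
        return 0.0  # only the trivial fixed point beta = 0
    b = 1 - math.exp(-c)
    for _ in range(iters):
        f = 1 - math.exp(-c * b) - b
        fp = c * math.exp(-c * b) - 1
        b -= f / fp
    return b

for c in (1.5, 2.0, 2.5, 3.0):
    print(c, beta_newton(c))  # matches the theoretical column above
```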
N.2 Solution Notes for Exercise 4 (Community Detection)
Spectral algorithm implementation:
import numpy as np

def spectral_community_detection(A, k=2):
    """Spectral clustering for SBM community detection (two communities)."""
    n = A.shape[0]
    # Compute degree vector and symmetric normalized Laplacian
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-10))  # guard against isolated nodes
    L_sym = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt
    # Eigendecomposition (smallest eigenvalues carry community structure)
    eigvals, eigvecs = np.linalg.eigh(L_sym)
    # Second eigenvector (first non-trivial)
    u2 = eigvecs[:, 1]
    # Threshold at 0 to split into two communities
    labels = (u2 > 0).astype(int)
    return labels

def community_accuracy(labels_est, labels_true):
    """Fraction correct, up to a global label flip."""
    acc1 = np.mean(labels_est == labels_true)
    acc2 = np.mean(labels_est == (1 - labels_true))
    return max(acc1, acc2)
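A quick end-to-end check of this pipeline on a synthetic 2-block SBM well above the KS threshold; the sampler below and its parameter values are illustrative, not part of the exercise statement:

```python
import numpy as np

def sample_sbm_2block(n, p_in, p_out, rng):
    """Symmetric 2-block SBM: first n//2 nodes are community 0, rest community 1."""
    z = np.array([0] * (n // 2) + [1] * (n - n // 2))
    P = np.where(z[:, None] == z[None, :], p_in, p_out)
    U = np.triu(rng.random((n, n)), 1)          # uniforms on the upper triangle
    A = (U < np.triu(P, 1)).astype(float)       # Bernoulli edges above the diagonal
    return A + A.T, z

def fiedler_labels(A):
    """Sign of the second eigenvector of the symmetric normalized Laplacian."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-10))
    L_sym = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt
    _, eigvecs = np.linalg.eigh(L_sym)
    return (eigvecs[:, 1] > 0).astype(int)

rng = np.random.default_rng(0)
A, z = sample_sbm_2block(400, p_in=0.10, p_out=0.02, rng=rng)
est = fiedler_labels(A)
accuracy = max(np.mean(est == z), np.mean(est == 1 - z))  # near 1 at this SNR
```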
Expected accuracy table:
| SNR $= \frac{(a-b)^2}{2(a+b)}$ | Regime | Expected accuracy |
|---|---|---|
| | Well above KS | 0.95-0.99 |
| | Below KS | $\approx 0.5$ (random) |
| | Below KS | $\approx 0.5$ (random) |
| | Just above KS | 0.55-0.65 |
N.3 Solution Notes for Exercise 8 (Preferential Attachment)
Efficient alias sampling for preferential attachment:
The naive approach (linear scan to sample proportional to degree) is $O(n)$ per edge, giving $O(n^2)$ total. For large $n$, use the alias method:
- Maintain a list of (node, degree) pairs for all existing nodes
- Use the alias method to sample from this distribution in $O(1)$ per sample
- After adding new edges, the alias table must be rebuilt, which costs $O(n)$ per step
Alternatively, the Vose-Walker alias construction builds the table in linear time, which pays off only when many samples are drawn between updates.
Simpler approach: Maintain a binary indexed tree (Fenwick tree) over cumulative degrees. Sampling is $O(\log n)$ per edge, updating is $O(\log n)$ per edge, giving $O(n \log n)$ total.
Power law fit: The maximum likelihood estimator for the power-law exponent on data $k_1, \dots, k_N$ with $k_i \geq k_{\min}$ is
$$\hat{\gamma} = 1 + N \left[ \sum_{i=1}^{N} \ln \frac{k_i}{k_{\min}} \right]^{-1}.$$
For BA with the given $m$ and $n$: expect $\hat{\gamma} \approx 3$ for a suitable choice of $k_{\min}$.
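A sketch of the Fenwick-tree approach described above; the class and function names are my own, not from the exercise:

```python
import random

class Fenwick:
    """Binary indexed tree over node degrees: point updates and sampling
    proportional to degree, both in O(log n)."""
    def __init__(self, n):
        self.n = n
        self.t = [0] * (n + 1)
        self.total = 0
    def add(self, i, delta=1):
        """Increase the degree weight of node i (0-indexed) by delta."""
        self.total += delta
        i += 1
        while i <= self.n:
            self.t[i] += delta
            i += i & (-i)
    def sample(self, rng):
        """Return a node with probability proportional to its degree weight."""
        r = rng.randrange(self.total)
        pos, bit = 0, 1 << self.n.bit_length()
        while bit:
            nxt = pos + bit
            if nxt <= self.n and self.t[nxt] <= r:
                r -= self.t[nxt]
                pos = nxt
            bit >>= 1
        return pos  # 0-indexed node; zero-degree nodes are never returned

def barabasi_albert(n, m, seed=0):
    """BA graph: start from a clique on m+1 nodes, then attach each new node
    to m distinct existing nodes sampled proportional to degree."""
    rng = random.Random(seed)
    tree = Fenwick(n)
    edges = []
    for i in range(m + 1):          # initial clique
        for j in range(i):
            edges.append((i, j))
            tree.add(i); tree.add(j)
    for v in range(m + 1, n):
        targets = set()
        while len(targets) < m:     # resample until m distinct targets
            targets.add(tree.sample(rng))
        for u in targets:
            edges.append((v, u))
            tree.add(v); tree.add(u)
    return edges
```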
Appendix O: Further Reading
O.1 Textbooks
- Bollobás, B. (2001). Random Graphs (2nd ed.). Cambridge University Press.
  - The definitive mathematical treatment; covers ER theory exhaustively.
- Janson, S., Łuczak, T., Ruciński, A. (2000). Random Graphs. Wiley.
  - More accessible than Bollobás; covers first and second moment methods thoroughly.
- Lovász, L. (2012). Large Networks and Graph Limits. AMS.
  - The definitive treatment of graphon theory by its creator.
- Durrett, R. (2007). Random Graph Dynamics. Cambridge University Press.
  - Dynamical random graphs and their applications to epidemics and social networks.
O.2 Foundational Papers
- Erdős, P., Rényi, A. (1960). On the evolution of random graphs. Magyar Tud. Akad. Mat. Kutató Int. Közl., 5, 17-61.
- Watts, D.J., Strogatz, S.H. (1998). Collective dynamics of 'small-world' networks. Nature, 393, 440-442.
- Barabási, A.-L., Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509-512.
- Lovász, L., Szegedy, B. (2006). Limits of dense graph sequences. J. Combin. Theory Ser. B, 96, 933-957.
- Mossel, E., Neeman, J., Sly, A. (2015). Reconstruction and estimation in the planted partition model. Probab. Theory Related Fields, 162, 431-461.
- Abbe, E., Sandon, C. (2015). Community detection in general stochastic block models: Fundamental limits and efficient algorithms for recovery. arXiv:1503.00609.
O.3 ML-Specific Papers
- Palowitch, J. et al. (2022). GraphWorld: Fake Graphs Bring Real Insights for GNNs. KDD 2022.
- Keriven, N., Peyré, G. (2019). Universal Invariant and Equivariant Graph Neural Networks. NeurIPS 2019.
- Vignac, C. et al. (2023). DiGress: Discrete Denoising Diffusion for Graph Generation. ICLR 2023.
- Rusch, T.K., Bronstein, M.M., Mishra, S. (2023). A Survey on Oversmoothing in Graph Neural Networks. arXiv:2303.10993.
- Broido, A.D., Clauset, A. (2019). Scale-free networks are rare. Nature Communications, 10, 1017.
Appendix P: Summary of Key Inequalities
P.1 Concentration Inequalities for Random Graphs
Markov's inequality: For non-negative $X$ and $t > 0$: $\mathbb{P}(X \geq t) \leq \mathbb{E}[X]/t$.
Chebyshev's inequality: $\mathbb{P}(|X - \mathbb{E}X| \geq t) \leq \mathrm{Var}(X)/t^2$.
Chernoff bound (Poisson random variables): For $X \sim \mathrm{Poisson}(\lambda)$ and $x > \lambda$: $\mathbb{P}(X \geq x) \leq e^{-\lambda} (e\lambda/x)^x$.
Paley-Zygmund inequality: For non-negative $X$ with finite second moment and $\theta \in [0, 1]$:
$$\mathbb{P}(X \geq \theta \, \mathbb{E}X) \geq (1 - \theta)^2 \, \frac{(\mathbb{E}X)^2}{\mathbb{E}[X^2]}.$$
Azuma-Hoeffding (Martingale concentration): If $X_0, X_1, \dots, X_m$ is a martingale with $|X_i - X_{i-1}| \leq c_i$:
$$\mathbb{P}(|X_m - X_0| \geq t) \leq 2 \exp\!\left(-\frac{t^2}{2 \sum_i c_i^2}\right).$$
For random graph properties: reveal edges one at a time. Each edge revelation changes the property value by at most $c$ (the Lipschitz constant of the property). Azuma-Hoeffding gives concentration of order $c\sqrt{M}$ around the mean, where $M$ is the number of edge slots revealed (for the full edge-exposure martingale, $M = \binom{n}{2}$).
McDiarmid's inequality (Bounded differences): If $f$ changes by at most $c_i$ when edge $i$ is added/removed:
$$\mathbb{P}(|f - \mathbb{E}f| \geq t) \leq 2 \exp\!\left(-\frac{2t^2}{\sum_i c_i^2}\right).$$
Applications: triangle count (changes by at most $n - 2$ per edge change), largest clique ($c_i = 1$), chromatic number ($c_i = 1$).
P.2 Spectral Inequalities
Perron-Frobenius: For non-negative irreducible $A$, the spectral radius $\lambda_1 = \rho(A)$ is a simple eigenvalue and the leading eigenvector has all positive entries.
Courant-Fischer (Min-Max): The $k$-th smallest eigenvalue of symmetric $A$ satisfies:
$$\lambda_k = \min_{\dim S = k} \; \max_{0 \neq x \in S} \frac{x^\top A x}{x^\top x}.$$
Weyl's theorem: For symmetric $A$ and $E$:
$$|\lambda_k(A + E) - \lambda_k(A)| \leq \|E\|_2.$$
Bauer-Fike: If $A = V \Lambda V^{-1}$ is diagonalizable with eigenvector condition number $\kappa(V)$, eigenvalue perturbations are bounded by $\kappa(V)\|E\|_2$. For symmetric $A$: $\kappa(V) = 1$ (always), recovering Weyl's theorem.
Cheeger inequality: For the normalized graph Laplacian with second-smallest eigenvalue $\lambda_2$, the Cheeger constant $h(G)$ satisfies:
$$\frac{\lambda_2}{2} \leq h(G) \leq \sqrt{2\lambda_2}.$$
This is the fundamental connection between algebraic (spectral gap) and geometric (cut quality) properties of graphs.
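Weyl's theorem is easy to sanity-check numerically. A small sketch on random symmetric matrices (sizes and perturbation scale are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n)); A = (A + A.T) / 2   # symmetric test matrix
E = rng.standard_normal((n, n)); E = (E + E.T) / 2
E *= 0.01 / np.linalg.norm(E, 2)                     # symmetric, ||E||_2 = 0.01

lam_A = np.linalg.eigvalsh(A)        # eigenvalues in ascending order
lam_AE = np.linalg.eigvalsh(A + E)
max_shift = np.max(np.abs(lam_AE - lam_A))
# Weyl guarantees max_shift <= ||E||_2 = 0.01
```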
Appendix Q: Quick Reference Card
RANDOM GRAPH QUICK REFERENCE
========================================================================
ERDOS-RENYI G(n,p)
------------------
- Average degree: np
- Degree dist: Poisson(np) as n -> infinity
- Giant component: exists iff np > 1 -> fraction beta(c) with c = np
- Connectivity: w.h.p. iff p >= ln(n)/n
- Diameter: ~ log(n)/log(np) when connected
- Clustering: ~ p -> 0 (locally tree-like)
WATTS-STROGATZ WS(n, k, beta)
-----------------------------
- Degree: ~ k (Poisson-like spread from rewiring)
- Clustering: C(beta) ~ C(0)(1-beta)^3 where C(0) -> 3/4 for large k
- Path length: O(log n) for any beta > 0
- Small-world regime: beta in [1/(nk), 0.3] gives C >> C_ER and L ~ L_ER
BARABASI-ALBERT BA(n, m)
------------------------
- Degree dist: P(k) ~ 2m^2/k^3 (power law, gamma = 3)
- Max degree: ~ m*sqrt(n) (oldest nodes)
- Diameter: ~ log(n)/log(log(n))
- Clustering: C ~ (log n)^2/n -> 0 (no clustering!)
STOCHASTIC BLOCK MODEL SBM(n, k, sigma, B)
------------------------------------------
- KS threshold: (a-b)^2 > 2(a+b) for 2 blocks, p = a/n, q = b/n
- Exact recovery: (sqrt(a) - sqrt(b))^2 > 2 in the log-degree regime
- Spectral alg: works above the KS threshold
- GNN limit: no GNN beats the KS threshold on SBM data
GRAPHONS W: [0,1]^2 -> [0,1]
----------------------------
- Dense graph limit: every dense graph sequence has a subsequence converging to a graphon
- Sampling: draw xi_i ~ U[0,1]; include edge (i,j) w.p. W(xi_i, xi_j)
- ER graphon: W(x,y) = p (constant)
- SBM graphon: W(x,y) = B_{ceil(kx), ceil(ky)} (step function)
- Operator: T_W h(x) = integral_0^1 W(x,y) h(y) dy (Hilbert-Schmidt)
========================================================================