Converted from theory.ipynb for web reading.

§11.02 Graph Representations — Theory Notebook

"The difference between a good algorithm and a great one often comes down to the choice of data structure."

Companion to notes.md. Every representation is implemented, analysed, and connected to GNN frameworks.

Sections

  1. Setup
  2. Adjacency Matrix — construction, properties, Laplacian
  3. Degree Matrix, Laplacian, and Normalised Adjacency
  4. GCN Propagation Step (worked example)
  5. Adjacency List — implementations and adjacency set
  6. Edge List and COO Format
  7. PyG edge_index format
  8. Sparse Matrix Formats: CSR, CSC, LIL, DOK
  9. SpMV: worked trace and benchmark
  10. Incidence Matrix and L = BB^T
  11. Representation Conversions
  12. Heterogeneous and Temporal Graphs
  13. Choosing a Representation
  14. Summary

Code cell 2

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False

mpl.rcParams.update({
    "figure.figsize":    (10, 6),
    "figure.dpi":         120,
    "font.size":           13,
    "axes.titlesize":      15,
    "axes.labelsize":      13,
    "xtick.labelsize":     11,
    "ytick.labelsize":     11,
    "legend.fontsize":     11,
    "legend.framealpha":   0.85,
    "lines.linewidth":      2.0,
    "axes.spines.top":     False,
    "axes.spines.right":   False,
    "savefig.bbox":       "tight",
    "savefig.dpi":         150,
})
np.random.seed(42)
print("Plot setup complete.")

Code cell 3

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla
import collections, time

try:
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt
    import matplotlib.patches as mpatches
    HAS_MPL = True
except ImportError:
    HAS_MPL = False

try:
    import networkx as nx
    HAS_NX = True
except ImportError:
    HAS_NX = False

COLORS = {
    'primary':   '#0077BB',
    'secondary': '#EE7733',
    'tertiary':  '#009988',
    'error':     '#CC3311',
    'neutral':   '#555555',
    'highlight': '#EE3377',
}

np.set_printoptions(precision=4, suppress=True)
np.random.seed(42)
print('Setup complete. NetworkX:', HAS_NX, '  Matplotlib:', HAS_MPL)

2. Adjacency Matrix

The adjacency matrix $A \in \{0,1\}^{n \times n}$ is the most fundamental graph representation. For an undirected simple graph:

$$A_{ij} = 1 \text{ if } \{i,j\} \in E, \quad 0 \text{ otherwise}$$

Key properties: symmetric ($A = A^\top$), zero diagonal, row sums = degrees.

Code cell 5

# === 2.1 Build diamond graph adjacency matrix ===
# Vertices 0-3, edges: {0,1},{0,2},{1,2},{1,3},{2,3}
edges = [(0,1),(0,2),(1,2),(1,3),(2,3)]
n = 4

A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1

print('Adjacency matrix A:')
print(A)
print(f'\nSymmetric: {(A == A.T).all()}')
print(f'Zero diagonal: {(np.diag(A) == 0).all()}')
print(f'Row sums (degrees): {A.sum(axis=1)}')
print(f'Sum of degrees = {A.sum()} = 2*|E| = {2*len(edges)}')

Code cell 6

# === 2.2 Directed adjacency matrix ===
# Directed: 0→1, 0→2, 1→2, 1→3, 2→3
A_dir = np.zeros((n, n), dtype=int)
for u, v in edges:  # treat as directed edges u→v
    A_dir[u, v] = 1

print('Directed adjacency matrix A_dir:')
print(A_dir)
print(f'\nOut-degrees (row sums): {A_dir.sum(axis=1)}')
print(f'In-degrees  (col sums): {A_dir.sum(axis=0)}')
print(f'Symmetric: {(A_dir == A_dir.T).all()}  (expected False)')

Code cell 7

# === 2.3 Walk counting via matrix powers ===
# (A^k)_{ij} = number of walks of length k from i to j

for k in [1, 2, 3]:
    Ak = np.linalg.matrix_power(A, k)
    print(f'A^{k}:')
    print(Ak)
    if k == 2:
        print(f'  (A^2)_{{00}} = {Ak[0,0]}: walks of length 2 returning to vertex 0')
    if k == 3:
        print(f'  tr(A^3) = {int(np.trace(Ak))}')
        print(f'  Triangles = tr(A^3)/6 = {int(np.trace(Ak))//6}')
    print()

3. Degree Matrix, Laplacian, and Normalised Adjacency

The graph Laplacian $L = D - A$ is central to spectral graph theory and GNNs.

$$D = \operatorname{diag}(\deg(0), \ldots, \deg(n-1)), \qquad L = D - A$$

Key properties:

  • $L \succeq 0$ (positive semidefinite)
  • $L\mathbf{1} = \mathbf{0}$ (zero row sums)
  • $\mathbf{x}^\top L \mathbf{x} = \sum_{\{i,j\}\in E}(x_i - x_j)^2$

Code cell 9

# === 3.1 Compute D, L, and verify PSD ===
degrees = A.sum(axis=1)
D = np.diag(degrees)
L = D - A

print('Degree matrix D:')
print(D)
print('\nLaplacian L = D - A:')
print(L)

# Verify zero row sums
print(f'\nL @ 1 = {L @ np.ones(n)}  (should be all zeros)')

# Verify PSD via eigenvalues
eigs = np.linalg.eigvalsh(L)
print(f'Eigenvalues of L: {eigs.round(4)}')
print(f'All eigenvalues >= 0: {(eigs >= -1e-10).all()}  → L is PSD')
print(f'Number of zero eigenvalues: {(np.abs(eigs) < 1e-10).sum()} → {(np.abs(eigs) < 1e-10).sum()} connected component(s)')

Code cell 10

# === 3.2 Quadratic form: x^T L x = sum_{(i,j) in E} (x_i - x_j)^2 ===
x = np.array([1.0, 0.0, 0.0, -1.0])

# Direct formula
quadratic_direct = float(x @ L @ x)

# Edge-based formula
quadratic_edge = sum((x[u] - x[v])**2 for u, v in edges)

print(f'x = {x}')
print(f'x^T L x (direct)  = {quadratic_direct:.4f}')
print(f'sum (xi-xj)^2     = {quadratic_edge:.4f}')
print(f'PASS: both agree: {abs(quadratic_direct - quadratic_edge) < 1e-10}')
print(f'\nInterpretation: signal x assigns +1 to vertex 0, -1 to vertex 3.')
print(f'The Laplacian quadratic form measures the total variation across edges.')

Code cell 11

# === 3.3 Normalised adjacency A_hat = D^{-1/2} A D^{-1/2} ===
D_inv_sqrt = np.diag(1.0 / np.sqrt(degrees.astype(float)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

print('Normalised adjacency A_hat = D^{-1/2} A D^{-1/2}:')
print(A_hat.round(4))

eigs_Ahat = np.linalg.eigvalsh(A_hat)
print(f'\nEigenvalues of A_hat: {eigs_Ahat.round(4)}')
print(f'All in [-1, 1]: {(np.abs(eigs_Ahat) <= 1 + 1e-10).all()}')

L_sym = np.eye(n) - A_hat
eigs_Lsym = np.linalg.eigvalsh(L_sym)
print(f'\nEigenvalues of L_sym = I - A_hat: {eigs_Lsym.round(4)}')
print(f'All in [0, 2]: {(eigs_Lsym >= -1e-10).all() and (eigs_Lsym <= 2+1e-10).all()}')
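The spectrum of $L$ carries structural information beyond the PSD check above. A standard next step in spectral graph theory (not shown in the cells above, added here as a self-contained sketch that rebuilds the diamond graph) is the Fiedler vector: the eigenvector for the second-smallest eigenvalue of $L$, whose sign pattern suggests a 2-way partition.

```python
import numpy as np

# Rebuild the diamond graph so this sketch is self-contained
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L = np.diag(A.sum(axis=1)) - A

# eigh returns eigenvalues in ascending order for symmetric matrices
vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]  # eigenvector for the second-smallest eigenvalue

print('lambda_2 (algebraic connectivity):', round(vals[1], 4))
print('Fiedler vector:', fiedler.round(4))
print('Partition by sign:', [int(f >= 0) for f in fiedler])
```

For the diamond graph the Fiedler vector concentrates on the two degree-2 vertices (0 and 3) with opposite signs, separating the graph's "ends"; $\lambda_2 > 0$ confirms connectivity.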

4. GCN Propagation Step (Worked Example)

The GCN layer (Kipf & Welling, 2017) computes:

$$H^{(l+1)} = \sigma\!\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{(l)} W^{(l)}\right)$$

where $\tilde{A} = A + I$ (self-loops) and $\tilde{D} = D + I$.

Code cell 13

# === 4.1 Build GCN normalised adjacency (with self-loops) ===
A_tilde = A + np.eye(n, dtype=float)
d_tilde = A_tilde.sum(axis=1)  # degree in A_tilde
D_tilde_inv_sqrt = np.diag(1.0 / np.sqrt(d_tilde))
A_gcn = D_tilde_inv_sqrt @ A_tilde @ D_tilde_inv_sqrt

print('GCN normalised adjacency (with self-loops):')
print(A_gcn.round(4))

# Verify entries: A_gcn[i,j] = 1/sqrt(d_tilde[i]*d_tilde[j])
i, j = 0, 1
expected = 1.0 / np.sqrt(d_tilde[i] * d_tilde[j])
print(f'\nA_gcn[0,1] = {A_gcn[0,1]:.4f}, expected = {expected:.4f}, match: {abs(A_gcn[0,1]-expected)<1e-10}')

Code cell 14

# === 4.2 One GCN layer: H' = ReLU(A_gcn @ H @ W) ===
np.random.seed(7)
d_in, d_out = 4, 2  # input dim = n (identity features), output dim = 2

# Identity node features (each node's feature = its one-hot encoding)
H0 = np.eye(n)  # shape (n, n)

# Random weight matrix
W = np.random.randn(d_in, d_out) * 0.5  # shape (d_in, d_out); d_in = n for identity features

# GCN layer
Z = A_gcn @ H0 @ W   # shape (n, d_out): aggregation + linear
H1 = np.maximum(0, Z)  # ReLU

print('Weight matrix W:')
print(W.round(4))
print('\nPre-activation Z = A_gcn @ H0 @ W:')
print(Z.round(4))
print('\nPost-ReLU H1:')
print(H1.round(4))
print('\nNote: row 1 (degree-3 node) receives more averaged signal than rows 0,3 (degree-2).')
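A follow-on observation (not in the cell above, sketched here with the graph rebuilt so the snippet is self-contained): applying the propagation matrix repeatedly shrinks the differences between node features, which is the well-known oversmoothing behaviour of deep GCN stacks.

```python
import numpy as np

# Rebuild the GCN-normalised adjacency for the diamond graph (as in 4.1)
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
n = 4
A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
A_tilde = A + np.eye(n)
d = A_tilde.sum(axis=1)
A_gcn = A_tilde / np.sqrt(np.outer(d, d))  # entrywise D^{-1/2} (A+I) D^{-1/2}

# Propagate a signal repeatedly: the spread (max - min) across nodes decays
x = np.array([1.0, 0.0, 0.0, -1.0])
for k in [1, 2, 4, 8, 16]:
    xk = np.linalg.matrix_power(A_gcn, k) @ x
    print(f'k={k:2d}: spread = {xk.max() - xk.min():.6f}')
```

The decay is geometric because the eigenvalues of $\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2}$ other than the top one have magnitude strictly below 1; after many hops every node sees nearly the same averaged signal.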

5. Adjacency List

The adjacency list stores, per vertex, the list of its neighbours. Memory: $O(n+m)$.

Code cell 16

# === 5.1 Python dict-of-lists adjacency list ===
adj_list = collections.defaultdict(list)
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)

print('Adjacency list:')
for v in range(n):
    print(f'  {v}: {sorted(adj_list[v])}')

# BFS using adjacency list
def bfs(adj, root, n):
    visited = [False]*n
    visited[root] = True
    queue = collections.deque([root])
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for u in adj[v]:
            if not visited[u]:
                visited[u] = True
                queue.append(u)
    return order

order = bfs(adj_list, root=0, n=n)
print(f'\nBFS order from vertex 0: {order}')

Code cell 17

# === 5.2 Adjacency set: O(1) edge lookup ===
adj_set = {v: set(nbrs) for v, nbrs in adj_list.items()}

# Edge queries
queries = [(0,1), (0,3), (1,2), (2,3)]
print('Edge existence queries (adjacency set):')
for u, v in queries:
    print(f'  ({u},{v}): {v in adj_set[u]}')

# Verify adjacency list vs set give same degree
for v in range(n):
    assert len(adj_list[v]) == len(adj_set[v])
print('\nPASS - adjacency list and set have consistent degrees')

Code cell 18

# === 5.3 Sorted adjacency list: O(log deg) edge lookup ===
import bisect

adj_sorted = {v: sorted(nbrs) for v, nbrs in adj_list.items()}

def has_edge_sorted(adj, u, v):
    idx = bisect.bisect_left(adj[u], v)
    return idx < len(adj[u]) and adj[u][idx] == v

print('Edge existence queries (sorted list, binary search):')
for u, v in queries:
    print(f'  ({u},{v}): {has_edge_sorted(adj_sorted, u, v)}')

# All should match adjacency set results
for u, v in queries:
    assert has_edge_sorted(adj_sorted, u, v) == (v in adj_set.get(u, set()))
print('\nPASS - sorted list agrees with adjacency set')

6. Edge List and COO Format

The edge list is the simplest representation; COO extends it to three parallel arrays.

Code cell 20

# === 6.1 Edge list ===
edge_list = list(edges)  # [(0,1),(0,2),(1,2),(1,3),(2,3)]
print('Edge list:', edge_list)
print(f'Space: O(m) = O({len(edge_list)})')

# === 6.2 COO format ===
row_arr = np.array([u for u,v in edge_list], dtype=np.int32)
col_arr = np.array([v for u,v in edge_list], dtype=np.int32)
# For undirected, add reverse edges
row_coo = np.concatenate([row_arr, col_arr])
col_coo = np.concatenate([col_arr, row_arr])
data_coo = np.ones(len(row_coo), dtype=np.float32)

print(f'\nCOO (undirected, {len(row_coo)} entries):')
print(f'  row:  {row_coo}')
print(f'  col:  {col_coo}')
print(f'  data: {data_coo.astype(int)}')

Code cell 21

# === 6.3 SciPy COO and CSR ===
A_coo = sp.coo_matrix((data_coo, (row_coo, col_coo)), shape=(n, n))
A_csr = A_coo.tocsr()

print('SciPy COO:')
print(f'  shape: {A_coo.shape}, nnz: {A_coo.nnz}')

print('\nSciPy CSR:')
print(f'  data:    {A_csr.data.astype(int)}')
print(f'  col_idx: {A_csr.indices}')
print(f'  row_ptr: {A_csr.indptr}')

# Verify matches dense adjacency
A_reconstructed = A_csr.toarray().astype(int)
print(f'\nMatches dense A: {(A_reconstructed == A).all()}')

7. PyTorch Geometric edge_index Format

PyG uses a COO tensor of shape $2 \times m$ to represent directed edges. Row 0 = sources, row 1 = targets.

Code cell 23

# === 7.1 Build edge_index (simulated without torch) ===
# For undirected graph, include both (u,v) and (v,u)
src = np.array(row_coo, dtype=np.int64)
dst = np.array(col_coo, dtype=np.int64)
edge_index = np.stack([src, dst], axis=0)  # shape (2, 2m)

print('edge_index (shape 2 x 2m):')
print(edge_index)
print(f'Shape: {edge_index.shape}  (2 x {edge_index.shape[1]})')
print(f'Num directed edges: {edge_index.shape[1]} = 2 * {len(edges)} (undirected)')

# Verify: for each undirected edge, both directions present
edge_set_directed = set(zip(src.tolist(), dst.tolist()))
for u, v in edges:
    assert (u,v) in edge_set_directed and (v,u) in edge_set_directed
print('PASS - both directions present for all undirected edges')

Code cell 24

# === 7.2 Message passing using edge_index (scatter_add simulation) ===
# GNN aggregation: for each node v, aggregate features from in-neighbours
# H'[v] = sum_{u: (u,v) in edge_index} H[u]

# Simulated node features
H = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])
print('Node features H (shape n x d):')
print(H)

# Scatter-add aggregation (Python simulation)
H_agg = np.zeros_like(H)
for k in range(edge_index.shape[1]):
    u_k = edge_index[0, k]
    v_k = edge_index[1, k]
    H_agg[v_k] += H[u_k]  # accumulate source features into destination

print('\nAggregated H_agg = scatter_add(H, edge_index):')
print(H_agg)
print('\nVerify: H_agg[0] = sum of neighbours of 0 = H[1] + H[2]:', H[1] + H[2])
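The Python loop above can be vectorised with `np.add.at`, which performs an unbuffered scatter-add and therefore accumulates correctly even when destination indices repeat. A self-contained sketch, also showing why a plain fancy-index `+=` is not a valid substitute:

```python
import numpy as np

# Diamond graph edge_index (both directions) and the same toy features
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
src = np.array([u for u, v in edges] + [v for u, v in edges])
dst = np.array([v for u, v in edges] + [u for u, v in edges])
H = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])

# Unbuffered scatter-add: H_agg[dst[k]] += H[src[k]] for every k
H_agg = np.zeros_like(H)
np.add.at(H_agg, dst, H[src])
print('scatter-add result:')
print(H_agg)

# Buffered fancy-index += silently drops duplicate-index contributions
H_bad = np.zeros_like(H)
H_bad[dst] += H[src]
print('naive += matches scatter-add:', np.allclose(H_bad, H_agg))
```

This is essentially what PyTorch's `scatter_add_` does on GPU; the buffered-vs-unbuffered distinction is the classic pitfall when hand-rolling message passing in NumPy.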

8. Sparse Matrix Formats: CSR, CSC, LIL, DOK

CSR (Compressed Sparse Row) is the standard for efficient SpMV and GNN computations.

Code cell 26

# === 8.1 Manual CSR construction ===
# Build CSR for diamond graph from scratch

# Sort edges by source (row)
undirected_pairs = [(u,v) for u,v in edges] + [(v,u) for u,v in edges]
undirected_pairs.sort()  # sort by source

data_csr  = np.ones(len(undirected_pairs), dtype=np.int32)
col_idx   = np.array([v for u,v in undirected_pairs], dtype=np.int32)
row_ptr   = np.zeros(n + 1, dtype=np.int32)

# Count edges per row
for u, v in undirected_pairs:
    row_ptr[u + 1] += 1
# Prefix sum
for i in range(n):
    row_ptr[i+1] += row_ptr[i]

print('Manual CSR for diamond graph:')
print(f'  data:    {data_csr}')
print(f'  col_idx: {col_idx}')
print(f'  row_ptr: {row_ptr}')

# Verify row 1 (neighbours of vertex 1 = [0,2,3])
row1_nbrs = col_idx[row_ptr[1]:row_ptr[2]]
print(f'\nNeighbours of vertex 1: {sorted(row1_nbrs.tolist())} (expected [0,2,3])')
assert set(row1_nbrs.tolist()) == {0, 2, 3}, 'Row 1 neighbours wrong!'
print('PASS')

Code cell 27

# === 8.2 CSR vs CSC ===
A_csc = A_csr.tocsc()

print('CSR: efficient row (out-neighbour) access')
print(f'  Row 1 of CSR: {A_csr.getrow(1).indices}  (out-neighbours of 1)')

print('\nCSC: efficient column (in-neighbour) access')
print(f'  Col 1 of CSC: {A_csc.getcol(1).indices}  (in-neighbours of 1)')

# For undirected graphs, in-neighbours == out-neighbours
assert set(A_csr.getrow(1).indices) == set(A_csc.getcol(1).indices)
print('\nPASS - for undirected graph, in-nbrs == out-nbrs')

Code cell 28

# === 8.3 LIL and DOK (dynamic construction) ===
# LIL: efficient incremental construction
A_lil = sp.lil_matrix((n, n), dtype=float)
for u, v in edges:
    A_lil[u, v] = 1.0
    A_lil[v, u] = 1.0

print('LIL matrix data structure (rows as sorted lists):')
for i in range(n):
    print(f'  row {i}: cols={A_lil.rows[i]}')

# Convert LIL → CSR for computation
A_from_lil = A_lil.tocsr()
print(f'\nAfter tocsr(): shape={A_from_lil.shape}, nnz={A_from_lil.nnz}')

# DOK: dictionary of keys
A_dok = sp.dok_matrix((n, n), dtype=float)
for u, v in edges:
    A_dok[u, v] = A_dok[v, u] = 1.0
print(f'\nDOK matrix: {dict(list(A_dok.items())[:3])} ... ({A_dok.nnz} entries)')

9. Sparse Matrix-Vector Multiplication (SpMV)

SpMV $\mathbf{y} = A\mathbf{x}$ is the core operation in GNNs and PageRank. CSR achieves $O(m)$ — one multiply-add per non-zero.

Code cell 30

# === 9.1 Manual CSR SpMV with trace ===
x_vec = np.array([1., 2., 3., 4.])

def spmv_csr(data, col_idx, row_ptr, x):
    n = len(row_ptr) - 1
    y = np.zeros(n)
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i+1]):
            y[i] += data[k] * x[col_idx[k]]
    return y

y_sparse = spmv_csr(data_csr.astype(float), col_idx, row_ptr, x_vec)
y_dense  = A @ x_vec

print(f'x = {x_vec}')
print(f'SpMV result (CSR):   {y_sparse}')
print(f'Dense A @ x:         {y_dense}')
print(f'Agree: {np.allclose(y_sparse, y_dense)}')

# Interpretation: y[i] = sum of x[j] for j in neighbours of i
print('\nInterpretation: y[i] = sum of x-values over neighbours of i')
for i in range(n):
    nbrs = col_idx[row_ptr[i]:row_ptr[i+1]].tolist()
    print(f'  y[{i}] = sum(x[{nbrs}]) = {sum(x_vec[j] for j in nbrs):.0f} ✓')

Code cell 31

# === 9.2 SpMV benchmark: dense vs sparse ===
import time

sizes = [100, 500, 1000]
results = []

for sz in sizes:
    # Sparse Erdos-Renyi graph: n nodes, ~5n edges
    p = 10.0 / sz
    A_rand = np.random.rand(sz, sz) < p
    A_rand = ((A_rand | A_rand.T) & ~np.eye(sz, dtype=bool)).astype(np.float32)
    A_sp = sp.csr_matrix(A_rand)
    xv = np.random.randn(sz).astype(np.float32)
    fill = A_sp.nnz / sz**2

    # Dense
    t0 = time.perf_counter()
    for _ in range(200): _ = A_rand @ xv
    t_dense = (time.perf_counter() - t0) / 200 * 1e6

    # Sparse
    t0 = time.perf_counter()
    for _ in range(200): _ = A_sp @ xv
    t_sparse = (time.perf_counter() - t0) / 200 * 1e6

    results.append((sz, fill, t_dense, t_sparse, t_dense/t_sparse))
    print(f'n={sz:4d}: fill={fill:.3f}  dense={t_dense:.1f}µs  sparse={t_sparse:.1f}µs  speedup={t_dense/t_sparse:.1f}x')

10. Incidence Matrix and L = BB^⊤

The unsigned incidence matrix $B \in \{0,1\}^{n \times m}$ (rows = vertices, columns = edges) records which vertices each edge touches. Note that unsigned $B$ gives $BB^\top = D + A$ (the signless Laplacian); to recover $L = BB^\top$ one must assign each edge an orientation, as in the signed incidence matrix of §10.3.

Code cell 33

# === 10.1 Build incidence matrix for diamond graph ===
m_edges = len(edges)
B = np.zeros((n, m_edges), dtype=float)
for e_idx, (u, v) in enumerate(edges):
    B[u, e_idx] = 1.0
    B[v, e_idx] = 1.0

print('Incidence matrix B (n x m):')
print(B.astype(int))
print(f'\nShape: {B.shape} (n={n} vertices, m={m_edges} edges)')
print(f'Column sums (each edge has 2 endpoints): {B.sum(axis=0)}')
print(f'Row sums (= degrees): {B.sum(axis=1)}')

Code cell 34

# === 10.2 B B^T for the unsigned incidence matrix ===
Q = B @ B.T
print('BB^T:')
print(Q.astype(int))
print('\nD + A (signless Laplacian):')
print(D + A)
print(f'\nBB^T == D + A: {np.allclose(Q, D + A)}')
print(f'BB^T == L:     {np.allclose(Q, L)}  (unsigned B does NOT recover L)')

# Any Gram matrix BB^T is PSD: x^T (BB^T) x = ||B^T x||^2 >= 0
x_test = np.random.randn(n)
lhs = x_test @ Q @ x_test
rhs = np.linalg.norm(B.T @ x_test)**2
print(f'\nPSD check: x^T(BB^T)x = {lhs:.6f}, ||B^T x||^2 = {rhs:.6f}, match: {np.isclose(lhs, rhs)}')

Code cell 35

# === 10.3 Directed incidence matrix ===
B_dir = np.zeros((n, m_edges), dtype=float)
for e_idx, (u, v) in enumerate(edges):
    B_dir[u, e_idx] = -1.0  # source (tail)
    B_dir[v, e_idx] = +1.0  # target (head)

print('Directed incidence matrix B_dir:')
print(B_dir.astype(int))

L_dir = B_dir @ B_dir.T
print(f'\nB_dir @ B_dir^T == L: {np.allclose(L_dir, L)}')
print('PASS: Kirchhoff identity L = B_dir B_dir^T verified for the oriented incidence matrix')

# Interpretation of B_dir^T x: discrete gradient
x_signal = np.array([0., 1., 2., 3.])  # linear signal
grad = B_dir.T @ x_signal  # edge differences
print(f'\nSignal x = {x_signal}')
print(f'Edge differences B_dir^T x = {grad}')
for e_idx, (u,v) in enumerate(edges):
    print(f'  Edge ({u}{v}): x[{v}] - x[{u}] = {x_signal[v]-x_signal[u]:.0f} ✓')

11. Representation Conversions

All major conversions and their costs.

Code cell 37

# === 11.1 Conversion functions ===

def edge_list_to_matrix(edge_list, n, directed=False):
    A = np.zeros((n, n), dtype=int)
    for u, v in edge_list:
        A[u, v] = 1
        if not directed: A[v, u] = 1
    return A

def edge_list_to_adj_list(edge_list, n, directed=False):
    adj = {i: [] for i in range(n)}
    for u, v in edge_list:
        adj[u].append(v)
        if not directed: adj[v].append(u)
    return adj

def adj_list_to_coo(adj, n):
    rows, cols = [], []
    for u, nbrs in adj.items():
        for v in nbrs:
            rows.append(u); cols.append(v)
    return np.array(rows, np.int32), np.array(cols, np.int32)

# Test pipeline: edge_list → adj_list → COO → CSR
A_mat = edge_list_to_matrix(edges, n)
adj   = edge_list_to_adj_list(edges, n)
rows, cols = adj_list_to_coo(adj, n)
A_csr_conv = sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n,n))

print(f'Pipeline: edge_list → matrix → adj_list → COO → CSR')
print(f'Dense matches: {(A_mat == A).all()}')
print(f'CSR matches:   {np.allclose(A_csr_conv.toarray(), A)}')
print('PASS - all conversions consistent')
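The reverse conversion, CSR back to an adjacency list, is not covered above; it is a direct walk over `indptr`/`indices` in $O(n+m)$ with no per-edge search. A sketch (the helper name `csr_to_adj_list` is illustrative):

```python
import numpy as np
import scipy.sparse as sp

# Diamond graph in CSR, both edge directions
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
n = 4
rows = [u for u, v in edges] + [v for u, v in edges]
cols = [v for u, v in edges] + [u for u, v in edges]
A_csr = sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))

def csr_to_adj_list(M):
    # Row i's neighbours are indices[indptr[i]:indptr[i+1]]
    indptr, indices = M.indptr, M.indices
    return {i: indices[indptr[i]:indptr[i + 1]].tolist() for i in range(M.shape[0])}

adj = csr_to_adj_list(A_csr)
for v, nbrs in sorted(adj.items()):
    print(v, '->', sorted(nbrs))
```

This closes the loop on the pipeline above: every representation in this notebook is reachable from every other in time linear in the output (plus an $O(m \log m)$ sort when CSR ordering must be established).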

Code cell 38

# === 11.2 SciPy conversion API ===
print('SciPy format conversions:')
for fmt, converter in [('CSR', lambda x: x.tocsr()),
                        ('CSC', lambda x: x.tocsc()),
                        ('COO', lambda x: x.tocoo()),
                        ('LIL', lambda x: x.tolil())]:
    converted = converter(A_csr)
    dense = converted.toarray()
    ok = np.allclose(dense, A)
    print(f'  → {fmt}: {converted.__class__.__name__}, matches={ok}')

12. Heterogeneous and Temporal Graphs

Heterogeneous graphs have multiple node/edge types, each with its own edge_index.

Code cell 40

# === 12.1 Heterogeneous graph (author-paper-venue) ===
np.random.seed(42)

n_authors, n_papers, n_venues = 5, 8, 3

# Relation: (author, writes, paper)
writes_src = np.array([0, 0, 1, 2, 2, 3, 3, 4], dtype=np.int64)
writes_dst = np.array([0, 1, 2, 3, 4, 5, 6, 7], dtype=np.int64)

# Relation: (paper, published_in, venue)
pub_src = np.arange(n_papers, dtype=np.int64)
pub_dst = np.array([0,0,1,1,2,2,0,1], dtype=np.int64)

print('Heterogeneous graph:')
print(f'  Nodes: {n_authors} authors, {n_papers} papers, {n_venues} venues')
print(f'  Relation (author,writes,paper): {len(writes_src)} edges')
print(f'  Relation (paper,published_in,venue): {len(pub_src)} edges')

# Per-relation adjacency matrices (rectangular)
A_writes = sp.csr_matrix(
    (np.ones(len(writes_src)), (writes_src, writes_dst)),
    shape=(n_authors, n_papers)
)
print(f'\nA_writes shape: {A_writes.shape} (authors x papers)')
print(f'Papers per author: {A_writes.toarray().sum(axis=1).astype(int)}')
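One payoff of storing each relation as its own (rectangular) matrix: relations compose by sparse matrix product. An illustrative metapath example using the same arrays as above — `A_writes @ A_pub` counts, per (author, venue) pair, how many of that author's papers appeared at that venue:

```python
import numpy as np
import scipy.sparse as sp

n_authors, n_papers, n_venues = 5, 8, 3
writes_src = np.array([0, 0, 1, 2, 2, 3, 3, 4])
writes_dst = np.array([0, 1, 2, 3, 4, 5, 6, 7])
pub_src = np.arange(n_papers)
pub_dst = np.array([0, 0, 1, 1, 2, 2, 0, 1])

A_writes = sp.csr_matrix((np.ones(8), (writes_src, writes_dst)),
                         shape=(n_authors, n_papers))
A_pub = sp.csr_matrix((np.ones(8), (pub_src, pub_dst)),
                      shape=(n_papers, n_venues))

# Metapath author -> paper -> venue as a single sparse product
A_author_venue = A_writes @ A_pub   # shape (n_authors, n_venues)
print('Author-venue counts (rows = authors, cols = venues):')
print(A_author_venue.toarray().astype(int))
```

This composition pattern underlies metapath-based heterogeneous GNNs, where multi-hop typed adjacencies are precomputed exactly this way.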

Code cell 41

# === 12.2 Temporal edge list ===
# Simulate a temporal graph: edges with timestamps
n_nodes = 6
temporal_edges = [
    (0, 1, 0.0),  # (src, dst, timestamp)
    (1, 2, 1.5),
    (0, 3, 2.0),
    (2, 4, 2.5),
    (1, 4, 3.0),
    (3, 5, 4.0),
    (0, 5, 4.5),
]

# Sort by timestamp (standard for temporal GNNs)
temporal_edges.sort(key=lambda e: e[2])

print('Temporal edge list (sorted by timestamp):')
for u, v, t in temporal_edges:
    print(f'  t={t:.1f}: ({u}{v})')

# Snapshot at t <= 2.0
snapshot_edges = [(u,v) for u,v,t in temporal_edges if t <= 2.0]
print(f'\nSnapshot at t<=2.0: {snapshot_edges}')
A_snap = edge_list_to_matrix(snapshot_edges, n_nodes, directed=True)
print('Snapshot adjacency:')
print(A_snap)

13. Choosing a Representation

A systematic comparison to guide representation selection.

Code cell 43

# === 13.1 Memory comparison across sizes ===

print(f"{'Graph':>10}  {'n':>8}  {'m':>10}  {'fill':>6}  {'Dense(MB)':>10}  {'CSR(MB)':>8}  {'ratio':>7}")
print('-'*70)

benchmarks = [
    ('Cora',    2708,   5429),
    ('CiteSeer',3327,   4732),
    ('PubMed',  19717,  44338),
    ('ogbn-arxiv',169343, 1166243),
]

for name, nv, mv in benchmarks:
    dense_mb = (nv**2 * 4) / 1e6  # float32
    csr_mb   = ((nv+1 + 2*mv) * 4) / 1e6  # int32 row_ptr + col_idx (symmetric)
    fill     = 2*mv / (nv**2)
    ratio    = dense_mb / csr_mb
    print(f'{name:>10}  {nv:>8,}  {mv:>10,}  {fill:>6.4f}  {dense_mb:>10.1f}  {csr_mb:>8.2f}  {ratio:>6.0f}x')

Code cell 44

# === 13.2 Visualise representation complexity ===
if HAS_MPL:
    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    fig.suptitle('Graph Representation Trade-offs', fontsize=14)

    # Left: Memory vs n for different fill ratios
    ns = np.logspace(2, 6, 100)
    ax = axes[0]
    ax.loglog(ns, 4*ns**2/1e6, color=COLORS['error'],    label='Dense float32 ($O(n^2)$)', lw=2)
    ax.loglog(ns, 4*(ns+2*5*ns)/1e6, color=COLORS['primary'],  label='CSR int32, $m=5n$', lw=2)
    ax.loglog(ns, 4*(ns+2*ns)/1e6,   color=COLORS['tertiary'], label='CSR int32, $m=n$', lw=2)
    ax.set_xlabel('Number of vertices $n$'); ax.set_ylabel('Memory (MB)')
    ax.set_title('Memory vs graph size')
    ax.legend(fontsize=9); ax.grid(True, alpha=0.3)

    # Right: Operation complexity comparison
    ops = ['Edge\nquery', 'Neighbour\nenum', 'SpMV', 'Build\nfrom EL']
    dense_ops  = [1, 3, 4, 3]  # relative log-scale
    csr_ops    = [2, 1, 1, 2]
    adjl_ops   = [3, 1, 1, 1]
    x_pos = np.arange(len(ops))
    w = 0.25
    ax2 = axes[1]
    ax2.bar(x_pos-w, dense_ops, w, label='Dense', color=COLORS['error'], alpha=0.8)
    ax2.bar(x_pos,   csr_ops,   w, label='CSR',   color=COLORS['primary'], alpha=0.8)
    ax2.bar(x_pos+w, adjl_ops,  w, label='Adj list', color=COLORS['tertiary'], alpha=0.8)
    ax2.set_xticks(x_pos); ax2.set_xticklabels(ops)
    ax2.set_ylabel('Relative cost (higher = slower)')
    ax2.set_title('Operation complexity (qualitative)')
    ax2.legend(fontsize=9)

    fig.tight_layout()
    plt.savefig('/tmp/graph_repr_tradeoffs.png')
    plt.show()
    print('Plot saved.')

14. Summary

| Representation | Space | Edge query | Neighbours | SpMV | Best for |
|---|---|---|---|---|---|
| Dense $A$ | $O(n^2)$ | $O(1)$ | $O(n)$ | $O(n^2)$ | Small dense graphs, spectral ops |
| Adj. list | $O(n+m)$ | $O(\deg)$ | $O(\deg)$ | $O(m)$ | BFS/DFS, dynamic, prototyping |
| Adj. set | $O(n+m)$ | $O(1)$ avg | $O(\deg)$ | $O(m)$ | Frequent edge queries |
| COO/edge_index | $O(m)$ | $O(m)$ | $O(m)$ | $O(m\log m)$ | PyG, GPU, I/O |
| CSR | $O(n+m)$ | $O(\deg)$ | $O(\deg)$ | $O(m)$ | SpMV, GNN, large graphs |
| CSC | $O(n+m)$ | $O(\deg)$ | $O(\deg)$ | $O(m)$ | Column ops, in-neighbour access |
| Incidence $B$ | $O(nm)$ | n/a | n/a | n/a | Hypergraphs, flow, Hodge theory |

Key formula: $L = D - A = BB^\top$ (for the oriented incidence matrix $B$), with quadratic form $\mathbf{x}^\top L \mathbf{x} = \sum_{\{i,j\}\in E}(x_i-x_j)^2$.

Key insight: No single representation wins on all operations. Choose based on graph size/density, dominant operation, and hardware target.
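The guidance in the summary table can be condensed into a toy decision helper. The function name, thresholds, and operation labels below are illustrative only, not a prescriptive API:

```python
def choose_representation(n, m, dominant_op):
    """Toy heuristic distilled from the summary table; thresholds are arbitrary."""
    fill = 2 * m / (n * n)  # undirected fill ratio
    if dominant_op == 'spectral' and fill > 0.1:
        return 'dense'            # dense eigensolvers need the full matrix anyway
    if dominant_op == 'edge_query':
        return 'adjacency set'    # O(1) average membership test
    if dominant_op in ('spmv', 'gnn'):
        return 'CSR'              # O(m) multiply, cache-friendly rows
    if dominant_op == 'traversal':
        return 'adjacency list'   # cheap neighbour enumeration, easy to mutate
    return 'COO / edge_index'     # simplest interchange and GPU format

print(choose_representation(2708, 5429, 'gnn'))          # Cora-sized GNN workload
print(choose_representation(100, 2000, 'spectral'))      # small, dense, spectral
print(choose_representation(10**6, 5 * 10**6, 'edge_query'))
```

In practice the choice is rarely exclusive: pipelines commonly build in LIL/DOK or an edge list, convert once to CSR for computation, and export COO for GPU frameworks.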