Causal Inference

Exercises Notebook

Converted from exercises.ipynb for web reading.

Exercises: Structural Causal Models

There are 10 exercises. Exercises 1-3 are mechanics, 4-6 are theory, and 7-10 connect causal inference to AI systems.

Code cell 2

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False

mpl.rcParams.update({
    "figure.figsize":    (10, 6),
    "figure.dpi":         120,
    "font.size":           13,
    "axes.titlesize":      15,
    "axes.labelsize":      13,
    "xtick.labelsize":     11,
    "ytick.labelsize":     11,
    "legend.fontsize":     11,
    "legend.framealpha":   0.85,
    "lines.linewidth":      2.0,
    "axes.spines.top":     False,
    "axes.spines.right":   False,
    "savefig.bbox":       "tight",
    "savefig.dpi":         150,
})
np.random.seed(42)
print("Plot setup complete.")

Code cell 3


import itertools
import math

COLORS = {
    "primary":   "#0077BB",
    "secondary": "#EE7733",
    "tertiary":  "#009988",
    "error":     "#CC3311",
    "neutral":   "#555555",
    "highlight": "#EE3377",
}

def header(title):
    print("\n" + "=" * 72)
    print(title)
    print("=" * 72)

def check_true(condition, name):
    ok = bool(condition)
    print(f"{'PASS' if ok else 'FAIL'} - {name}")
    assert ok, name

def check_close(value, target, tol=1e-8, name="value"):
    ok = abs(float(value) - float(target)) <= tol
    print(f"{'PASS' if ok else 'FAIL'} - {name}: got {float(value):.6f}, expected {float(target):.6f}")
    assert ok, name

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate_confounding(n=1000):
    z = np.random.binomial(1, 0.5, size=n)
    x = np.random.binomial(1, 0.2 + 0.6 * z)
    y = 1.0 + 2.0 * x + 3.0 * z + np.random.normal(0, 0.2, size=n)
    return z, x, y

def backdoor_adjustment(z, x, y):
    effect = 0.0
    for val in [0, 1]:
        mask_z = z == val
        p_z = np.mean(mask_z)
        y1 = np.mean(y[mask_z & (x == 1)])
        y0 = np.mean(y[mask_z & (x == 0)])
        effect += (y1 - y0) * p_z
    return float(effect)

def ate(y1, y0):
    return float(np.mean(y1 - y0))

def shd(adj_true, adj_hat):
    # Entrywise count of differing adjacency entries; a reversed edge counts as 2.
    return int(np.sum(np.asarray(adj_true) != np.asarray(adj_hat)))

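def acyclicity_value(w):
    # h(W) = tr(exp(W * W)) - d, which is zero exactly when W encodes a DAG.
    w = np.asarray(w, dtype=float)  # convert once so lists work with w.shape below
    eigvals = np.linalg.eigvals(w * w)
    return float(np.sum(np.exp(eigvals)).real - w.shape[0])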

print("Helper functions ready.")

Exercise 1: correlation vs causation (*)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 5

# Your Solution - Exercise 1
answer = None
print("Your answer placeholder:", answer)

Code cell 6

# Solution
header("Exercise 1: Structural Causal Models")
z, x, y = simulate_confounding(n=1000)
adjusted = backdoor_adjustment(z, x, y)
check_true(abs(adjusted - 2.0) < 0.2, "adjusted effect is close to true effect")
print("Adjusted effect:", round(adjusted, 3))
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")
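
To make the gap between correlation and causation concrete, the sketch below compares the naive difference in means with the backdoor-adjusted estimate on the same mechanisms as `simulate_confounding`. The local `np.random.default_rng` generator, its seed, and the larger sample size are choices made here for illustration and are not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # local generator; seed chosen for illustration
n = 20_000

# Same mechanisms as simulate_confounding: Z -> X, Z -> Y, X -> Y (true effect 2.0)
z = rng.binomial(1, 0.5, size=n)
x = rng.binomial(1, 0.2 + 0.6 * z)
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(0, 0.2, size=n)

# Naive contrast E[Y | X=1] - E[Y | X=0] mixes the effect of X with that of Z
naive = float(y[x == 1].mean() - y[x == 0].mean())

# Backdoor adjustment: average the within-stratum contrasts, weighted by P(Z=z)
adjusted = 0.0
for val in (0, 1):
    in_stratum = z == val
    contrast = y[in_stratum & (x == 1)].mean() - y[in_stratum & (x == 0)].mean()
    adjusted += contrast * in_stratum.mean()
adjusted = float(adjusted)

print(f"naive: {naive:.3f}, adjusted: {adjusted:.3f}")
```

In the population the naive contrast equals 2 + 3 · (0.8 − 0.2) = 3.8, because P(Z=1 | X=1) = 0.8 and P(Z=1 | X=0) = 0.2 under these mechanisms. The adjusted estimate should land near the structural coefficient 2.0.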

Exercise 2: mechanisms as stable assignments (*)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 8

# Your Solution - Exercise 2
answer = None
print("Your answer placeholder:", answer)

Code cell 9

# Solution
header("Exercise 2: Structural Causal Models")
y0 = np.array([1.0, 2.0, 3.0, 4.0])
y1 = y0 + np.array([0.5, 0.7, 0.9, 1.1])
effect = ate(y1, y0)
check_close(effect, 0.8, name="ATE")
print("ATE:", effect)
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")

Exercise 3: DAGs as causal assumptions (*)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 11

# Your Solution - Exercise 3
answer = None
print("Your answer placeholder:", answer)

Code cell 12

# Solution
header("Exercise 3: Structural Causal Models")
true_adj = np.array([[0, 1], [0, 0]])
pred_adj = np.array([[0, 0], [0, 0]])
distance = shd(true_adj, pred_adj)
check_true(distance == 1, "one missing edge")
print("SHD:", distance)
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")
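
The `shd` helper counts differing adjacency entries, so it treats a reversed edge differently from a missing one. The sketch below reimplements the same rule under the hypothetical name `shd_entrywise` and applies it to a three-node chain chosen for illustration.

```python
import numpy as np

def shd_entrywise(adj_true, adj_hat):
    """Count adjacency entries that differ (same rule as the notebook's shd helper)."""
    return int(np.sum(np.asarray(adj_true) != np.asarray(adj_hat)))

true_adj = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])      # chain 0 -> 1 -> 2

missing = true_adj.copy()
missing[1, 2] = 0                     # drop edge 1 -> 2

reversed_edge = true_adj.copy()
reversed_edge[1, 2] = 0               # remove 1 -> 2 ...
reversed_edge[2, 1] = 1               # ... and add 2 -> 1

print(shd_entrywise(true_adj, missing))        # 1
print(shd_entrywise(true_adj, reversed_edge))  # 2
```

Some definitions of structural Hamming distance count a reversal as a single error, so it is worth stating which convention is in use when reporting the number.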

Exercise 4: interventions as model surgery (**)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 14

# Your Solution - Exercise 4
answer = None
print("Your answer placeholder:", answer)

Code cell 15

# Solution
header("Exercise 4: Structural Causal Models")
w = np.array([[0.0, 1.0], [0.0, 0.0]])
h = acyclicity_value(w)
check_close(h, 0.0, tol=1e-8, name="DAG acyclicity")
print("h(W):", h)
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")
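
The exercise title describes an intervention as model surgery: the assignment for the intervened variable is replaced and every other mechanism is left untouched. A minimal sketch of that idea follows, reusing the mechanisms of `simulate_confounding`. The function name `sample_scm`, the `do_x` parameter, the seed, and the sample size are all hypothetical choices for this illustration.

```python
import numpy as np

def sample_scm(n, rng, do_x=None):
    """Sample (z, x, y) from the confounded SCM; do_x replaces the assignment for X."""
    z = rng.binomial(1, 0.5, size=n)                 # Z := U_Z
    if do_x is None:
        x = rng.binomial(1, 0.2 + 0.6 * z)           # X := f_X(Z, U_X)
    else:
        x = np.full(n, do_x)                         # surgery: X := do_x, edge Z -> X removed
    # Y := f_Y(X, Z, U_Y) is left unchanged by the intervention on X
    y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(0, 0.2, size=n)
    return z, x, y

rng = np.random.default_rng(1)  # seed chosen for illustration
n = 20_000
_, _, y_do1 = sample_scm(n, rng, do_x=1)
_, _, y_do0 = sample_scm(n, rng, do_x=0)
interventional_effect = float(y_do1.mean() - y_do0.mean())
print(f"E[Y | do(X=1)] - E[Y | do(X=0)] = {interventional_effect:.3f}")
```

Because Z no longer influences X after the surgery, the simple contrast of means under the two interventions should be close to the structural coefficient 2.0 without any adjustment.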

Exercise 5: why SCMs matter for ML distribution shift (**)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 17

# Your Solution - Exercise 5
answer = None
print("Your answer placeholder:", answer)

Code cell 18

# Solution
header("Exercise 5: Structural Causal Models")
x = np.array([0, 0, 1, 1])
y = np.array([1.0, 1.2, 2.9, 3.1])
diff = float(np.mean(y[x == 1]) - np.mean(y[x == 0]))
check_close(diff, 1.9, name="difference in means")
print("Difference:", diff)
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")

Exercise 6: exogenous variables $\mathbf{U}$ (**)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 20

# Your Solution - Exercise 6
answer = None
print("Your answer placeholder:", answer)

Code cell 21

# Solution
header("Exercise 6: Structural Causal Models")
z, x, y = simulate_confounding(n=1000)
adjusted = backdoor_adjustment(z, x, y)
check_true(abs(adjusted - 2.0) < 0.2, "adjusted effect is close to true effect")
print("Adjusted effect:", round(adjusted, 3))
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")

Exercise 7: endogenous variables $\mathbf{V}$ (***)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 23

# Your Solution - Exercise 7
answer = None
print("Your answer placeholder:", answer)

Code cell 24

# Solution
header("Exercise 7: Structural Causal Models")
y0 = np.array([1.0, 2.0, 3.0, 4.0])
y1 = y0 + np.array([0.5, 0.7, 0.9, 1.1])
effect = ate(y1, y0)
check_close(effect, 0.8, name="ATE")
print("ATE:", effect)
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")

Exercise 8: structural assignments $V_i = f_i(\operatorname{pa}_i, U_i)$ (***)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 26

# Your Solution - Exercise 8
answer = None
print("Your answer placeholder:", answer)

Code cell 27

# Solution
header("Exercise 8: Structural Causal Models")
true_adj = np.array([[0, 1], [0, 0]])
pred_adj = np.array([[0, 0], [0, 0]])
distance = shd(true_adj, pred_adj)
check_true(distance == 1, "one missing edge")
print("SHD:", distance)
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")

Exercise 9: causal graph $G$ (***)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 29

# Your Solution - Exercise 9
answer = None
print("Your answer placeholder:", answer)

Code cell 30

# Solution
header("Exercise 9: Structural Causal Models")
w = np.array([[0.0, 1.0], [0.0, 0.0]])
h = acyclicity_value(w)
check_close(h, 0.0, tol=1e-8, name="DAG acyclicity")
print("h(W):", h)
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")
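
The solution only checks the acyclic case. The sketch below evaluates the same quantity, $h(W) = \operatorname{tr}(e^{W \circ W}) - d$, on a two-node graph with and without a cycle. The function name `trace_exp_acyclicity` is hypothetical; the formula mirrors the notebook's `acyclicity_value` helper, and this trace-exponential form is the one used in NOTEARS-style continuous structure learning.

```python
import numpy as np

def trace_exp_acyclicity(w):
    """h(W) = tr(exp(W * W)) - d, computed from the eigenvalues of W * W."""
    w = np.asarray(w, dtype=float)
    eigvals = np.linalg.eigvals(w * w)      # elementwise square keeps entries nonnegative
    return float(np.sum(np.exp(eigvals)).real - w.shape[0])

dag = np.array([[0.0, 1.0],
                [0.0, 0.0]])               # 0 -> 1 only
cyclic = np.array([[0.0, 1.0],
                   [1.0, 0.0]])            # 0 -> 1 and 1 -> 0

h_dag = trace_exp_acyclicity(dag)
h_cyclic = trace_exp_acyclicity(cyclic)
print(f"h(dag) = {h_dag:.6f}, h(cyclic) = {h_cyclic:.6f}")
```

For the cyclic matrix the eigenvalues of $W \circ W$ are $\pm 1$, so $h(W) = e + e^{-1} - 2 = 2\cosh(1) - 2 > 0$, while the DAG gives exactly zero.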

Exercise 10: observational, interventional, and counterfactual distributions (***)

State the causal query, identify assumptions, compute a small quantity, and interpret the result.

Code cell 32

# Your Solution - Exercise 10
answer = None
print("Your answer placeholder:", answer)

Code cell 33

# Solution
header("Exercise 10: Structural Causal Models")
x = np.array([0, 0, 1, 1])
y = np.array([1.0, 1.2, 2.9, 3.1])
diff = float(np.mean(y[x == 1]) - np.mean(y[x == 0]))
check_close(diff, 1.9, name="difference in means")
print("Difference:", diff)
print("\nTakeaway: causal calculations require an explicit query, graph assumptions, and an estimand before estimation.")
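
The difference in means above is an observational quantity. A counterfactual query instead asks what a specific unit's outcome would have been under a different treatment, and in an SCM with additive noise it can be answered by abduction, action, and prediction. The sketch below assumes the outcome mechanism from `simulate_confounding`; the observed unit's values and the name `f_y` are hypothetical.

```python
# Mechanism assumed for illustration (the outcome equation of simulate_confounding):
#   Y := 1 + 2*X + 3*Z + U_Y
def f_y(x, z, u_y):
    return 1.0 + 2.0 * x + 3.0 * z + u_y

# Observed unit (hypothetical values)
x_obs, z_obs, y_obs = 1, 1, 6.1

# 1. Abduction: recover the unit's noise term from the observation
u_y = y_obs - f_y(x_obs, z_obs, 0.0)

# 2. Action: replace X's assignment with the counterfactual value
x_cf = 0

# 3. Prediction: push the same noise through the modified model
y_cf = f_y(x_cf, z_obs, u_y)

print(f"U_Y = {u_y:.2f}, counterfactual Y = {y_cf:.2f}")
```

The unit keeps its own noise term and its own value of Z, so the counterfactual outcome differs from the factual one by exactly the structural effect of X.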