Exercises Notebook
Converted from exercises.ipynb for web reading.
Time Series — Exercises
10 exercises covering diagnostics, stationarity, AR stability, forecasting, seasonal structure, spectral analysis, and Kalman filtering.
| Format | Description |
|---|---|
| Problem | Markdown cell with the exercise statement |
| Your Solution | Runnable scaffold cell |
| Solution | Reference implementation with checks and takeaway |
Difficulty Levels
| Level | Exercises | Focus |
|---|---|---|
| * | 1-3 | Core diagnostics and linear stability |
| ** | 4-6 | Forecasting, seasonality, and spectral reasoning |
| *** | 7-10 | Kalman filtering and ML deployment framing |
Additional applied exercises: Exercise 9 (rolling-origin forecast evaluation) and Exercise 10 (change-point detection).
Code cell 2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
try:
import seaborn as sns
sns.set_theme(style="whitegrid", palette="colorblind")
HAS_SNS = True
except ImportError:
plt.style.use("seaborn-v0_8-whitegrid")
HAS_SNS = False
mpl.rcParams.update({
"figure.figsize": (10, 6),
"figure.dpi": 120,
"font.size": 13,
"axes.titlesize": 15,
"axes.labelsize": 13,
"xtick.labelsize": 11,
"ytick.labelsize": 11,
"legend.fontsize": 11,
"legend.framealpha": 0.85,
"lines.linewidth": 2.0,
"axes.spines.top": False,
"axes.spines.right": False,
"savefig.bbox": "tight",
"savefig.dpi": 150,
})
np.random.seed(42)
print("Plot setup complete.")
Code cell 3
import numpy as np
import numpy.linalg as la
from scipy import stats
np.set_printoptions(precision=6, suppress=True)
np.random.seed(42)
def header(title):
print("\n" + "=" * len(title))
print(title)
print("=" * len(title))
def check_close(name, got, expected, tol=1e-8):
ok = np.allclose(got, expected, atol=tol, rtol=tol)
print(f"{'PASS' if ok else 'FAIL'} - {name}")
if not ok:
print(" expected:", expected)
print(" got :", got)
return ok
def check_true(name, cond):
print(f"{'PASS' if cond else 'FAIL'} - {name}")
return cond
def sample_acf(x, max_lag):
x = np.asarray(x, dtype=float)
x = x - x.mean()
denom = np.dot(x, x)
vals = [1.0]
for h in range(1, max_lag + 1):
vals.append(np.dot(x[:-h], x[h:]) / denom)
return np.array(vals)
print("Exercise setup complete.")
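Before the exercises, a quick sanity check on the `sample_acf` helper (restated below so the snippet runs on its own): for iid noise the lag-1 ACF should sit near zero, while a cumulative-sum random walk shows strong persistence. The series length and seed are illustrative.

```python
import numpy as np

def sample_acf(x, max_lag):
    # biased sample autocorrelation, same convention as the notebook helper
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    vals = [1.0]
    for h in range(1, max_lag + 1):
        vals.append(np.dot(x[:-h], x[h:]) / denom)
    return np.array(vals)

rng = np.random.default_rng(0)
noise = rng.standard_normal(500)   # iid noise: no persistence
walk = np.cumsum(noise)            # integrated noise: strong persistence

acf_noise = sample_acf(noise, 1)[1]
acf_walk = sample_acf(walk, 1)[1]
print(f"noise lag-1 ACF: {acf_noise:.3f}")  # near 0
print(f"walk  lag-1 ACF: {acf_walk:.3f}")   # near 1
```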
Exercise 1 * — White Noise vs Random Walk Diagnostics
Two short series are given:
white = [0.3, -1.1, 0.7, 0.2, -0.4, 1.0]
walk = [0.3, 0.8, 1.2, 1.7, 2.1, 2.8]
Task
- (a) Compute the lag-1 sample autocorrelation of each series.
- (b) Decide which one looks more random-walk-like.
- (c) Explain the decision in one sentence.
Code cell 5
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 6
# Solution
def classify_series(white, walk):
white_acf1 = sample_acf(white, 1)[1]
walk_acf1 = sample_acf(walk, 1)[1]
more_persistent = "walk" if walk_acf1 > white_acf1 else "white"
explanation = "Random-walk-like behavior is associated with strong positive lag-1 persistence in levels."
return white_acf1, walk_acf1, more_persistent, explanation
white = np.array([0.3, -1.1, 0.7, 0.2, -0.4, 1.0])
walk = np.array([0.3, 0.8, 1.2, 1.7, 2.1, 2.8])
white_acf1, walk_acf1, more_persistent, explanation = classify_series(white, walk)
header("Exercise 1: White Noise vs Random Walk")
print(f"White lag-1 ACF: {white_acf1:.6f}")
print(f"Walk lag-1 ACF : {walk_acf1:.6f}")
print("More persistent series:", more_persistent)
print("Explanation:", explanation)
check_true("walk is classified as more persistent", more_persistent == "walk")
check_true("explanation is a non-empty string", isinstance(explanation, str) and len(explanation) > 10)
print("\nTakeaway: Random-walk-like behavior is about persistence in levels, not just large values.")
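A complementary check on the same `walk` series: first-differencing strips the persistence in levels, leaving small increments around a drift of roughly 0.5 per step rather than a trending level.

```python
import numpy as np

walk = np.array([0.3, 0.8, 1.2, 1.7, 2.1, 2.8])
increments = np.diff(walk)  # apply (1 - L) to the levels
print("increments :", increments)
print("mean drift :", increments.mean())  # about 0.5 per step
```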
Exercise 2 * — Compute and Interpret ACF
Let x = [2, 3, 5, 4, 6, 7].
Task
- (a) Compute the sample autocorrelation at lags 1 and 2.
- (b) State whether the series looks positively persistent at short lags.
Code cell 8
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 9
# Solution
def acf_lags(x):
x = np.asarray(x, dtype=float)
rho = sample_acf(x, 2)
rho1 = rho[1]
rho2 = rho[2]
positive_persistence = rho1 > 0 and rho2 > 0
return rho1, rho2, positive_persistence
rho1, rho2, positive_persistence = acf_lags([2, 3, 5, 4, 6, 7])
header("Exercise 2: Short-Lag ACF")
print(f"rho(1): {rho1:.6f}")
print(f"rho(2): {rho2:.6f}")
print("Positive persistence:", positive_persistence)
check_true("lag-1 ACF is positive", rho1 > 0)
check_true("persistence flag is boolean", isinstance(positive_persistence, (bool, np.bool_)))
print("\nTakeaway: Positive short-lag autocorrelation is the simplest signature of persistence.")
Exercise 3 * — AR(2) Stability from Characteristic Roots
Consider the AR(2) model
X_t = 1.1 X_{t-1} - 0.3 X_{t-2} + epsilon_t.
Task
- (a) Form the characteristic polynomial.
- (b) Compute its roots.
- (c) Decide whether the process is stationary.
Code cell 11
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 12
# Solution
def ar2_stationary(phi1, phi2):
roots = np.roots(np.array([-phi2, -phi1, 1.0]))
stationary = np.all(np.abs(roots) > 1)
return roots, stationary
roots, stationary = ar2_stationary(1.1, -0.3)
header("Exercise 3: AR(2) Stability")
print("Roots:", roots)
print("Magnitudes:", np.abs(roots))
print("Stationary:", stationary)
check_true("root magnitudes exceed one", np.all(np.abs(roots) > 1))
check_true("stationary flag is True", bool(stationary))
print("\nTakeaway: Stationarity of AR models is a root-location condition, not a guess from the line plot alone.")
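For contrast, a boundary case not part of the exercise statement: the random walk is the AR(2) special case phi1 = 1, phi2 = 0, and the same root condition flags it as non-stationary because its characteristic root lies exactly on the unit circle. The helper is restated so the snippet runs on its own.

```python
import numpy as np

def ar2_stationary(phi1, phi2):
    # roots of the characteristic polynomial 1 - phi1*z - phi2*z^2;
    # stationarity requires every root to lie outside the unit circle
    roots = np.roots(np.array([-phi2, -phi1, 1.0]))
    return roots, np.all(np.abs(roots) > 1)

roots, stationary = ar2_stationary(1.0, 0.0)  # random walk as an AR(2) edge case
print("roots:", roots)            # a single root at z = 1
print("stationary:", stationary)  # False: the root sits on the unit circle
```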
Exercise 4 ** — One-Step and Multi-Step Forecasting
For the AR(1) model X_t = 0.8 X_{t-1} + epsilon_t, suppose the latest observed value is X_T = 2.5.
Task
- (a) Compute the 1-step, 2-step, and 3-step forecast means.
- (b) Verify that the forecast magnitude decays toward zero.
Code cell 14
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 15
# Solution
def ar1_forecasts(phi, xT, horizons):
horizons = np.asarray(horizons, dtype=int)
means = phi ** horizons * xT
return means
means = ar1_forecasts(0.8, 2.5, np.array([1, 2, 3]))
header("Exercise 4: AR(1) Forecast Means")
print("Forecast means:", means)
check_close("1-step forecast", means[0], 2.0)
check_true("forecast magnitudes decay", abs(means[2]) < abs(means[1]) < abs(means[0]))
print("\nTakeaway: In a stationary AR(1), multi-step forecasts revert toward the long-run mean.")
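The reference solution tracks forecast means only. A standard companion result for the stationary AR(1) is the h-step forecast error variance sigma^2 (1 - phi^(2h)) / (1 - phi^2), which grows with h toward the stationary variance sigma^2 / (1 - phi^2); the innovation variance sigma^2 = 1 below is an illustrative assumption, not given in the exercise.

```python
import numpy as np

def ar1_forecast_var(phi, sigma2, horizons):
    # h-step forecast error variance of a stationary AR(1):
    # sigma^2 * (1 - phi^(2h)) / (1 - phi^2)
    h = np.asarray(horizons, dtype=float)
    return sigma2 * (1 - phi ** (2 * h)) / (1 - phi ** 2)

variances = ar1_forecast_var(0.8, 1.0, np.array([1, 2, 3]))
print("forecast variances:", variances)  # grows toward 1 / (1 - 0.64) ~= 2.78
```

So while the forecast means shrink toward zero, the forecast uncertainty widens toward the unconditional variance.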
Exercise 5 ** — Seasonal Differencing
Let
x = [10, 12, 9, 11, 13, 10, 12, 14]
and assume a seasonal period s = 4.
Task
- (a) Compute the seasonal difference (1 - L^4)x_t.
- (b) Explain what repeated pattern the operation removes.
Code cell 17
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 18
# Solution
def seasonal_difference(x, s):
x = np.asarray(x, dtype=float)
diff = x[s:] - x[:-s]
explanation = "Seasonal differencing compares each point to the corresponding point one full seasonal cycle earlier."
return diff, explanation
diff, explanation = seasonal_difference([10, 12, 9, 11, 13, 10, 12, 14], 4)
header("Exercise 5: Seasonal Differencing")
print("Seasonal difference:", diff)
print("Explanation:", explanation)
check_close("seasonal differences", diff, np.array([3., -2., 3., 3.]))
check_true("explanation mentions seasonal cycle", "cycle" in explanation.lower())
print("\nTakeaway: Seasonal differencing removes repeating level patterns at a chosen lag.")
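One way to see the takeaway directly: applied to a series that repeats exactly every s = 4 points, the same seasonal difference is identically zero. The periodic values below are illustrative.

```python
import numpy as np

def seasonal_difference(x, s):
    # same operation as the reference solution: x_t - x_{t-s}
    x = np.asarray(x, dtype=float)
    return x[s:] - x[:-s]

periodic = np.tile([10.0, 12.0, 9.0, 11.0], 3)  # exact repeats, period s = 4
d = seasonal_difference(periodic, 4)
print(d)  # all zeros: the cycle is removed completely
```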
Exercise 6 ** — Periodogram-Based Frequency Detection
A signal is observed for n=60 time points:
x_t = sin(2*pi*t/12) + noise.
Task
- (a) Compute the dominant Fourier frequency.
- (b) Convert it to an estimated period.
Code cell 20
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 21
# Solution
def dominant_period(x):
x = np.asarray(x, dtype=float)
x = x - x.mean()
freqs = np.fft.rfftfreq(len(x), d=1.0)
power = np.abs(np.fft.rfft(x))**2 / len(x)
idx = np.argmax(power[1:]) + 1
freq = freqs[idx]
period = 1 / freq
return freq, period
t = np.arange(60)
x = np.sin(2 * np.pi * t / 12) + 0.1 * np.random.randn(60)
freq, period = dominant_period(x)
header("Exercise 6: Dominant Period from FFT")
print(f"Dominant frequency: {freq:.6f}")
print(f"Estimated period : {period:.6f}")
check_true("frequency is positive", freq > 0)
check_true("estimated period is near 12", abs(period - 12) < 1.5)
print("\nTakeaway: Frequency-domain analysis makes periodic structure easy to detect even in noisy data.")
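A note on why the estimate lands so close to 12: with n = 60 observations, the Fourier grid k/n contains 1/12 exactly (since 12 divides 60), so the periodogram peak can sit on the true frequency. A quick check of that grid:

```python
import numpy as np

n = 60
freqs = np.fft.rfftfreq(n, d=1.0)  # Fourier grid: k / n for k = 0 .. n // 2
on_grid = np.isclose(freqs, 1 / 12).any()
print("1/12 lies on the grid:", on_grid)  # True, because 12 divides 60
print("grid spacing:", freqs[1])          # 1/60, the frequency resolution
```

When the true period does not divide n, the peak falls between grid points and the estimated period is only accurate to that resolution.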
Exercise 7 *** — Kalman Predict and Update
Consider the 1D local-level model
z_t = z_{t-1} + w_t, w_t ~ N(0, q)
and
x_t = z_t + v_t, v_t ~ N(0, r).
Suppose the prior at time t-1 is mean m=1.0, variance p=0.5,
with q=0.2, r=0.3, and an observation x_t = 1.4.
Task
- (a) Compute the predictive variance.
- (b) Compute the Kalman gain.
- (c) Compute the updated posterior mean and variance.
Code cell 23
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 24
# Solution
def kalman_update_1d(m, p, q, r, y):
p_pred = p + q
k = p_pred / (p_pred + r)
m_post = m + k * (y - m)
p_post = (1 - k) * p_pred
return p_pred, k, m_post, p_post
p_pred, k, m_post, p_post = kalman_update_1d(1.0, 0.5, 0.2, 0.3, 1.4)
header("Exercise 7: Kalman Predict/Update")
print(f"Predictive variance: {p_pred:.6f}")
print(f"Kalman gain : {k:.6f}")
print(f"Posterior mean : {m_post:.6f}")
print(f"Posterior variance: {p_post:.6f}")
check_close("predictive variance", p_pred, 0.7)
check_close("Kalman gain", k, 0.7)
check_close("posterior mean", m_post, 1.28)
print("\nTakeaway: The Kalman gain balances trust in the model prediction against trust in the new measurement.")
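The single predict/update step generalizes to a full filter by repeating it along a series. A minimal standalone sketch on simulated local-level data, reusing q = 0.2 and r = 0.3 from the exercise (the initial state, series length, and seed are illustrative):

```python
import numpy as np

def kalman_filter_1d(ys, m0, p0, q, r):
    # repeat predict/update along the whole observation sequence
    m, p = m0, p0
    means = []
    for y in ys:
        p_pred = p + q             # predict: variance inflates by q
        k = p_pred / (p_pred + r)  # gain: prediction vs measurement trust
        m = m + k * (y - m)        # update: move toward the observation
        p = (1 - k) * p_pred       # update: posterior variance shrinks
        means.append(m)
    return np.array(means), p

rng = np.random.default_rng(42)
z = np.cumsum(rng.normal(0.0, np.sqrt(0.2), 100))  # latent local level
ys = z + rng.normal(0.0, np.sqrt(0.3), 100)        # noisy observations
means, p_final = kalman_filter_1d(ys, 0.0, 1.0, 0.2, 0.3)
print(f"final filtered mean  : {means[-1]:.4f}")
print(f"steady-state variance: {p_final:.6f}")  # converges to about 0.1646
```

Note that the variance recursion does not depend on the observations, so it converges to a fixed steady-state value regardless of the data.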
Exercise 8 *** — Time-Series Modeling in an ML Deployment
You manage hourly inference traffic for a model-serving system. The series shows a strong 24-hour cycle, occasional holiday shifts, and rare abrupt demand jumps after product launches.
Task
- (a) Name one classical baseline you would start with.
- (b) Name one diagnostic you would inspect before scaling up to a larger model.
- (c) Name one failure mode that would justify moving beyond a simple seasonal model.
Code cell 26
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 27
# Solution
def deployment_plan():
baseline = "Seasonal ARIMA or a seasonal naive baseline"
diagnostic = "Inspect the ACF/PACF and residual seasonality after removing the daily cycle"
failure_mode = "Structural breaks or changing seasonal shape after launches"
return baseline, diagnostic, failure_mode
baseline, diagnostic, failure_mode = deployment_plan()
header("Exercise 8: Deployment Modeling Plan")
print("Baseline :", baseline)
print("Diagnostic :", diagnostic)
print("Failure mode :", failure_mode)
check_true("baseline mentions seasonality", "season" in baseline.lower())
check_true("failure mode mentions changing structure", "break" in failure_mode.lower() or "changing" in failure_mode.lower())
print("\nTakeaway: In production forecasting, the first question is usually about stable seasonal structure and the second is about when that structure breaks.")
Exercise 9 *** — Rolling-Origin Forecast Evaluation
A forecaster should be evaluated in time order, not by random shuffling.
Task
- (a) Implement rolling-origin one-step forecasts for an AR(1) baseline.
- (b) Compute MAE and RMSE across forecast origins.
- (c) Compare with a leakage-prone random split baseline.
- (d) Explain why temporal validation protects production forecasts.
Code cell 29
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 30
# Solution
n = 240
phi = 0.75
x = np.zeros(n)
noise = np.random.normal(0, 1, n)
for t in range(1, n):
x[t] = phi * x[t-1] + noise[t]
def fit_phi(series):
return np.dot(series[:-1], series[1:]) / np.dot(series[:-1], series[:-1])
preds, actuals = [], []
for origin in range(60, n-1):
phi_hat = fit_phi(x[:origin+1])
preds.append(phi_hat * x[origin])
actuals.append(x[origin+1])
preds = np.array(preds)
actuals = np.array(actuals)
errors = preds - actuals
header('Exercise 9: Rolling-Origin Forecast Evaluation')
print(f'rolling MAE={np.mean(np.abs(errors)):.4f}')
print(f'rolling RMSE={np.sqrt(np.mean(errors**2)):.4f}')
print(f'final phi_hat={fit_phi(x):.3f}, true phi={phi:.3f}')
check_true('Forecast arrays are aligned', len(preds) == len(actuals))
check_true('AR(1) fit recovers positive persistence', fit_phi(x) > 0.5)
print("\nTakeaway: Rolling-origin evaluation respects what was knowable at forecast time.")
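For part (c), a sketch of the information-set difference rather than a full scoring comparison: a shuffled split would score every "test" forecast with a coefficient fitted on the entire series, including points after the forecast origin, while rolling-origin fitting sees only the past. The estimator is restated so the snippet runs on its own; the seed is illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n, phi = 240, 0.75
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

def fit_phi(series):
    # least-squares AR(1) coefficient from consecutive pairs
    return np.dot(series[:-1], series[1:]) / np.dot(series[:-1], series[:-1])

# leakage-prone: a random split hands "test" forecasts a coefficient
# estimated from the full series, including future points
phi_full = fit_phi(x)
# honest: at origin 60, only the first 61 observations were knowable
phi_early = fit_phi(x[:61])
print(f"phi fitted on full series (leaky): {phi_full:.3f}")
print(f"phi knowable at origin 60       : {phi_early:.3f}")
```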
Exercise 10 *** — Change-Point Detection with Rolling Statistics
A service latency series has a mean shift after deployment.
Task
- (a) Simulate a shifted time series.
- (b) Compute rolling mean and rolling standard deviation.
- (c) Trigger a simple change alert when the rolling mean exceeds a baseline band.
- (d) Explain why sequential monitoring is different from iid anomaly detection.
Code cell 32
# Your Solution
print("Write your solution here, then compare with the reference solution below.")
Code cell 33
# Solution
n = 300
x = np.concatenate([np.random.normal(0, 1, 160), np.random.normal(1.2, 1, 140)])
window = 30
rolling_mean = np.array([x[i-window:i].mean() for i in range(window, n+1)])
rolling_std = np.array([x[i-window:i].std(ddof=1) for i in range(window, n+1)])
baseline = x[:100]
center = baseline.mean()
band = center + 3 * baseline.std(ddof=1) / np.sqrt(window)
alerts = np.where(rolling_mean > band)[0] + window
first_alert = alerts[0] if len(alerts) else None
header('Exercise 10: Change-Point Detection')
print(f'baseline rolling-mean band upper={band:.3f}')
print(f'first alert index={first_alert}')
print(f'post-change mean={x[160:].mean():.3f}, pre-change mean={x[:160].mean():.3f}')
check_true('Change is detected after the deployment point', first_alert is not None and first_alert >= 160)
print("\nTakeaway: Time-series monitoring uses ordered evidence, so alert timing matters as much as alert existence.")
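For part (d), a quick illustration of why sequential monitoring differs from iid anomaly detection: shuffling the same series leaves its marginal distribution untouched but erases the level shift from every rolling window, so order-blind methods cannot see it. The seed and summary statistic below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 160), rng.normal(1.2, 1, 140)])
window = 30

def max_rolling_mean(series, w):
    # largest mean over any contiguous window of length w
    n = len(series)
    return max(series[i - w:i].mean() for i in range(w, n + 1))

shuffled = rng.permutation(x)  # same values, time order destroyed
m_ordered = max_rolling_mean(x, window)
m_shuffled = max_rolling_mean(shuffled, window)
print(f"max rolling mean, ordered : {m_ordered:.3f}")
print(f"max rolling mean, shuffled: {m_shuffled:.3f}")
# only the ordered series concentrates post-shift points into whole windows
```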