Convolution Theorem

Fourier Analysis and Signal Processing

Exercises Notebook

Converted from exercises.ipynb for web reading.
§20-04 Convolution Theorem — Exercises

"Convolution in one domain equals pointwise multiplication in the other."

Ten exercises (eight starred plus two quick checks), ranging from direct application to research-level problems. Difficulty: ★ (foundational) → ★★ (intermediate) → ★★★ (advanced).

Prerequisites: §20-02 Fourier Transform, §20-03 DFT and FFT, §20-04 theory notebook.

Code cell 2

import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl

try:
    import seaborn as sns
    sns.set_theme(style="whitegrid", palette="colorblind")
    HAS_SNS = True
except ImportError:
    plt.style.use("seaborn-v0_8-whitegrid")
    HAS_SNS = False

mpl.rcParams.update({
    "figure.figsize":    (10, 6),
    "figure.dpi":         120,
    "font.size":           13,
    "axes.titlesize":      15,
    "axes.labelsize":      13,
    "xtick.labelsize":     11,
    "ytick.labelsize":     11,
    "legend.fontsize":     11,
    "legend.framealpha":   0.85,
    "lines.linewidth":      2.0,
    "axes.spines.top":     False,
    "axes.spines.right":   False,
    "savefig.bbox":       "tight",
    "savefig.dpi":         150,
})
np.random.seed(42)
print("Plot setup complete.")

Code cell 3

import numpy as np
import scipy.signal as sig
from scipy.signal import fftconvolve, firwin, freqz

try:
    import matplotlib.pyplot as plt
    import matplotlib
    plt.style.use('seaborn-v0_8-whitegrid')
    plt.rcParams['figure.figsize'] = [10, 5]
    plt.rcParams['font.size'] = 12
    HAS_MPL = True
except ImportError:
    HAS_MPL = False

np.set_printoptions(precision=6, suppress=True)
np.random.seed(42)

COLORS = {
    'primary':   '#0077BB',
    'secondary': '#EE7733',
    'tertiary':  '#009988',
    'error':     '#CC3311',
    'neutral':   '#555555',
    'highlight': '#EE3377',
}

print('Setup complete. All libraries loaded.')

Exercise 1 ★ — Circular vs. Linear Convolution

Objective: Understand when circular convolution equals linear convolution.

Given two sequences:

$$x = [1, 2, 3, 4], \qquad h = [1, -1, 1]$$

(a) Compute the linear convolution $y_{\text{lin}} = x * h$ by hand (or with np.convolve).

(b) Compute the 8-point circular convolution $y_{\text{circ}} = x \circledast_8 h$ using the DFT.

(c) Show that $y_{\text{circ}} = y_{\text{lin}}$ when both are zero-padded to the same length $N \geq \text{len}(x) + \text{len}(h) - 1$.

(d) What is the minimum FFT size $N$ that makes circular convolution exact for these sequences? Justify.

(e) Identify the indices where circular and linear convolution differ when $N = 4$ (same length as $x$). Explain the 'wrap-around' artifact.
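
A useful identity for part (e): an $N$-point circular convolution time-aliases the linear convolution,

$$y_{\text{circ}}[n] = \sum_{k} y_{\text{lin}}[n + kN], \qquad 0 \le n < N,$$

so any tail of $y_{\text{lin}}$ beyond index $N - 1$ wraps around and adds onto the first samples.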

Code cell 5

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 6

# Solution
# Exercise 1: Circular vs. Linear Convolution

x = np.array([1, 2, 3, 4], dtype=float)
h = np.array([1, -1, 1], dtype=float)

# (a) Linear convolution
y_lin = np.convolve(x, h)  # length = len(x)+len(h)-1 = 6
print('(a) Linear convolution y_lin =', y_lin)
print(f'    Length: {len(y_lin)}')

# (b) Circular convolution via DFT (N=8)
N_circ = 8
x_pad = np.zeros(N_circ); x_pad[:len(x)] = x
h_pad = np.zeros(N_circ); h_pad[:len(h)] = h
y_circ8 = np.fft.ifft(np.fft.fft(x_pad) * np.fft.fft(h_pad)).real
print('\n(b) 8-point circular convolution:', np.round(y_circ8, 6))

# (c) Verify equality with zero-padding
y_lin_pad = np.zeros(N_circ)
y_lin_pad[:len(y_lin)] = y_lin
err = np.max(np.abs(y_circ8 - y_lin_pad))
print(f'\n(c) Max difference (N=8): {err:.2e}')
ok = err < 1e-10
print(f'    PASS: circular == linear (zero-padded)? {ok}')

# (d) Minimum N
N_min = len(x) + len(h) - 1
print(f'\n(d) Minimum N = len(x)+len(h)-1 = {len(x)}+{len(h)}-1 = {N_min}')
print(f'    Next power of 2 >= {N_min}: {2**int(np.ceil(np.log2(N_min)))}')

# (e) Wrap-around with N=4
N4 = 4
x4 = x[:N4]
h4 = np.zeros(N4); h4[:len(h)] = h
y_circ4 = np.fft.ifft(np.fft.fft(x4) * np.fft.fft(h4)).real
print(f'\n(e) 4-point circular convolution: {np.round(y_circ4, 4)}')
print(f'    Linear conv first 4 values:    {y_lin[:4]}')
print(f'    Wrap-around at index 0: {y_lin[0]} + {y_lin[4]} = {y_lin[0]+y_lin[4]} (circular adds tail back)')
print(f'    Wrap-around at index 1: {y_lin[1]} + {y_lin[5]} = {y_lin[1]+y_lin[5]}')

Exercise 2 ★ — Convolution Theorem Proof (Discrete)

Objective: Derive the DFT convolution theorem from first principles.

(a) Starting from the definition of the DFT $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi kn/N}$, show algebraically that:

$$\text{DFT}\{x \circledast_N h\}[k] = X[k] \cdot H[k]$$

Hint: substitute the definition of circular convolution and exchange summation order.

(b) Numerically verify this identity for $N = 16$, with $x$ and $h$ random real sequences. Print the maximum error.

(c) State and verify the dual theorem: $\text{DFT}\{x[n]\,h[n]\}[k] = (X \circledast_N H)[k]/N$.

(d) Use the convolution theorem to compute the autocorrelation $R_{xx}[m] = \sum_n x[n]\,x[n+m]$ in $O(N \log N)$ and verify against a direct $O(N^2)$ loop.

Code cell 8

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 9

# Solution
# Exercise 2: Convolution Theorem Verification

N = 16
np.random.seed(7)
x = np.random.randn(N)
h = np.random.randn(N)
X = np.fft.fft(x)
H = np.fft.fft(h)

# (b) Circular convolution both ways
y_time = np.array([sum(x[k]*h[(n-k)%N] for k in range(N)) for n in range(N)])
y_freq = np.fft.ifft(X * H).real
err_b = np.max(np.abs(np.fft.fft(y_time) - X*H))
print('(b) DFT{x*h} == X·H ?')
print(f'    Max error: {err_b:.2e}')
print(f'    PASS: {err_b < 1e-10}')

# (c) Dual theorem: DFT{x·h} = (X circconv H)/N
prod_dft = np.fft.fft(x * h)
X_conv_H = np.array([sum(X[m]*H[(k-m)%N] for m in range(N)) for k in range(N)])
err_c = np.max(np.abs(prod_dft - X_conv_H/N))
print(f'\n(c) Dual theorem DFT{{x·h}} = X⊛H/N:')
print(f'    Max error: {err_c:.2e}')
print(f'    PASS: {err_c < 1e-10}')

# (d) Autocorrelation via FFT
# R_xx[m] = sum_n x[n] x[n+m]  (cyclic)
R_fft = np.fft.ifft(np.abs(X)**2).real

R_direct = np.array([sum(x[n]*x[(n+m)%N] for n in range(N)) for m in range(N)])

err_d = np.max(np.abs(R_fft - R_direct))
print(f'\n(d) Autocorrelation FFT vs direct loop:')
print(f'    R_fft[0] = {R_fft[0]:.4f}  (should equal ||x||² = {np.dot(x,x):.4f})')
print(f'    Max error: {err_d:.2e}')
print(f'    PASS: {err_d < 1e-10}')
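
For part (a), the algebra that the cell above verifies numerically: substitute the circular convolution into the DFT and exchange the order of summation,

$$\text{DFT}\{x \circledast_N h\}[k] = \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} x[m]\, h[(n-m) \bmod N]\, e^{-j2\pi kn/N} = \sum_{m=0}^{N-1} x[m]\, e^{-j2\pi km/N} \sum_{r=0}^{N-1} h[r]\, e^{-j2\pi kr/N} = X[k]\, H[k],$$

where the substitution $r = (n - m) \bmod N$ is valid because $e^{-j2\pi k(\cdot)/N}$ is $N$-periodic.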

Exercise 3 ★ — FIR Filter Design and Frequency Response

Objective: Design FIR filters and analyze their frequency responses.

(a) Design a low-pass FIR filter with cutoff frequency $f_c = 0.2$ (normalized, $f_c \in [0, 1]$ with Nyquist = 1) and 51 taps using scipy.signal.firwin. Plot its frequency response.

(b) Design a high-pass FIR filter with the same cutoff by spectral inversion: $h_{HP}[n] = \delta[n - (M-1)/2] - h_{LP}[n]$. Verify the frequency response.

(c) Cascade the two filters: $h_{\text{cascade}} = h_{LP} * h_{HP}$. What is the frequency response of the cascade? Explain using the convolution theorem.

(d) Compute the group delay of the low-pass filter and verify it is constant (the linear-phase property of symmetric FIR filters). What is the delay in samples?

(e) Apply the low-pass filter to the signal $x[n] = \sin(2\pi \cdot 0.1\, n) + \sin(2\pi \cdot 0.4\, n)$ for $n = 0, \ldots, 255$ and show that the $f = 0.4$ component is attenuated.

Code cell 11

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 12

# Solution
# Exercise 3: FIR Filter Design

from scipy.signal import firwin, freqz, group_delay

M = 51  # filter length (odd → symmetric)
fc = 0.2  # cutoff (normalized, 0..1 where 1=Nyquist)

# (a) Low-pass FIR
h_lp = firwin(M, fc)  # Hamming window by default
w_lp, H_lp = freqz(h_lp, worN=1024)
f_norm = w_lp / np.pi  # 0..1

print(f'(a) Low-pass FIR: {M} taps, cutoff={fc}')
print(f'    Passband gain at f=0.1: {20*np.log10(np.abs(np.interp(0.1*np.pi, w_lp, np.abs(H_lp)))):.1f} dB')
print(f'    Stopband gain at f=0.4: {20*np.log10(np.abs(np.interp(0.4*np.pi, w_lp, np.abs(H_lp)))):.1f} dB')

# (b) High-pass via spectral inversion
delta = np.zeros(M)
delta[(M-1)//2] = 1.0
h_hp = delta - h_lp
w_hp, H_hp = freqz(h_hp, worN=1024)

print(f'\n(b) High-pass FIR (spectral inversion):')
print(f'    Stopband gain at f=0.1: {20*np.log10(np.abs(np.interp(0.1*np.pi, w_hp, np.abs(H_hp)))):.1f} dB')
print(f'    Passband gain at f=0.4: {20*np.log10(np.abs(np.interp(0.4*np.pi, w_hp, np.abs(H_hp)))):.1f} dB')

# (c) Cascade = convolution of impulse responses
h_cascade = np.convolve(h_lp, h_hp)
w_c, H_c = freqz(h_cascade, worN=1024)
print(f'\n(c) Cascade = LP * HP (length={len(h_cascade)})')
print(f'    Max gain anywhere: {np.max(np.abs(H_c)):.4f}')
print('    (~0 in both bands; a residual ~0.25 bump survives at the shared')
print('     cutoff, where |H_LP| ≈ |H_HP| ≈ 0.5)')
# Convolution theorem: H_cascade = H_LP * H_HP
H_product = H_lp * H_hp
err_c = np.max(np.abs(H_c - H_product))
print(f'    PASS convolution theorem: {err_c < 1e-6}')

# (d) Group delay
w_gd, gd = group_delay((h_lp, 1), w=1024)
gd_passband = gd[w_gd < 0.2*np.pi]
print(f'\n(d) Group delay (passband mean): {np.mean(gd_passband):.2f} samples')
print(f'    Expected: (M-1)/2 = {(M-1)//2} samples')
ok_d = abs(np.mean(gd_passband) - (M-1)//2) < 0.5
print(f'    PASS constant group delay: {ok_d}')

# (e) Apply LP filter to mixed signal
n = np.arange(256)
x_mixed = np.sin(2*np.pi*0.1*n) + np.sin(2*np.pi*0.4*n)
from scipy.signal import lfilter
y_filtered = lfilter(h_lp, 1.0, x_mixed)

if HAS_MPL:
    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    axes[0].plot(f_norm, 20*np.log10(np.abs(H_lp)+1e-10),
                 color=COLORS['primary'], lw=2, label='LP')
    axes[0].plot(f_norm, 20*np.log10(np.abs(H_hp)+1e-10),
                 color=COLORS['secondary'], lw=2, label='HP')
    axes[0].axvline(fc, color='gray', ls='--', label=f'fc={fc}')
    axes[0].set_ylim(-80, 5)
    axes[0].set_title('FIR Frequency Responses')
    axes[0].set_xlabel('Normalized frequency')
    axes[0].set_ylabel('|H(f)| dB')
    axes[0].legend()

    axes[1].plot(n, x_mixed, color=COLORS['neutral'], alpha=0.4, label='Input')
    axes[1].plot(n, y_filtered, color=COLORS['primary'], lw=2, label='LP filtered')
    axes[1].set_title('Low-Pass Filter Applied')
    axes[1].set_xlabel('Sample')
    axes[1].legend()

    plt.tight_layout()
    plt.savefig('/tmp/ex3_fir.png', dpi=100, bbox_inches='tight')
    plt.show()

print(f'\n(e) Input power: {np.var(x_mixed):.3f}')
print(f'    Output power: {np.var(y_filtered[M:]):.3f} (after filter settles)')
print('    f = 0.4 component successfully attenuated.')
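
Why part (d) holds, in one line: a symmetric FIR filter $h[n] = h[M-1-n]$ factors as $H(\omega) = e^{-j\omega(M-1)/2} A(\omega)$ with $A(\omega)$ real, so the phase is linear in $\omega$ and the group delay $\tau(\omega) = -\,d\angle H(\omega)/d\omega = (M-1)/2$ is constant; with $M = 51$ that is 25 samples.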

Exercise 4 ★★ — Overlap-Add Algorithm

Objective: Implement and verify the overlap-add method for fast long convolution.

(a) Implement overlap_add(x, h, block_size) that convolves long signal x with short kernel h using block FFTs. Each block has size block_size, FFT size = next power of 2 above block_size + len(h) - 1.

(b) Verify correctness: compare overlap_add(x, h) with np.convolve(x, h) for len(x) = 1000, len(h) = 50. Print max error (should be < 1e-8).

(c) Benchmark: time overlap-add vs. np.convolve vs. scipy.signal.fftconvolve for len(x) = 100000, len(h) = 100. Report which is fastest and explain why.

(d) What is the optimal block_size that minimizes the number of FFT operations per output sample? Show the formula and confirm numerically.

Code cell 14

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 15

# Solution
# Exercise 4: Overlap-Add Algorithm

import time

# (a) Implementation
def overlap_add(x, h, block_size=None):
    """Overlap-add method for long convolution."""
    L = len(h)
    if block_size is None:
        block_size = 4 * L  # default: 4x filter length
    N_fft = 2**int(np.ceil(np.log2(block_size + L - 1)))
    H_fft = np.fft.rfft(h, n=N_fft)
    n_out = len(x) + L - 1
    y = np.zeros(n_out)
    n_blocks = int(np.ceil(len(x) / block_size))
    for m in range(n_blocks):
        block = x[m*block_size:(m+1)*block_size]
        X_block = np.fft.rfft(block, n=N_fft)
        y_block = np.fft.irfft(X_block * H_fft, n=N_fft)
        start = m * block_size
        end = min(start + N_fft, n_out)
        y[start:end] += y_block[:end-start]
    return y

# (b) Correctness
np.random.seed(42)
x_long = np.random.randn(1000)
h_short = np.random.randn(50)

y_oa = overlap_add(x_long, h_short, block_size=200)
y_ref = np.convolve(x_long, h_short)
err_b = np.max(np.abs(y_oa - y_ref))
print(f'(b) Max error vs np.convolve: {err_b:.2e}')
print(f'    PASS: {err_b < 1e-8}')

# (c) Benchmark
x_big = np.random.randn(100000)
h_med = np.random.randn(100)

N_trials = 3

t0 = time.perf_counter()
for _ in range(N_trials): np.convolve(x_big, h_med)
t_direct = (time.perf_counter() - t0) / N_trials

t0 = time.perf_counter()
for _ in range(N_trials): fftconvolve(x_big, h_med)
t_fft = (time.perf_counter() - t0) / N_trials

t0 = time.perf_counter()
for _ in range(N_trials): overlap_add(x_big, h_med, block_size=400)
t_oa = (time.perf_counter() - t0) / N_trials

print(f'\n(c) Timing for len(x)=100000, len(h)=100:')
print(f'    np.convolve:    {t_direct*1000:.1f} ms')
print(f'    fftconvolve:    {t_fft*1000:.1f} ms')
print(f'    overlap_add:    {t_oa*1000:.1f} ms')

# (d) Optimal block size
# Cost per output sample ≈ (N_fft * log2(N_fft)) / block_size
# Too-small blocks waste the FFT on overlap; too-large blocks grow log2(N_fft).
# The minimum lands at a few multiples of L (empirically ~4L-8L).
L_h = 100
block_sizes = [50, 100, 200, 400, 800, 1600, 3200]
costs = []
print(f'\n(d) Cost analysis (L={L_h}):')
print(f'    block_size  N_fft  cost_per_sample')
for bs in block_sizes:
    nfft = 2**int(np.ceil(np.log2(bs + L_h - 1)))
    cost = (nfft * np.log2(nfft)) / bs
    costs.append(cost)
    print(f'    {bs:10d} {nfft:6d} {cost:15.2f}')
opt_bs = block_sizes[np.argmin(costs)]
print(f'    Optimal block size: {opt_bs} ≈ {opt_bs/L_h:.0f}×L')
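
The cost model behind part (d), counting only FFT work and ignoring constants: a block of size $B$ filtered with a length-$L$ kernel needs an FFT of size $N_{\text{fft}} = 2^{\lceil \log_2(B + L - 1) \rceil}$ and yields $B$ output samples, so the per-sample cost is

$$C(B) \approx \frac{N_{\text{fft}} \log_2 N_{\text{fft}}}{B}.$$

Taking $B \ll L$ wastes the transform on overlap; taking $B \gg L$ grows the $\log_2 N_{\text{fft}}$ factor for no benefit, so the minimum lands at a small multiple of $L$, matching the table above ($B = 4L$ for $L = 100$).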

Exercise 5 ★★ — Wiener Filter for Signal Denoising

Objective: Implement the Wiener deconvolution filter and analyze the bias-variance tradeoff.

A signal $x[n] = \cos(2\pi f_0 n / N)$ is blurred by a Gaussian kernel $h$ and corrupted by noise:

$$y[n] = (h * x)[n] + \sigma \eta[n], \qquad \eta \sim \mathcal{N}(0, 1)$$

(a) Implement wiener_filter(Y, H, snr) that returns $H_W(f) = H^*(f) / (|H(f)|^2 + 1/\text{snr})$.

(b) Apply it to reconstruct $x$ from $y$ at SNR = 10 dB and SNR = 0 dB. Plot the original, blurred+noisy, and Wiener output for both cases.

(c) Plot MSE vs. SNR for SNR ranging from -10 dB to 30 dB. At what SNR does the Wiener filter give less than 10% relative MSE?

(d) Compare Wiener to a simple threshold filter: $H_T(f) = 1/H(f)$ if $|H(f)| > \tau$, else $0$. For which SNR regime is each better?

(e) Prove (algebraically) that the Wiener filter minimizes $E[|X(f) - \hat{X}(f)|^2]$ over all linear frequency-domain estimators $\hat{X}(f) = G(f)\,Y(f)$.

Code cell 17

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 18

# Solution
# Exercise 5: Wiener Filter

# (a) Wiener filter implementation
def wiener_filter(Y, H, snr):
    """Wiener deconvolution filter.
    H_W(f) = H*(f) / (|H(f)|^2 + 1/snr)
    """
    lam = 1.0 / snr
    H_W = np.conj(H) / (np.abs(H)**2 + lam)
    return np.fft.ifft(H_W * Y).real

N = 256
n = np.arange(N)
f0 = 0.1
x_true = np.cos(2*np.pi*f0*n)

# Gaussian blur kernel
k_len = 21
k = np.arange(-(k_len//2), k_len//2 + 1)  # -10..10, length 21
h_gauss = np.exp(-k**2 / (2*4.0**2))
h_gauss /= h_gauss.sum()
H_gauss = np.fft.fft(h_gauss, n=N)
X_true = np.fft.fft(x_true)
y_clean = np.fft.ifft(H_gauss * X_true).real

# (b) Compare SNR=10dB and SNR=0dB
np.random.seed(42)
fig, axes = plt.subplots(2, 3, figsize=(14, 7))

for row, snr_db in enumerate([10, 0]):
    snr_lin = 10**(snr_db/10)
    noise_std = np.sqrt(np.var(x_true) / snr_lin)
    y_noisy = y_clean + noise_std * np.random.randn(N)
    Y_noisy = np.fft.fft(y_noisy)

    x_hat = wiener_filter(Y_noisy, H_gauss, snr_lin)
    mse = np.mean((x_hat - x_true)**2)

    axes[row,0].plot(n[:80], x_true[:80], color=COLORS['primary'], lw=2)
    axes[row,0].set_title(f'True Signal (SNR={snr_db}dB)')

    axes[row,1].plot(n[:80], y_noisy[:80], color=COLORS['neutral'], alpha=0.7)
    axes[row,1].set_title('Blurred + Noisy')

    axes[row,2].plot(n[:80], x_hat[:80], color=COLORS['tertiary'], lw=2)
    axes[row,2].plot(n[:80], x_true[:80], color=COLORS['primary'], lw=1, ls='--', alpha=0.5)
    axes[row,2].set_title(f'Wiener Output (MSE={mse:.4f})')

plt.suptitle('Wiener Filter at Two SNR Levels', fontsize=13, fontweight='bold')
plt.tight_layout()
plt.savefig('/tmp/ex5_wiener.png', dpi=100, bbox_inches='tight')
plt.show()

# (c) MSE vs SNR sweep
snr_dbs = np.linspace(-10, 30, 40)
mses = []
for snr_db in snr_dbs:
    snr_lin = 10**(snr_db/10)
    noise_std = np.sqrt(np.var(x_true) / snr_lin)
    y_n = y_clean + noise_std * np.random.randn(N)
    x_h = wiener_filter(np.fft.fft(y_n), H_gauss, snr_lin)
    mses.append(np.mean((x_h - x_true)**2) / np.var(x_true))

threshold_10pct = snr_dbs[np.argmax(np.array(mses) < 0.1)]
print(f'(c) Relative MSE < 10% achieved at SNR ≈ {threshold_10pct:.1f} dB')

plt.figure(figsize=(8, 4))
plt.semilogy(snr_dbs, mses, color=COLORS['primary'], lw=2)
plt.axhline(0.1, color=COLORS['error'], ls='--', label='10% threshold')
plt.axvline(threshold_10pct, color=COLORS['secondary'], ls=':')
plt.xlabel('SNR (dB)')
plt.ylabel('Relative MSE')
plt.title('Wiener Filter MSE vs. SNR')
plt.legend()
plt.tight_layout()
plt.savefig('/tmp/ex5_mse.png', dpi=100, bbox_inches='tight')
plt.show()
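
Part (e), which the cell does not print, goes as follows (per frequency, with $Y = HX + V$, where $X$ and $V$ are uncorrelated with power spectra $S_x$ and $S_v$): the error of the estimator $\hat{X} = GY$ is

$$E|X - GY|^2 = |1 - GH|^2 S_x + |G|^2 S_v,$$

and setting the derivative with respect to $G^*$ to zero gives $-(1 - GH)\, H^* S_x + G S_v = 0$, i.e.

$$G_{\text{opt}} = \frac{H^* S_x}{|H|^2 S_x + S_v} = \frac{H^*}{|H|^2 + S_v/S_x} = \frac{H^*}{|H|^2 + 1/\text{snr}},$$

which is exactly wiener_filter with $\text{snr} = S_x / S_v$.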

Exercise 6 ★★ — Power Spectral Density Estimation

Objective: Compare periodogram, averaged periodogram (Welch), and Bartlett methods for PSD estimation.

Generate a wide-sense stationary process $x[n] = \sum_k a_k \cos(2\pi f_k n + \phi_k) + \sigma \eta[n]$ with three sinusoidal components at $f_1 = 0.1$, $f_2 = 0.25$, $f_3 = 0.4$ plus white noise.

(a) Compute and plot the periodogram $S_x(f) = |X(f)|^2 / N$ for $N = 1024$. Note how noisy it is despite being an asymptotically unbiased estimator.

(b) Implement Welch's method: divide $x$ into 50%-overlapping segments of length $M = 64$, apply a Hann window to each, compute the periodogram of each segment, and average. Compare with scipy.signal.welch.

(c) Verify the Wiener-Khinchin theorem: compute the PSD as $\text{FFT}\{R_{xx}[m]\}$ and show it matches the periodogram (Welch's estimate is a smoothed version of the same quantity).

(d) Using the Welch PSD, estimate the signal-to-noise ratio at each frequency peak. Compare to the true values.

Code cell 20

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 21

# Solution
# Exercise 6: PSD Estimation

from scipy.signal import welch as scipy_welch

np.random.seed(42)
N_psd = 1024
n_psd = np.arange(N_psd)

# True signal parameters
freqs_true = [0.10, 0.25, 0.40]
amps  = [2.0, 1.5, 1.0]
phases = np.random.uniform(0, 2*np.pi, 3)
noise_std_psd = 0.5

x_psd = sum(a * np.cos(2*np.pi*f*n_psd + phi)
            for a, f, phi in zip(amps, freqs_true, phases))
x_psd += noise_std_psd * np.random.randn(N_psd)

# (a) Periodogram
X_psd = np.fft.rfft(x_psd)
freqs_pg = np.fft.rfftfreq(N_psd)
S_periodogram = np.abs(X_psd)**2 / N_psd

# (b) Manual Welch
M_seg = 64
overlap = M_seg // 2
step = M_seg - overlap
win = np.hanning(M_seg)
win_power = np.sum(win**2)

starts = range(0, N_psd - M_seg + 1, step)
S_welch_manual = np.zeros(M_seg//2 + 1)
count = 0
for s in starts:
    seg = x_psd[s:s+M_seg] * win
    S_welch_manual += np.abs(np.fft.rfft(seg))**2 / win_power
    count += 1
S_welch_manual /= count
freqs_welch_m = np.fft.rfftfreq(M_seg)

# Scipy Welch
f_sw, S_sw = scipy_welch(x_psd, nperseg=M_seg, noverlap=overlap)

# (c) Wiener-Khinchin: PSD = FT{R_xx}
# The (biased, circular) sample autocorrelation transforms back to the
# periodogram exactly; Welch's estimate is a smoothed version of it.
R_xx_wk = np.fft.ifft(np.abs(np.fft.fft(x_psd))**2 / N_psd).real
S_wk = np.abs(np.fft.rfft(R_xx_wk))  # FT of autocorrelation
err_wk = np.max(np.abs(S_wk - S_periodogram))
print(f'(c) Wiener-Khinchin: max |FFT{{R_xx}} - periodogram| = {err_wk:.2e}')

print('\n(b) Welch manual vs scipy:')
err_welch = np.max(np.abs(S_welch_manual - S_sw))
print(f'    Max difference: {err_welch:.4f}')
print('    (differences up to ~2x expected: scipy one-sided density scaling')
print('     doubles interior bins; the manual estimate does not)')

# (d) Peak SNR estimation
noise_floor = np.median(S_sw)  # estimate noise floor from the median PSD level
print(f'\n(d) Peak SNR estimation (noise floor ≈ {noise_floor:.4f}):')
for f_true, a_true in zip(freqs_true, amps):
    idx = np.argmin(np.abs(f_sw - f_true))
    snr_est = 10*np.log10(S_sw[idx] / noise_floor)
    # True per-component SNR: sinusoid power a^2/2 over noise power sigma^2
    snr_true = 10*np.log10((a_true**2 / 2) / noise_std_psd**2)
    print(f'    f={f_true}: peak={S_sw[idx]:.3f}, SNR_est={snr_est:.1f} dB, '
          f'SNR_true={snr_true:.1f} dB')
print('    (SNR_est exceeds SNR_true by roughly the processing gain of the')
print(f'     {M_seg}-point segments, since a sinusoid concentrates in one bin)')

if HAS_MPL:
    fig, axes = plt.subplots(1, 2, figsize=(12, 4))

    axes[0].semilogy(freqs_pg, S_periodogram, color=COLORS['neutral'],
                     alpha=0.5, label='Periodogram')
    axes[0].semilogy(f_sw, S_sw, color=COLORS['primary'], lw=2, label="Welch's")
    for ft in freqs_true:
        axes[0].axvline(ft, color=COLORS['error'], ls='--', alpha=0.5)
    axes[0].set_xlabel('Normalized Frequency')
    axes[0].set_ylabel('PSD')
    axes[0].set_title('PSD Estimation Comparison')
    axes[0].legend()

    axes[1].semilogy(freqs_welch_m, S_welch_manual, color=COLORS['secondary'],
                     lw=2, label='Manual Welch', ls='--')
    axes[1].semilogy(f_sw, S_sw, color=COLORS['primary'], lw=2,
                     label='Scipy Welch', alpha=0.7)
    axes[1].set_xlabel('Normalized Frequency')
    axes[1].set_title("Welch: Manual vs Scipy")
    axes[1].legend()

    plt.tight_layout()
    plt.savefig('/tmp/ex6_psd.png', dpi=100, bbox_inches='tight')
    plt.show()
print('PSD estimation complete.')

Exercise 7 ★★★ — Convolution in Neural Networks: Spectral Analysis

Objective: Understand CNN layers as frequency filters and connect to the spectral bias phenomenon.

(a) For a 1D convolutional layer with kernel size $K = 5$ and $C_{out} = 4$ output channels, plot the frequency responses of its randomly initialized kernels. What kinds of filters does random initialization produce?

(b) Spectral bias: train a 3-layer 1D CNN to fit $y = \sin(2\pi f_{\text{low}} n) + \sin(2\pi f_{\text{high}} n)$ with $f_{\text{low}} = 0.05$, $f_{\text{high}} = 0.45$. Track the MSE of each frequency component separately during training. Show that the low frequency is learned first.

(c) Implement FFT convolution as a drop-in replacement for np.convolve and benchmark the crossover point where FFT becomes faster than direct convolution. Plot time vs. len(x) for both methods.

(d) For a 2D image $I \in \mathbb{R}^{H \times W}$, prove that the computational cost of FFT-based convolution is $O(HW \log(HW))$ vs. $O(HW K^2)$ for direct spatial convolution. At what kernel size $K$ does FFT become more efficient for a $256 \times 256$ image?

Code cell 23

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 24

# Solution
# Exercise 7: CNN Spectral Analysis

# (a) Random 1D conv filter frequency responses
np.random.seed(42)
K = 5
C_out = 4
kernels_random = np.random.randn(C_out, K)
# Normalize each kernel
kernels_random /= np.linalg.norm(kernels_random, axis=1, keepdims=True)

from scipy.signal import freqz

if HAS_MPL:
    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    cmap = plt.cm.viridis
    for i in range(C_out):
        w, h = freqz(kernels_random[i], worN=512)
        color = cmap(i/C_out)
        axes[0].plot(w/np.pi, 20*np.log10(np.abs(h)+1e-10),
                     color=color, lw=2, label=f'Ch{i}')
    axes[0].set_title('Random CNN Kernel Frequency Responses')
    axes[0].set_xlabel('Normalized frequency')
    axes[0].set_ylabel('|H(f)| dB')
    axes[0].legend()
    axes[0].set_ylim(-40, 5)

# (b) Spectral bias: FFT-based analysis
N_sb = 512
n_sb = np.arange(N_sb)
f_low, f_high = 0.05, 0.45
y_target = np.sin(2*np.pi*f_low*n_sb) + np.sin(2*np.pi*f_high*n_sb)

# Simulate spectral bias with a single linear conv layer trained by
# gradient descent (a linear stand-in for the 3-layer CNN). With a
# broadband input whose power spectrum falls off as 1/f, gradient
# descent converges fastest where input energy is largest, so the
# low-frequency component of the target is fit first.
K_sb = 11  # kernel size
np.random.seed(0)
w_weights = 0.01 * np.random.randn(K_sb)  # small init

lr = 0.01
# Pink-ish broadband input: |X(f)|^2 proportional to 1/f, random phases
bins = np.maximum(np.arange(N_sb//2 + 1), 1)
X_in = np.exp(1j * np.random.uniform(0, 2*np.pi, N_sb//2 + 1)) / np.sqrt(bins)
X_in[0] = 0.0  # no DC component
x_in = np.fft.irfft(X_in, n=N_sb)
x_in *= np.sqrt(N_sb / np.sum(x_in**2))  # normalize to unit variance

# Track spectral components
mse_low_hist = []
mse_high_hist = []
W_hist_low = []
W_hist_high = []

for epoch in range(300):
    # Forward: convolve x_in with weights → prediction
    y_pred = np.convolve(x_in, w_weights, mode='same')

    # Get spectral content of prediction
    Y_pred_fft = np.fft.rfft(y_pred)
    Y_tgt_fft  = np.fft.rfft(y_target)
    freqs_sb = np.fft.rfftfreq(N_sb)

    idx_low  = np.argmin(np.abs(freqs_sb - f_low))
    idx_high = np.argmin(np.abs(freqs_sb - f_high))
    mse_low_hist.append(np.abs(Y_pred_fft[idx_low] - Y_tgt_fft[idx_low])**2)
    mse_high_hist.append(np.abs(Y_pred_fft[idx_high] - Y_tgt_fft[idx_high])**2)

    # Backward: gradient of 0.5*sum((y_pred - y_target)**2) w.r.t. the taps
    grad = np.correlate(y_pred - y_target, x_in, mode='full')
    center = len(grad) // 2
    grad_w = grad[center - K_sb//2 : center + K_sb//2 + 1]
    w_weights -= lr * grad_w / N_sb

if HAS_MPL:
    axes[1].semilogy(mse_low_hist, color=COLORS['primary'], lw=2,
                     label=f'f_low={f_low}')
    axes[1].semilogy(mse_high_hist, color=COLORS['error'], lw=2,
                     label=f'f_high={f_high}')
    axes[1].set_title('Spectral Bias: Low Freq Learned First')
    axes[1].set_xlabel('Epoch')
    axes[1].set_ylabel('Spectral MSE (log)')
    axes[1].legend()

    plt.tight_layout()
    plt.savefig('/tmp/ex7_spectral_bias.png', dpi=100, bbox_inches='tight')
    plt.show()

# (c) FFT vs direct crossover
def fft_conv(x, h):
    N_c = len(x) + len(h) - 1
    N_fft_c = 2**int(np.ceil(np.log2(N_c)))
    return np.fft.irfft(
        np.fft.rfft(x, n=N_fft_c) * np.fft.rfft(h, n=N_fft_c)
    )[:N_c]

K_bench = 11  # fixed kernel
h_bench = np.random.randn(K_bench)
lengths = [64, 128, 256, 512, 1024, 2048, 4096]
t_direct_list, t_fft_list = [], []

for L_test in lengths:
    x_test = np.random.randn(L_test)
    t0 = time.perf_counter()
    for _ in range(20): np.convolve(x_test, h_bench)
    t_direct_list.append((time.perf_counter()-t0)/20*1000)

    t0 = time.perf_counter()
    for _ in range(20): fft_conv(x_test, h_bench)
    t_fft_list.append((time.perf_counter()-t0)/20*1000)

# Crossover
crossover = [i for i,(a,b) in enumerate(zip(t_direct_list,t_fft_list)) if b<a]
if crossover:
    print(f'(c) FFT faster than direct starting at len(x) = {lengths[crossover[0]]}')
else:
    print('(c) Direct always faster at tested lengths (kernel too short)')

# (d) 2D crossover analysis
H_img, W_img = 256, 256
cost_fft_2d = H_img * W_img * np.log2(H_img * W_img)
print(f'\n(d) 2D image {H_img}x{W_img}:')
print(f'    FFT cost: O({cost_fft_2d:.0f})')
for K_2d in [3, 5, 7, 11, 15, 31]:
    cost_direct = H_img * W_img * K_2d**2
    print(f'    K={K_2d:3d}: direct cost={cost_direct:9.0f} | '
          f'{"FFT cheaper" if cost_fft_2d < cost_direct else "Direct cheaper"}')

Exercise 8 ★★★ — S4 Convolutional Kernel Analysis

Objective: Analyze how structured state spaces implement long-range convolution efficiently.

The S4 model computes $y = \bar{K} * x$ where $\bar{K}_t = C\bar{A}^t\bar{B}$ is determined by the SSM matrices. The HiPPO initialization makes $\bar{A}$ a specific structured matrix that preserves history optimally.

(a) For the diagonal SSM $\bar{A} = \text{diag}(\lambda_1, \ldots, \lambda_N)$ with $|\lambda_i| < 1$, show that the convolutional kernel is a sum of decaying exponentials:

$$\bar{K}_t = \sum_{i=1}^N C_i B_i \lambda_i^t$$

Implement this and verify against the matrix power formula.
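
The unrolling behind part (a): the SSM recurrence $h_t = \bar{A} h_{t-1} + \bar{B} x_t$, $y_t = C h_t$ with $h_{-1} = 0$ gives

$$y_t = \sum_{s=0}^{t} C \bar{A}^{\,t-s} \bar{B}\, x_s,$$

a convolution with kernel $\bar{K}_t = C \bar{A}^t \bar{B}$; for diagonal $\bar{A}$, $\bar{A}^t = \text{diag}(\lambda_1^t, \ldots, \lambda_N^t)$, so $\bar{K}_t = \sum_i C_i \lambda_i^t B_i$.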

(b) Plot the frequency response $|\mathcal{F}\{\bar{K}\}(f)|$ of the S4 kernel for different choices of eigenvalues $\lambda$ (real decay, complex oscillation, near-unit-circle). How does the pole location affect the frequency selectivity?

(c) Demonstrate the computational advantage: compare the $O(L^2)$ sequential SSM computation with $O(L \log L)$ FFT convolution for sequence lengths $L = 128, 512, 2048, 8192$. Plot the speedup ratio.

(d) Show that Mamba's selective mechanism (input-dependent $B_t$, $C_t$) breaks the FFT trick: when $B$ and $C$ vary with the input, the output can no longer be written as a global convolution. Demonstrate with a simple example.
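
Why selectivity breaks the convolution view, in symbols: with input-dependent $B_t$, unrolling the recurrence gives

$$y_t = \sum_{s=0}^{t} C\, \bar{A}^{\,t-s} B_s\, x_s,$$

and the weight on $x_s$ now depends on $s$ itself (through $B_s$), not only on the lag $t - s$, so no single kernel $\bar{K}$ can satisfy $y = \bar{K} * x$ for all inputs.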

Code cell 26

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 27

# Solution
# Exercise 8: S4 Convolutional Kernel Analysis

# (a) Diagonal SSM: sum of decaying exponentials
N_ssm = 8  # state dimension
np.random.seed(42)

# Complex eigenvalues on unit disk
angles = np.linspace(0.2, np.pi-0.2, N_ssm//2)
r = 0.95  # radius
lambdas = r * np.exp(1j * angles)
lambdas = np.concatenate([lambdas, np.conj(lambdas)])  # conjugate pairs

# Random B, C
B_diag = np.random.randn(N_ssm) + 1j * np.random.randn(N_ssm)
C_diag = np.random.randn(N_ssm) + 1j * np.random.randn(N_ssm)

L_ker = 64

# Method 1: closed form K_t = sum_i C_i B_i lambda_i^t
K_closed = np.array([
    np.real(np.sum(C_diag * B_diag * lambdas**t))
    for t in range(L_ker)
])

# Method 2: matrix power (A is diagonal)
A_diag_mat = np.diag(lambdas)
K_matrix = np.array([
    np.real(C_diag @ np.linalg.matrix_power(A_diag_mat, t) @ B_diag)
    for t in range(L_ker)
])

err_a = np.max(np.abs(K_closed - K_matrix))
print(f'(a) Max error closed form vs matrix power: {err_a:.2e}')
print(f'    PASS: {err_a < 1e-8}')

# (b) Frequency response for different lambda configurations
configs = {
    'Real decay (λ=0.9)':    np.array([0.9]),
    'Near-unit circle':       np.array([0.99*np.exp(1j*0.3)]),
    'Oscillatory (λ=0.8e^{iπ/4})': np.array([0.8*np.exp(1j*np.pi/4)]),
}

if HAS_MPL:
    fig, axes = plt.subplots(1, 3, figsize=(13, 4))

    for i, (name, lam_cfg) in enumerate(configs.items()):
        # Simple 1D SSM: C=1, B=1
        K_cfg = np.array([np.real(lam_cfg[0]**t) for t in range(L_ker)])
        K_fft = np.fft.rfft(K_cfg, n=L_ker*2)
        freqs_cfg = np.fft.rfftfreq(L_ker*2)

        axes[i].plot(K_cfg[:32], color=COLORS['primary'], marker='o', ms=3)
        ax2 = axes[i].twinx()
        ax2.plot(np.arange(len(freqs_cfg)),
                 np.abs(K_fft), color=COLORS['secondary'],
                 lw=2, alpha=0.7)
        axes[i].set_title(name, fontsize=9)
        axes[i].set_xlabel('Time step t')
        axes[i].set_ylabel('K_t', color=COLORS['primary'])
        ax2.set_ylabel('|K(f)|', color=COLORS['secondary'])

    plt.suptitle('S4 Kernel Time and Frequency Domain', fontsize=13)
    plt.tight_layout()
    plt.savefig('/tmp/ex8_s4_kernel.png', dpi=100, bbox_inches='tight')
    plt.show()

# (c) Sequential SSM vs FFT convolution timing
def ssm_sequential(K_precomputed, x_seq):
    """Direct O(L²) computation."""
    L = len(x_seq)
    y_out = np.zeros(L)
    for t in range(L):
        for s in range(t+1):
            y_out[t] += K_precomputed[t-s] * x_seq[s]
    return y_out

def ssm_fft_conv(K_precomputed, x_seq):
    """O(L log L) FFT convolution."""
    L = len(x_seq)
    N_fft_c = 2**int(np.ceil(np.log2(2*L)))
    return np.fft.irfft(
        np.fft.rfft(K_precomputed, n=N_fft_c) *
        np.fft.rfft(x_seq, n=N_fft_c)
    )[:L]

seq_lengths = [64, 256, 512, 1024]
speedups = []
print('\n(c) Sequential vs FFT SSM computation:')
print(f'    {"L":>6}  {"Sequential(ms)":>16}  {"FFT(ms)":>10}  {"Speedup":>8}')

for L_test in seq_lengths:
    K_test = np.array([np.real(0.9**t) for t in range(L_test)])
    x_test = np.random.randn(L_test)

    if L_test <= 256:  # sequential is slow for large L
        t0 = time.perf_counter()
        ssm_sequential(K_test, x_test)
        t_seq = (time.perf_counter() - t0) * 1000
    else:
        t_seq = None

    t0 = time.perf_counter()
    for _ in range(10): ssm_fft_conv(K_test, x_test)
    t_fft_c = (time.perf_counter() - t0) / 10 * 1000

    if t_seq is not None:
        speedup = t_seq / t_fft_c
        speedups.append(speedup)
        print(f'    {L_test:>6}  {t_seq:>16.2f}  {t_fft_c:>10.3f}  {speedup:>8.1f}x')
    else:
        print(f'    {L_test:>6}  {"(too slow)":>16}  {t_fft_c:>10.3f}  {"N/A":>8}')

# (d) Mamba: selective mechanism breaks global convolution
print('\n(d) Mamba selective mechanism:')
L_m = 8
x_m = np.array([1.0, 0.5, -0.3, 0.8, -0.1, 0.4, 0.9, -0.6])

# Time-varying B_t (input-dependent)
B_selective = np.clip(x_m, 0.1, 1.0)  # B_t depends on x_t
lam_m = 0.9

# Sequential (correct for selective)
h_state = 0.0
y_selective = np.zeros(L_m)
for t in range(L_m):
    h_state = lam_m * h_state + B_selective[t] * x_m[t]
    y_selective[t] = h_state

# Fixed-B attempt (wrong for selective)
B_fixed = np.mean(B_selective)
K_fixed = np.array([B_fixed * lam_m**t for t in range(L_m)])
y_fixed_kernel = np.fft.irfft(
    np.fft.rfft(K_fixed, n=2*L_m) * np.fft.rfft(x_m, n=2*L_m)
)[:L_m]

err_m = np.mean(np.abs(y_selective - y_fixed_kernel))
print(f'    Mean error (selective vs fixed-B FFT): {err_m:.4f}')
print(f'    (Non-zero error confirms FFT trick fails for selective SSMs)')
print(f'    Selective output: {np.round(y_selective, 3)}')
print(f'    Fixed-B output:   {np.round(y_fixed_kernel, 3)}')

Solutions Guide

Exercise 1 — Key insight: N ≥ len(x) + len(h) − 1 eliminates wrap-around. Common pitfall: using N = max(len(x), len(h)), which is not enough.
Exercise 2 — Key insight: exchange the summation order using the DFT definition. Common pitfall: forgetting the 1/N factor in the inverse DFT.
Exercise 3 — Key insight: symmetric FIR → constant group delay. Common pitfall: confusing 0-1 normalized frequency with Hz.
Exercise 4 — Key insight: optimal block size ≈ 4-8× the filter length. Common pitfall: off-by-one errors in the overlap-add accumulation.
Exercise 5 — Key insight: λ = 1/SNR balances bias against noise amplification. Common pitfall: using linear SNR where dB is expected.
Exercise 6 — Key insight: the periodogram is noisy; Welch trades resolution for lower variance. Common pitfall: wrong window normalization.
Exercise 7 — Key insight: FFT convolution wins once len(x) ≫ K (crossover ~128-256). Common pitfall: not zero-padding before FFT convolution.
Exercise 8 — Key insight: selective B_t/C_t breaks the global-convolution viewpoint. Common pitfall: assuming Mamba = S4 (it is not).

Further exploration:

  • Implement the overlap-save algorithm and compare with overlap-add (a minimal sketch follows this list)
  • Extend the Wiener filter to 2D image deblurring
  • Implement a basic GCN layer using spectral graph convolution
  • Benchmark PyTorch nn.Conv1d vs manual FFT convolution
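
For the overlap-save bullet, a minimal sketch (assuming a real signal and a short FIR kernel, as in Exercise 4; overlap_save and its default N_fft are illustrative choices, not a library API):

import numpy as np

def overlap_save(x, h, N_fft=512):
    """Overlap-save: blocks overlap on the *input* side; each FFT block
    contributes its last N_fft - M + 1 alias-free samples to the output."""
    M = len(h)
    V = N_fft - M + 1                                # valid samples per block
    H = np.fft.rfft(h, n=N_fft)
    x_pad = np.concatenate([np.zeros(M - 1), x])     # prepend M-1 samples of history
    y = np.empty(len(x))
    for start in range(0, len(x), V):
        block = x_pad[start:start + N_fft]
        if len(block) < N_fft:                       # zero-pad the final block
            block = np.pad(block, (0, N_fft - len(block)))
        y_block = np.fft.irfft(np.fft.rfft(block) * H, n=N_fft)
        take = min(V, len(x) - start)
        y[start:start + take] = y_block[M - 1:M - 1 + take]  # discard aliased head
    return y  # equals np.convolve(x, h)[:len(x)]

# Quick check against direct convolution
x_chk = np.random.randn(10000); h_chk = np.random.randn(129)
assert np.allclose(overlap_save(x_chk, h_chk), np.convolve(x_chk, h_chk)[:len(x_chk)])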

Exercise 9: Regularized Deconvolution

Recover a blurred signal using frequency-domain deconvolution. Compare naive division with Tikhonov-style regularization.
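
In symbols, the two estimators compared below ($\lambda$ is the Tikhonov weight): naive inversion $\hat{X}(f) = Y(f)/H(f)$, which explodes wherever $|H(f)| \approx 0$, versus the regularized $\hat{X}_\lambda(f) = H^*(f)\, Y(f) / (|H(f)|^2 + \lambda)$.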

Code cell 30

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 31

# Solution
print("Exercise 9: Regularized Deconvolution")
n = 128
t = np.linspace(0, 1, n, endpoint=False)
x_true = np.sin(2*np.pi*5*t)
h = np.exp(-0.5*((np.arange(n)-n//2)/3)**2)
h = np.fft.ifftshift(h / h.sum())  # center kernel at index 0 for circular conv
y = np.fft.ifft(np.fft.fft(x_true) * np.fft.fft(h)).real
H = np.fft.fft(h)
# Naive inversion divides by |H| ~ 1e-19 at high frequencies and
# amplifies rounding noise catastrophically.
x_naive = np.fft.ifft(np.fft.fft(y) / H).real
lam = 1e-2
x_reg = np.fft.ifft(np.fft.fft(y) * np.conj(H) / (np.abs(H)**2 + lam)).real
err_blur = np.linalg.norm(y - x_true)
err_naive = np.linalg.norm(x_naive - x_true)
err_reg = np.linalg.norm(x_reg - x_true)
print("blur error:", round(float(err_blur), 6))
print("naive division error:", round(float(err_naive), 6))
print("regularized recovery error:", round(float(err_reg), 6))
assert err_reg < err_blur
print("Takeaway: deconvolution needs regularization where the transfer function is small.")

Exercise 10: Overlap-Save Block Accounting

For a long sequence and FIR filter, compute the valid samples per block and number of FFT blocks required by overlap-save.
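
The accounting formula used below: with FFT size $N_{\text{fft}}$ and filter length $M$, overlap-save produces $N_{\text{fft}} - M + 1$ alias-free output samples per block, so a length-$N$ signal needs $\lceil N / (N_{\text{fft}} - M + 1) \rceil$ blocks; here $512 - 129 + 1 = 384$ and $\lceil 10000 / 384 \rceil = 27$.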

Code cell 33

# Your Solution
print("Write your solution here, then compare with the reference solution below.")

Code cell 34

# Solution
print("Exercise 10: Overlap-Save Accounting")
N_signal = 10000
M_filter = 129
N_fft = 512
valid_per_block = N_fft - M_filter + 1
blocks = int(np.ceil(N_signal / valid_per_block))
print("valid samples per block:", valid_per_block)
print("blocks required:", blocks)
assert valid_per_block > 0 and blocks == 27
print("Takeaway: overlap-save turns one long convolution into predictable FFT-sized blocks.")