Math for LLMs

Chapter 20 - Fourier Analysis and Signal Processing

"The deep study of nature is the most fruitful source of mathematical discoveries."

  • Joseph Fourier, Théorie analytique de la chaleur, 1822

Overview

Fourier analysis is one of the most powerful and far-reaching ideas in all of mathematics: any periodic signal can be decomposed into a sum of sinusoidal oscillations, and any function can be represented as a continuous integral over frequencies. This chapter develops that idea from its origins in heat conduction through to its central role in modern large language models, convolutional neural networks, and sequence modeling.

The chapter follows a deliberate progression. We start with Fourier Series (Section 01) - the discrete-spectrum decomposition of periodic functions - and develop the convergence theory, the energy identity (Parseval), and the Gibbs phenomenon. We then take the period to infinity to obtain the Fourier Transform (Section 02), establishing its properties, the uncertainty principle, and the Plancherel theorem. Section 03 discretizes everything into the DFT and FFT, the algorithm that makes Fourier analysis computable in O(N \log N) time. Section 04 proves the Convolution Theorem - the single most important theorem in signal processing - and traces it through CNNs, state space models, and attention mechanisms. Finally, Section 05 introduces Wavelets, which overcome the Fourier transform's time-localization blind spot and underpin modern multi-scale architectures.
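The Gibbs phenomenon mentioned above is easy to observe numerically. As a quick sketch (NumPy is my choice here, not something the chapter prescribes), summing the odd-harmonic Fourier series of a unit square wave shows the overshoot that refuses to shrink as more terms are added:

```python
import numpy as np

def square_wave_partial_sum(t, n_terms):
    """Partial Fourier series of a unit square wave:
    f(t) = (4/pi) * sum over odd k of sin(k t) / k."""
    s = np.zeros_like(t)
    for k in range(1, 2 * n_terms, 2):  # odd harmonics 1, 3, 5, ...
        s += np.sin(k * t) / k
    return 4.0 / np.pi * s

t = np.linspace(0, np.pi, 10001)
approx = square_wave_partial_sum(t, 200)

# Gibbs phenomenon: near the jump the partial sums overshoot by
# roughly 8.95% of the jump height (2), i.e. about 0.179, no matter
# how many terms are used.
overshoot = approx.max() - 1.0
print(f"max overshoot: {overshoot:.4f}")
```

Doubling `n_terms` narrows the overshoot's width but does not reduce its height - the defining signature of the phenomenon.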

Throughout, every mathematical result is connected to concrete AI/ML systems: RoPE positional encodings in LLaMA-3, FNet's replacement of attention with FFTs, Whisper's mel-spectrogram preprocessing, WaveNet's causal convolutions, Mamba's selective state spaces, and Mallat's scattering networks.


Subsection Map

| # | Subsection | What It Covers | Key AI Connections |
|---|---|---|---|
| 01 | Fourier Series | Trigonometric basis, Fourier coefficients, convergence, Gibbs phenomenon, Parseval's theorem | RoPE (LLaMA-3), sinusoidal PE (Transformers), spectral bias |
| 02 | Fourier Transform | Continuous FT, inversion, Plancherel, uncertainty principle, FT properties | FNet, random Fourier features, spectral normalization |
| 03 | DFT and FFT | DFT matrix, Cooley-Tukey FFT, spectral leakage, windowing, STFT | Whisper mel spectrograms, Monarch matrices, FNO |
| 04 | Convolution Theorem | Convolution-multiplication duality, circular convolution, cross-correlation, deconvolution | CNNs, WaveNet, S4/Mamba, Hyena long convolutions |
| 05 | Wavelets | MRA, Haar/Daubechies, CWT/DWT, filter banks, perfect reconstruction | Scattering networks, wavelet attention, JPEG 2000 |

Reading Order and Dependencies

01-Fourier-Series            (trigonometric basis, Fourier coefficients - start here)
        v
02-Fourier-Transform         (continuous FT; builds directly on Section 01)
        v
03-DFT-and-FFT               (discretizes Section 02; Cooley-Tukey algorithm)
        v
04-Convolution-Theorem       (uses FT from Section 02 and DFT from Section 03)
        v
05-Wavelets                  (overcomes FT limitations; uses MRA and filter banks from Section 04)
        v
21-Statistical-Learning-Theory   (next chapter)

How the Subsections Relate

Section 01 vs Section 02: Fourier Series is for periodic functions; the Fourier Transform handles aperiodic signals. Section 02 is derived from Section 01 by letting the period T \to \infty - the discrete sum of harmonics becomes a continuous integral. Parseval's theorem appears in both.
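That limit can be checked numerically. In the sketch below (my construction, not from the chapter) I take the Gaussian g(t) = e^{-\pi t^2}, whose continuous Fourier transform is G(\nu) = e^{-\pi \nu^2}, and verify that the scaled Fourier-series coefficients T·c_k of the period-T restriction sample G at \nu = k/T:

```python
import numpy as np

# g(t) = exp(-pi t^2) has continuous Fourier transform G(nu) = exp(-pi nu^2).
# Restrict g to one period [-T/2, T/2] and compute Fourier-series
# coefficients c_k; then T * c_k approximates G(k / T) - the discrete
# line spectrum samples the continuous spectrum, ever more finely as T grows.
T = 20.0
t = np.linspace(-T / 2, T / 2, 200001)
dt = t[1] - t[0]
g = np.exp(-np.pi * t**2)

for k in (0, 5, 20):
    # Riemann-sum approximation of (1/T) * integral of g(t) e^{-2 pi i k t / T}
    c_k = np.sum(g * np.exp(-2j * np.pi * k * t / T)) * dt / T
    print(f"nu = {k / T:5.2f}: T*c_k = {(T * c_k).real:.6f}, "
          f"G(nu) = {np.exp(-np.pi * (k / T) ** 2):.6f}")
```

The two columns agree to many digits because the Gaussian is effectively zero outside [-T/2, T/2], so the truncation error in the limit argument is negligible.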

Section 02 vs Section 03: The DFT in Section 03 is the numerically computable version of the continuous FT in Section 02. The relationship is: sample a continuous signal at N points -> apply DFT -> approximate the continuous spectrum. The FFT makes this feasible in O(N \log N).
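As a minimal sanity check (assuming NumPy), the FFT returns exactly the DFT defined by the N×N matrix of roots of unity - it just computes it in O(N log N) instead of O(N^2):

```python
import numpy as np

# X[k] = sum_n x[n] exp(-2 pi i k n / N), computed two ways:
# the O(N^2) matrix product and the O(N log N) FFT.
N = 256
x = np.random.default_rng(0).standard_normal(N)

n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)  # DFT matrix
X_slow = W @ x

X_fast = np.fft.fft(x)
print(np.allclose(X_slow, X_fast))  # True
```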

Section 03 vs Section 04: The Convolution Theorem in Section 04 is what makes the DFT from Section 03 practically useful. Circular convolution in the DFT domain becomes pointwise multiplication - reducing O(N^2) convolution to O(N \log N) FFT-based convolution.
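A sketch of that reduction (NumPy, my choice of tooling): circular convolution computed directly from its definition matches one inverse FFT of the pointwise product of spectra:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(128)
h = rng.standard_normal(128)
N = len(x)

# Direct O(N^2) circular convolution from the definition.
direct = np.array([sum(x[m] * h[(n - m) % N] for m in range(N))
                   for n in range(N)])

# Convolution theorem: DFT(x circ h) = DFT(x) * DFT(h), so a single
# inverse FFT of the pointwise product gives the same result in O(N log N).
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real

print(np.allclose(direct, via_fft))  # True
```

For *linear* (non-circular) convolution the same trick works after zero-padding both signals to length at least 2N - 1.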

Section 04 vs Section 05: The Convolution Theorem in Section 04 characterizes LTI systems. Wavelets in Section 05 use iterated convolution with filter banks (low-pass / high-pass) to build multi-scale signal decompositions. The perfect reconstruction conditions in Section 05 are the synthesis side of Section 04's filter theory.
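A one-level Haar filter bank makes the perfect-reconstruction idea concrete. The sketch below (my simplified construction, using the orthonormal Haar pair rather than anything chapter-specific) splits a signal into low-pass averages and high-pass differences, then reconstructs it exactly:

```python
import numpy as np

def haar_analysis(x):
    """One level of the Haar filter bank: low-pass averages and
    high-pass differences, each downsampled by 2 (orthonormal scaling)."""
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)   # low-pass branch
    detail = (even - odd) / np.sqrt(2)   # high-pass branch
    return approx, detail

def haar_synthesis(approx, detail):
    """Perfect-reconstruction synthesis: upsample and recombine."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

x = np.random.default_rng(2).standard_normal(64)
a, d = haar_analysis(x)
print(np.allclose(haar_synthesis(a, d), x))  # True: perfect reconstruction
```

Because the Haar pair is orthonormal, the decomposition also preserves energy (a discrete Parseval identity): the squared norms of `a` and `d` sum to that of `x`. Iterating the analysis on `a` yields the multi-scale decomposition of the MRA.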

Sections 01-04 vs Section 05: Fourier analysis gives global frequency information with no time localization. Wavelets sacrifice some frequency precision to gain time localization - a direct consequence of the uncertainty principle from Section 02.
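The blindness to time is easy to demonstrate (a NumPy sketch of my own construction): two signals that play the same two tones in opposite order have identical Fourier magnitude spectra:

```python
import numpy as np

fs = 1000                       # sample rate (Hz)
t = np.arange(fs) / fs          # 1 second of samples
t1, t2 = t[:fs // 2], t[fs // 2:]

# Same two tones (50 Hz and 120 Hz), opposite temporal order.
a = np.concatenate([np.sin(2 * np.pi * 50 * t1), np.sin(2 * np.pi * 120 * t2)])
b = np.concatenate([np.sin(2 * np.pi * 120 * t1), np.sin(2 * np.pi * 50 * t2)])

# The global Fourier magnitude spectrum reports *which* frequencies
# occur, not *when*: both signals produce the same magnitudes.
spec_a = np.abs(np.fft.rfft(a))
spec_b = np.abs(np.fft.rfft(b))
print(np.allclose(spec_a, spec_b))  # True - the time order is invisible
```

Distinguishing the two requires a time-localized analysis: the STFT from Section 03 or the wavelet transform from Section 05.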


What Belongs Where

| Topic | Canonical Home | Referenced In |
|---|---|---|
| Trigonometric Fourier series, Gibbs phenomenon | Section 01 | Section 02 (limit argument) |
| Fourier Transform, Plancherel, uncertainty principle | Section 02 | Section 03 (discretization), Section 04 (proof) |
| DFT matrix, FFT algorithm, spectral leakage | Section 03 | Section 04 (circular convolution) |
| Convolution theorem, cross-correlation, filter design | Section 04 | Section 03 (circular), Section 05 (filter banks) |
| Wavelet transform, MRA, Daubechies wavelets | Section 05 | Section 04 (filter bank connection) |
| L^2 Hilbert space theory | Section 12-02 (Hilbert Spaces) | Section 01 preview |
| Spectral graph convolution (GCN, ChebNet) | Section 11-04 (Spectral Graph Theory) | Section 04 brief reference |
| Kernel methods, RKHS, Bochner's theorem | Section 12-03 (Kernel Methods) | Section 02 brief reference |

Prerequisites

Before starting this chapter:

  • Calculus - integration, complex exponentials, Taylor series (Chapter 4)
  • Linear Algebra - matrices, inner products, eigenvalues (Chapter 2-3)
  • Complex numbers - e^{i\theta} = \cos\theta + i\sin\theta, modulus, argument (Chapter 1)
  • L^2 function spaces - inner product, norm, completeness (Chapter 12, Section 02 Hilbert Spaces)

Sections 01-03 need primarily calculus and complex analysis. Sections 04-05 additionally need the abstract framework from Chapter 12.


After This Chapter

This chapter prepares you for:

  • Chapter 21 - Statistical Learning Theory (the next chapter)

<- Back to Curriculum Home | Previous Chapter: Production ML and MLOps <- | Next Chapter: Statistical Learning Theory ->