Papers
Select a paper to start implementing.
Less is More: Recursive Reasoning with Tiny Networks
Recursive reasoning with tiny networks, focusing on latent state updates and answer refinement.
Vision Transformer (ViT)
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale - applying Transformers directly to image patches for vision tasks.
Human-level Control through Deep Reinforcement Learning
Deep Q-Network (DQN) combining Q-learning with deep neural networks for end-to-end learning of action values from raw pixels.
Adam: A Method for Stochastic Optimization
Adaptive moment estimation optimizer combining the benefits of RMSProp and momentum, computing an individual adaptive learning rate for each parameter.
Long Short-Term Memory Networks
Implementing core LSTM components from scratch: LSTM cells with gates, forward/backward passes, BPTT, initialization, dropout masks, packed sequences, bidirectional LSTMs, and full LSTM blocks.
Generative Adversarial Networks
Framework for estimating generative models via an adversarial process.
Attention Is All You Need
The seminal transformer architecture replacing recurrence and convolutions entirely with self-attention mechanisms.
World Models
Training generative neural network models of popular reinforcement learning environments to learn a compressed representation of the spatial and temporal aspects of the environment.
Recurrent Neural Networks
Fundamental sequence-processing architecture forming the basis of later recurrent architectures such as LSTMs.