Videos
A short list, worth your time.
3:31:00 · Deep Dive
Deep Dive into LLMs like ChatGPT
Karpathy's 3.5-hour deep dive on the full training pipeline — pretraining, tokenization, fine-tuning, RLHF, and how to actually use these models.
Andrej Karpathy
59:48 · Foundations
[1hr Talk] Intro to Large Language Models
The single best high-signal intro to LLMs — what they are, how they're trained, where they're going, and the security model around them.
Andrej Karpathy
1:56:20 · Engineering
Let's build GPT: from scratch, in code, spelled out.
Build a working GPT from zero in PyTorch: self-attention, multiple heads, residual connections, layer norm, the works. Pair it with nanoGPT to keep going.
Andrej Karpathy
2:13:35 · Engineering
Let's build the GPT Tokenizer
Why tokenization is the source of half your weird LLM bugs, and how to build a BPE tokenizer end-to-end so you can debug them.
Andrej Karpathy