microgpt Documentation
A complete, working GPT model in ~250 lines of pure Python with zero dependencies.
What is microgpt?
microgpt is an educational implementation by Andrej Karpathy that demonstrates exactly how a language model works. No PyTorch, no NumPy—just Python's standard library.
```shell
python microgpt.py
```

That's it. The model downloads a dataset, trains, and generates text.
Documentation Sections
| Section | Description |
|---|---|
| Getting Started | Introduction and overview |
| Tokenization | Converting text to numbers |
| Foundations | Gradients and parameters |
| Architecture | The GPT model components |
| Training | How models learn |
| Autograd | Automatic differentiation |
| Generation | Text generation |
| Code Reference | Line-by-line explanation |
How It Works
The entire pipeline, from raw text to generated output:
- Tokenize - Convert characters to integers
- Embed - Convert integers to vectors
- Transform - Apply attention and MLP layers
- Predict - Generate probability distributions
- Train - Use backpropagation to improve
- Generate - Sample from the model
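The steps above can be sketched in plain Python, in the same zero-dependency spirit as the project. This is a conceptual illustration only; the names (`stoi`, `n_embd`, `softmax`) and values are assumptions for this sketch, not microgpt's actual code:

```python
import math
import random

text = "hello world"

# Tokenize: map each unique character to an integer id
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}
tokens = [stoi[ch] for ch in text]

# Embed: look up a small vector for each token id
# (randomly initialized here; training would adjust these values)
random.seed(0)
n_embd = 4
embedding = [[random.gauss(0, 0.02) for _ in range(n_embd)] for _ in vocab]
vectors = [embedding[t] for t in tokens]

# Predict / Generate: turn raw scores (logits) into a probability
# distribution with softmax, then sample the next token id from it
def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.1 * i for i in range(len(vocab))]  # dummy scores for illustration
probs = softmax(logits)
next_id = random.choices(range(len(vocab)), weights=probs)[0]
print(itos[next_id])
```

In the real model, the logits come from the transformer layers (attention and MLP blocks applied to the embedded tokens), and backpropagation adjusts the embedding and layer weights so those logits assign high probability to the character that actually comes next.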
Each component is explained in detail in the sections above.