microgpt Documentation

Learn how microgpt works - a minimal GPT implementation in pure Python.

A complete, working GPT model in ~250 lines of pure Python with zero dependencies.

What is microgpt?

microgpt is an educational implementation by Andrej Karpathy that demonstrates exactly how a language model works. No PyTorch, no NumPy—just Python's standard library.

python microgpt.py

That's it. The model downloads a dataset, trains, and generates text.

Documentation Sections

Section            Description
Getting Started    Introduction and overview
Tokenization       Converting text to numbers
Foundations        Gradients and parameters
Architecture       The GPT model components
Training           How models learn
Autograd           Automatic differentiation
Generation         Text generation
Code Reference     Line-by-line explanation

How It Works

The entire pipeline:

  1. Tokenize - Convert characters to integers
  2. Embed - Convert integers to vectors
  3. Transform - Apply attention and MLP layers
  4. Predict - Generate probability distributions
  5. Train - Use backpropagation to improve
  6. Generate - Sample from the model

Each component is explained in detail in the sections listed above.
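As a rough sketch of the first and last steps of this pipeline, here is what character-level tokenization and sampling can look like in pure Python. The variable names below are illustrative, not the actual identifiers used in microgpt.py, and a uniform distribution stands in for the model's predicted probabilities:

```python
import random

text = "hello world"

# 1. Tokenize: map each unique character to an integer id.
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}  # char -> id
itos = {i: ch for ch, i in stoi.items()}      # id -> char
tokens = [stoi[ch] for ch in text]

# 6. Generate: sample the next token id from a probability
# distribution over the vocabulary (uniform here, standing in
# for the distribution a trained model would predict).
probs = [1.0 / len(vocab)] * len(vocab)
next_id = random.choices(range(len(vocab)), weights=probs, k=1)[0]
next_char = itos[next_id]
```

Decoding is the inverse lookup: joining `itos[t]` for each token id recovers the original text exactly, which is why character-level tokenizers need no special handling for unknown symbols within their vocabulary.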
