microgpt
Andrej Karpathy

a complete GPT in ~250 lines of Python

Zero dependencies. No PyTorch. No NumPy. Just Python's standard library. Learn how language models work from the ground up.

terminal
$ python microgpt.py
Downloading dataset... Done
Training... 10/10000
Loss: 4.23
"The " -> " dog"
Start Learning
Estimated time: 2 hours
250 Lines
0 Dependencies
1 File
MIT License

Minimal

One file. Zero dependencies. Pure Python standard library.

Educational

Every line is readable. No magic. No abstraction hiding the math.

Executable

Run it yourself. Watch it train. See the loss go down.

Foundational

Understand transformers from the ground up. Build your intuition.

How it works

From text to tokens to predictions

01 Tokenize
02 Embed
03 Transform
04 Predict
05 Train
06 Generate
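The first of the six steps above, tokenization, can be sketched in plain standard-library Python. The function names here are illustrative, not microgpt's actual API; microgpt may tokenize differently:

```python
def build_vocab(text):
    # 01 Tokenize: give each unique character an integer id
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text, stoi):
    # text -> list of token ids, one per character
    return [stoi[ch] for ch in text]

def decode(ids, itos):
    # list of token ids -> text
    return "".join(itos[i] for i in ids)

stoi, itos = build_vocab("the dog")
ids = encode("the dog", stoi)  # one integer id per character
```

A character-level vocabulary like this keeps the whole pipeline dependency-free: encoding and decoding are exact inverses, so `decode(encode(text))` round-trips the input.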
Comparison

Tiny compared to the giants

Library      Lines of Code   Dependencies   Files
microgpt     250 (this)      0              1
micrograd    ~100            0              1
PyTorch      ~1M             Many           Many
TensorFlow   ~500K           Many           Many

"If you really want to understand how GPT works, you should read the code. And if you want to understand the code, you should start here."

Andrej Karpathy, Computer Scientist
The Code

Readable. Editable. Yours.

Every function fits on your screen. No jumping between files. No hidden abstractions. Just clean, understandable Python.

View the source
microgpt.py
import random

class Module:
  def parameters(self):
    return []

class Linear(Module):
  def __init__(self, nin, nout):
    # pure-Python weight matrix and bias: no PyTorch, no NumPy
    self.W = [[random.gauss(0, 1) for _ in range(nout)] for _ in range(nin)]
    self.b = [0.0] * nout

  def __call__(self, x):
    # x @ W + b, written out with plain loops
    return [sum(x[i] * self.W[i][j] for i in range(len(x))) + self.b[j]
            for j in range(len(self.b))]

class Transformer(Module):
  def __init__(self):
    self.attention = SelfAttention()
    self.mlp = MLP()

  def __call__(self, x):
    x = self.attention(x)
    return self.mlp(x)
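The Transformer above delegates to a SelfAttention layer that is not shown in this excerpt. As a rough, dependency-free sketch of what a single-head causal self-attention layer can look like (this is an illustration, not microgpt's actual implementation):

```python
import math
import random

def softmax(xs):
    # numerically stable softmax over a list of floats
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def matvec(W, x):
    # W is a list of rows; returns W @ x
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

class SelfAttention:
    def __init__(self, dim):
        rand_mat = lambda: [[random.gauss(0, 0.02) for _ in range(dim)]
                            for _ in range(dim)]
        self.Wq, self.Wk, self.Wv = rand_mat(), rand_mat(), rand_mat()
        self.dim = dim

    def __call__(self, xs):
        # xs: list of token vectors; each position attends only to
        # positions <= t (the causal mask), so the model cannot peek ahead
        qs = [matvec(self.Wq, x) for x in xs]
        ks = [matvec(self.Wk, x) for x in xs]
        vs = [matvec(self.Wv, x) for x in xs]
        out = []
        for t, q in enumerate(qs):
            # scaled dot-product scores against all earlier positions
            scores = [sum(qi * ki for qi, ki in zip(q, ks[s])) / math.sqrt(self.dim)
                      for s in range(t + 1)]
            weights = softmax(scores)
            # weighted sum of value vectors
            ctx = [sum(weights[s] * vs[s][j] for s in range(t + 1))
                   for j in range(self.dim)]
            out.append(ctx)
        return out
```

Note that the first position attends only to itself, so its output is exactly its own value vector; later positions mix in context from everything before them.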

Ready to dive in?

Start with the basics and work your way through the entire pipeline.

Read the docs

Want the code?

Grab the source and run it yourself. It's just one Python file.

View on GitHub