🧠 GPT-152M — Trained From Scratch

A 152-million-parameter language model built with raw PyTorch and trained from scratch on 197M tokens of educational text (FineWeb-Edu). No pretrained weights were used.
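For a sense of where the 152M figure comes from, here is a rough parameter-count sketch for a GPT-2-style architecture. The exact GPT-152M hyperparameters are not listed in this README, so the depth, width, context length, and vocabulary size below are illustrative assumptions that happen to land near 152M (they also assume the output head is tied to the token embedding, a common choice).

```python
# Hypothetical GPT-2-style config; these values are assumptions for illustration,
# not the actual GPT-152M hyperparameters.
n_layer, n_embd, n_vocab, n_ctx = 16, 768, 50257, 1024

embed = n_vocab * n_embd + n_ctx * n_embd        # token + positional embeddings
per_block = 12 * n_embd**2 + 13 * n_embd         # attention (4d^2) + MLP (8d^2) + biases/LayerNorms
total = embed + n_layer * per_block + 2 * n_embd # + final LayerNorm; output head tied to embedding

print(f"~{total / 1e6:.1f}M parameters")          # ~152.8M for these assumed values
```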

For best results, use textbook-style prompts rather than search-engine queries.

The demo provides adjustable generation settings (sliders) and a set of example prompts that can be clicked to try; a small sampling sketch follows below.
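The demo's sliders are not labeled in this text, but temperature and top-k are the settings most commonly exposed in this kind of interface, so the sketch below shows how those two controls shape next-token sampling. The function name and default values are assumptions, not the demo's actual implementation.

```python
import torch

def sample_next_token(logits: torch.Tensor, temperature: float = 0.8, top_k: int = 40) -> torch.Tensor:
    """Pick a next-token id from a (vocab_size,) logits vector with temperature + top-k sampling."""
    logits = logits / max(temperature, 1e-6)                            # higher temperature flattens the distribution
    top_vals, _ = torch.topk(logits, top_k)                             # k-th largest logit sets the cutoff
    logits = logits.masked_fill(logits < top_vals[-1], float("-inf"))   # keep only the top-k candidates
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)                      # sampled token id, shape (1,)

# Toy call with random logits; a real call would pass the model's logits for the last position.
print(sample_next_token(torch.randn(50257)).item())
```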

Model: GPT-152M | Dataset: FineWeb-Edu (197M tokens) | Hardware: Free Kaggle T4 GPU (~8.5 hours) | Framework: PyTorch 2.9
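As a rough picture of how a 152M-parameter model fits an ~8.5-hour run on a single T4, the sketch below shows a mixed-precision PyTorch training loop. The real GPT-152M module and FineWeb-Edu data pipeline are not shown in this README, so a tiny stand-in model and random token batches are used purely to make the snippet runnable; the optimizer settings are assumptions as well.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
vocab_size, block_size = 50257, 256

# Stand-in for the GPT-152M nn.Module (not the actual architecture).
model = nn.Sequential(nn.Embedding(vocab_size, 128), nn.Linear(128, vocab_size)).to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
scaler = torch.amp.GradScaler(enabled=(device == "cuda"))  # fp16 helps a 152M model fit T4 memory

for step in range(3):  # a real run loops until the 197M-token budget is exhausted
    x = torch.randint(vocab_size, (8, block_size), device=device)  # (B, T) token ids
    y = torch.roll(x, shifts=-1, dims=1)                           # next-token targets
    with torch.amp.autocast(device_type=device, dtype=torch.float16, enabled=(device == "cuda")):
        logits = model(x)                                          # (B, T, vocab_size)
        loss = F.cross_entropy(logits.view(-1, vocab_size), y.view(-1))
    optimizer.zero_grad(set_to_none=True)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    print(f"step {step}: loss {loss.item():.3f}")
```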

⚠️ This model was trained for educational purposes. Outputs may be factually incorrect.