🧠 GPT-152M — Trained From Scratch

A 152-million-parameter language model built with raw PyTorch and trained from scratch on 197M tokens of educational text (FineWeb-Edu). No pretrained weights were used.
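For a sense of where the 152M figure comes from, here is a rough parameter-count sketch for a GPT-2-style architecture. The exact GPT-152M hyperparameters are not listed in this README, so the depth, width, context length, and vocabulary size below are illustrative assumptions that happen to land near 152M (they also assume the output head is tied to the token embedding, a common choice).

```python
# Hypothetical GPT-2-style config; these values are assumptions for illustration,
# not the actual GPT-152M hyperparameters.
n_layer, n_embd, n_vocab, n_ctx = 16, 768, 50257, 1024

embed = n_vocab * n_embd + n_ctx * n_embd        # token + positional embeddings
per_block = 12 * n_embd**2 + 13 * n_embd         # attention (4d^2) + MLP (8d^2) + biases/LayerNorms
total = embed + n_layer * per_block + 2 * n_embd # + final LayerNorm; output head tied to embedding

print(f"~{total / 1e6:.1f}M parameters")          # ~152.8M for these assumed values
```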

For best results, use textbook-style prompts rather than search-engine queries.

The demo provides adjustable generation settings (sliders) and a set of example prompts that can be clicked to try; a small sampling sketch follows below.
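The demo's sliders are not labeled in this text, but temperature and top-k are the settings most commonly exposed in this kind of interface, so the sketch below shows how those two controls shape next-token sampling. The function name and default values are assumptions, not the demo's actual implementation.

```python
import torch

def sample_next_token(logits: torch.Tensor, temperature: float = 0.8, top_k: int = 40) -> torch.Tensor:
    """Pick a next-token id from a (vocab_size,) logits vector with temperature + top-k sampling."""
    logits = logits / max(temperature, 1e-6)                            # higher temperature flattens the distribution
    top_vals, _ = torch.topk(logits, top_k)                             # k-th largest logit sets the cutoff
    logits = logits.masked_fill(logits < top_vals[-1], float("-inf"))   # keep only the top-k candidates
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)                      # sampled token id, shape (1,)

# Toy call with random logits; a real call would pass the model's logits for the last position.
print(sample_next_token(torch.randn(50257)).item())
```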

Model: GPT-152M | Dataset: FineWeb-Edu (197M tokens) | Hardware: Free Kaggle T4 GPU (~8.5 hours) | Framework: PyTorch 2.9
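As a rough picture of how a 152M-parameter model fits an ~8.5-hour run on a single T4, the sketch below shows a mixed-precision PyTorch training loop. The real GPT-152M module and FineWeb-Edu data pipeline are not shown in this README, so a tiny stand-in model and random token batches are used purely to make the snippet runnable; the optimizer settings are assumptions as well.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
vocab_size, block_size = 50257, 256

# Stand-in for the GPT-152M nn.Module (not the actual architecture).
model = nn.Sequential(nn.Embedding(vocab_size, 128), nn.Linear(128, vocab_size)).to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
scaler = torch.amp.GradScaler(enabled=(device == "cuda"))  # fp16 helps a 152M model fit T4 memory

for step in range(3):  # a real run loops until the 197M-token budget is exhausted
    x = torch.randint(vocab_size, (8, block_size), device=device)  # (B, T) token ids
    y = torch.roll(x, shifts=-1, dims=1)                           # next-token targets
    with torch.amp.autocast(device_type=device, dtype=torch.float16, enabled=(device == "cuda")):
        logits = model(x)                                          # (B, T, vocab_size)
        loss = F.cross_entropy(logits.view(-1, vocab_size), y.view(-1))
    optimizer.zero_grad(set_to_none=True)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    print(f"step {step}: loss {loss.item():.3f}")
```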

⚠️ This model was trained for educational purposes. Outputs may be factually incorrect.