Axiom ❖
A decoder-only Transformer (GPT) language model built, trained, and deployed entirely from scratch using PyTorch.
This project demonstrates a complete LLM lifecycle:
- Synthetic Data Generation: Creates a "Pseudo-Wikipedia" dataset (100k samples) with Math, Code, Facts, and Stories.
- Model Implementation: A clean, educational implementation of the `GPT`, `CausalSelfAttention`, and `Block` layers in PyTorch (a minimal illustrative sketch of the attention layer follows this list).
- Training Loop: Custom training loop with the AdamW optimizer.
- Interactive UI: A Streamlit web app for real-time inference.
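For orientation, the sketch below shows what a causal self-attention layer in a model like this typically looks like. The default hyperparameters mirror the architecture described later in this README, but the names and details are illustrative, not copied from model.py.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal mask (illustrative sketch)."""

    def __init__(self, n_embd: int = 256, n_head: int = 8, block_size: int = 128):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        # Project the input to queries, keys, and values in a single matmul.
        self.qkv = nn.Linear(n_embd, 3 * n_embd)
        self.proj = nn.Linear(n_embd, n_embd)
        # Lower-triangular mask: position t may only attend to positions <= t.
        mask = torch.tril(torch.ones(block_size, block_size))
        self.register_buffer("mask", mask.view(1, 1, block_size, block_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # Reshape to (batch, heads, time, head_dim).
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # Scaled dot-product attention with the causal mask applied.
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)
```

In a typical implementation of this kind, a `Block` wraps such an attention layer together with a feed-forward MLP, layer norms, and residual connections, and the `GPT` module stacks several blocks on top of token and position embeddings.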
🚀 Quick Start
1. Install Dependencies
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
2. Generate Data
Create the synthetic training dataset (100,000 samples).
```bash
python generate_data.py
# Outputs: synthetic_data.txt
```
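The exact templates live in generate_data.py; as a rough, hypothetical illustration, template-based generation over the four categories (Math, Code, Facts, Stories) could look something like this:

```python
import random

# Hypothetical sketch only; the real generate_data.py may use different templates.
def make_sample() -> str:
    kind = random.choice(["math", "code", "fact", "story"])
    if kind == "math":
        a, b = random.randint(1, 99), random.randint(1, 99)
        return f"Q: What is {a} plus {b}? A: {a + b}"
    if kind == "code":
        return "def add(a, b):\n    return a + b"
    if kind == "fact":
        return "The sun is a star."
    return "Title: The River\nThe quiet river meets the old stone bridge."

if __name__ == "__main__":
    # Write 100,000 newline-separated samples, matching the dataset size above.
    with open("synthetic_data.txt", "w") as f:
        for _ in range(100_000):
            f.write(make_sample() + "\n")
```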
3. Train the Model
Train the 8-layer GPT model on your local machine (optimized for Mac M1/M2/M3).
```bash
python train.py
# Training takes ~5-10 mins on M1. Saves to: model.pt
```
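train.py contains the actual loop; a stripped-down sketch of a character-level training loop with AdamW looks roughly like the following (it assumes the model's forward pass returns logits of shape (batch, time, vocab) and that `data` is a 1-D tensor of token ids; names and defaults are illustrative):

```python
import torch
import torch.nn.functional as F

def train(model, data, steps=5000, block_size=128, batch_size=32, lr=3e-4):
    """Minimal AdamW training loop sketch for a character-level language model."""
    # Prefer the Apple-Silicon GPU (MPS) when available, otherwise fall back to CPU.
    device = "mps" if torch.backends.mps.is_available() else "cpu"
    model = model.to(device)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for step in range(steps):
        # Sample random contiguous chunks as (input, next-token target) pairs.
        ix = torch.randint(len(data) - block_size - 1, (batch_size,))
        x = torch.stack([data[i:i + block_size] for i in ix]).to(device)
        y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix]).to(device)
        logits = model(x)                                   # (B, T, vocab_size)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
        opt.zero_grad(set_to_none=True)
        loss.backward()
        opt.step()
        if step % 500 == 0:
            print(f"step {step}: loss {loss.item():.3f}")
    torch.save(model.state_dict(), "model.pt")
```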
4. Run the Interface
Launch the interactive playground in your browser.
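The launch command itself is not shown above; assuming app.py is the Streamlit entry point (see Project Structure below), it is started with the standard Streamlit CLI:

```bash
streamlit run app.py
```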
🧠 Model Architecture (Phase 3)
- Parameters: 6.39 Million
- Layers: 8
- Heads: 8
- Embedding Dim: 256
- Context Window: 128 tokens
- Tokenizer: Character-level (vocab size 67)
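These hyperparameters would typically be gathered into a small config object; the field names below are illustrative, while the values are taken from the list above:

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    n_layer: int = 8        # transformer blocks
    n_head: int = 8         # attention heads per block
    n_embd: int = 256       # embedding / model dimension
    block_size: int = 128   # context window in tokens
    vocab_size: int = 67    # character-level vocabulary
```

Assuming the standard 4x MLP expansion inside each block, these values work out to roughly 6.4M weights, consistent with the parameter count listed above.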
📂 Project Structure
- `model.py`: The core GPT architecture (PyTorch).
- `train.py`: Training script with data loader and optimizer.
- `data.py`: Dataset class and tokenizer utilities.
- `generate.py`: CLI script for text generation.
- `app.py`: Streamlit frontend.
- `tokenizer.py`: Character tokenizer implementation.
- `generate_data.py`: Synthetic data generator.
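`tokenizer.py` implements the character-level tokenizer; the core idea is just two lookup tables between characters and integer ids, roughly as sketched below (illustrative, not the repository's exact code):

```python
class CharTokenizer:
    """Maps each unique character to an integer id and back (illustrative sketch)."""

    def __init__(self, text: str):
        chars = sorted(set(text))  # e.g. 67 unique characters -> vocab size 67
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for i, ch in enumerate(chars)}
        self.vocab_size = len(chars)

    def encode(self, s: str) -> list[int]:
        return [self.stoi[c] for c in s]

    def decode(self, ids: list[int]) -> str:
        return "".join(self.itos[i] for i in ids)
```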
📝 Example Capabilities
The model learns the basic structure of English grammar and can perform simple recall:
Reflexive Q&A (Reasoning)
Pseudo-Fluent English (Creative Writing)
Title: The Sun
"The sad history helps the machine. The mountain writes the cell..."

Basic Code Syntax (Functions)
Built entirely from scratch by Gemini.

