Axiom ❖
A decoder-only Transformer (GPT) language model built, trained, and deployed entirely from scratch using PyTorch.
This project demonstrates a complete LLM lifecycle:
- Synthetic Data Generation: Creates a "Pseudo-Wikipedia" dataset (100k samples) with Math, Code, Facts, and Stories.
- Model Implementation: A clean, educational implementation of the `GPT`, `CausalSelfAttention`, and `Block` layers in PyTorch (a minimal illustrative sketch of the attention layer follows this list).
- Training Loop: Custom training loop with the AdamW optimizer.
- Interactive UI: A Streamlit web app for real-time inference.
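For orientation, the sketch below shows what a causal self-attention layer in a model like this typically looks like. The default hyperparameters mirror the architecture described later in this README, but the names and details are illustrative, not copied from model.py.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal mask (illustrative sketch)."""

    def __init__(self, n_embd: int = 256, n_head: int = 8, block_size: int = 128):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        # Project the input to queries, keys, and values in a single matmul.
        self.qkv = nn.Linear(n_embd, 3 * n_embd)
        self.proj = nn.Linear(n_embd, n_embd)
        # Lower-triangular mask: position t may only attend to positions <= t.
        mask = torch.tril(torch.ones(block_size, block_size))
        self.register_buffer("mask", mask.view(1, 1, block_size, block_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # Reshape to (batch, heads, time, head_dim).
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # Scaled dot-product attention with the causal mask applied.
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        y = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)
```

In a typical implementation of this kind, a `Block` wraps such an attention layer together with a feed-forward MLP, layer norms, and residual connections, and the `GPT` module stacks several blocks on top of token and position embeddings.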
🚀 Quick Start
1. Install Dependencies
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
2. Generate Data
Create the synthetic training dataset (100,000 samples).
```bash
python generate_data.py
# Outputs: synthetic_data.txt
```
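The exact templates live in generate_data.py; as a rough, hypothetical illustration, template-based generation over the four categories (Math, Code, Facts, Stories) could look something like this:

```python
import random

# Hypothetical sketch only; the real generate_data.py may use different templates.
def make_sample() -> str:
    kind = random.choice(["math", "code", "fact", "story"])
    if kind == "math":
        a, b = random.randint(1, 99), random.randint(1, 99)
        return f"Q: What is {a} plus {b}? A: {a + b}"
    if kind == "code":
        return "def add(a, b):\n    return a + b"
    if kind == "fact":
        return "The sun is a star."
    return "Title: The River\nThe quiet river meets the old stone bridge."

if __name__ == "__main__":
    # Write 100,000 newline-separated samples, matching the dataset size above.
    with open("synthetic_data.txt", "w") as f:
        for _ in range(100_000):
            f.write(make_sample() + "\n")
```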
3. Train the Model
Train the 8-layer GPT model on your local machine (optimized for Mac M1/M2/M3).
```bash
python train.py
# Training takes ~5-10 mins on M1. Saves to: model.pt
```
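train.py contains the actual loop; a stripped-down sketch of a character-level training loop with AdamW looks roughly like the following (it assumes the model's forward pass returns logits of shape (batch, time, vocab) and that `data` is a 1-D tensor of token ids; names and defaults are illustrative):

```python
import torch
import torch.nn.functional as F

def train(model, data, steps=5000, block_size=128, batch_size=32, lr=3e-4):
    """Minimal AdamW training loop sketch for a character-level language model."""
    # Prefer the Apple-Silicon GPU (MPS) when available, otherwise fall back to CPU.
    device = "mps" if torch.backends.mps.is_available() else "cpu"
    model = model.to(device)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for step in range(steps):
        # Sample random contiguous chunks as (input, next-token target) pairs.
        ix = torch.randint(len(data) - block_size - 1, (batch_size,))
        x = torch.stack([data[i:i + block_size] for i in ix]).to(device)
        y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix]).to(device)
        logits = model(x)                                   # (B, T, vocab_size)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
        opt.zero_grad(set_to_none=True)
        loss.backward()
        opt.step()
        if step % 500 == 0:
            print(f"step {step}: loss {loss.item():.3f}")
    torch.save(model.state_dict(), "model.pt")
```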
4. Run the Interface
Launch the interactive playground in your browser.
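The launch command itself is not shown above; assuming app.py is the Streamlit entry point (see Project Structure below), it is started with the standard Streamlit CLI:

```bash
streamlit run app.py
```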
🧠 Model Architecture (Phase 3)
- Parameters: 6.39 Million
- Layers: 8
- Heads: 8
- Embedding Dim: 256
- Context Window: 128 tokens
- Tokenizer: Character-level (vocab size 67)
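These hyperparameters would typically be gathered into a small config object; the field names below are illustrative, while the values are taken from the list above:

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    n_layer: int = 8        # transformer blocks
    n_head: int = 8         # attention heads per block
    n_embd: int = 256       # embedding / model dimension
    block_size: int = 128   # context window in tokens
    vocab_size: int = 67    # character-level vocabulary
```

Assuming the standard 4x MLP expansion inside each block, these values work out to roughly 6.4M weights, consistent with the parameter count listed above.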
📂 Project Structure
- `model.py`: The core GPT architecture (PyTorch).
- `train.py`: Training script with data loader and optimizer.
- `data.py`: Dataset class and tokenizer utilities.
- `generate.py`: CLI script for text generation.
- `app.py`: Streamlit frontend.
- `tokenizer.py`: Character tokenizer implementation.
- `generate_data.py`: Synthetic data generator.
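`tokenizer.py` implements the character-level tokenizer; the core idea is just two lookup tables between characters and integer ids, roughly as sketched below (illustrative, not the repository's exact code):

```python
class CharTokenizer:
    """Maps each unique character to an integer id and back (illustrative sketch)."""

    def __init__(self, text: str):
        chars = sorted(set(text))  # e.g. 67 unique characters -> vocab size 67
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for i, ch in enumerate(chars)}
        self.vocab_size = len(chars)

    def encode(self, s: str) -> list[int]:
        return [self.stoi[c] for c in s]

    def decode(self, ids: list[int]) -> str:
        return "".join(self.itos[i] for i in ids)
```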
📝 Example Capabilities
The model learns the basic structure of English grammar and can perform simple recall:
Reflexive Q&A (Reasoning)
Pseudo-Fluent English (Creative Writing)
Title: The Sun
"The sad history helps the machine. The mountain writes the cell..."

Basic Code Syntax (Functions)
Built entirely from scratch by Gemini.

