AI research is usually a game for big tech companies with deep pockets. However, a team at UC Berkeley just flipped the script. They have replicated the core abilities of DeepSeek R1-Zero for just $30 (not a typo). Their project, called TinyZero, shows that advanced AI reasoning models don’t have to cost a fortune, and that AI research is becoming more accessible than ever.

Led by Jiayi Pan, the researchers set out to recreate DeepSeek’s reasoning model using reinforcement learning (RL). Instead of relying on expensive cloud services or massive compute power, they trained TinyZero with a basic language model, a simple prompt, and a reward system.

Pan shared his excitement on X (formerly Twitter), saying, “You can experience the ‘Aha’ moment yourself for < $30.” He also described TinyZero as the first open reproduction of reasoning models, highlighting how it learned to verify and refine its own answers.
How TinyZero was developed
To test the model, the researchers used a game called Countdown, in which players must reach a target number by combining a given set of numbers with basic math operations. At first, TinyZero guessed randomly. Over time, it learned to verify its answers, search for better ones, and adjust accordingly.
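To make the idea of a reward system for Countdown concrete, here is a minimal sketch of a rule-based reward in Python. It is not TinyZero's actual code: the function name `countdown_reward`, the 0/1 scoring, and the rule that every given number must be used exactly once are all assumptions made for illustration.

```python
import ast
import operator
from collections import Counter

# Binary operators allowed in a Countdown answer (assumed rule set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Evaluate a parsed arithmetic expression, recording each number it uses."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    # Anything else (names, calls, unary minus, ...) is rejected.
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Hypothetical reward: 1.0 if the expression hits the target
    using each given number exactly once, otherwise 0.0."""
    used = []
    try:
        tree = ast.parse(expression, mode="eval")
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    if Counter(used) != Counter(numbers):
        return 0.0
    return 1.0 if abs(value - target) < 1e-9 else 0.0


print(countdown_reward("(25 - 5) * 4 + 3", [3, 4, 5, 25], 83))
print(countdown_reward("25 * 4", [3, 4, 5, 25], 100))
```

The first call scores a correct answer and the second scores an answer that leaves numbers unused. Because the check is purely mechanical, a reward like this needs no human labels, which is part of what keeps this style of training cheap.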
They experimented with different model sizes, from 500 million parameters to 7 billion parameters. The results? Smaller models (0.5B parameters) just guessed answers and stopped. Larger models (1.5B+ parameters) learned to self-verify, refine solutions, and significantly improve accuracy.
What makes TinyZero truly impressive is how cheap it is next to commercial AI models. The figures are not like-for-like, since the first two are usage prices and the last is a one-time training cost, but they give a sense of scale:
- OpenAI’s API: $15 per million tokens
- DeepSeek-R1: $0.55 per million tokens
- TinyZero’s total cost: $30, as a one-time training cost
This means anyone, not just big tech, can experiment with AI reasoning models without breaking the bank.
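For a rough sense of scale, the short Python sketch below works out how many tokens the same $30 would buy at the per-token prices quoted above. The helper name `tokens_for_budget` is made up for this example, and the prices are simply the article's figures, not current rate cards.

```python
# Figures quoted in the article; API prices are per million tokens.
TRAINING_BUDGET_USD = 30.0
PRICE_PER_MILLION_TOKENS = {
    "OpenAI API": 15.00,
    "DeepSeek-R1": 0.55,
}


def tokens_for_budget(budget_usd, price_per_million):
    """Return how many million tokens a budget buys at a given price."""
    return budget_usd / price_per_million


for name, price in PRICE_PER_MILLION_TOKENS.items():
    millions = tokens_for_budget(TRAINING_BUDGET_USD, price)
    print(f"{name}: ${TRAINING_BUDGET_USD:.0f} buys about {millions:.1f} million tokens")
```

At those prices, $30 covers about 2 million tokens through OpenAI’s API or roughly 54.5 million through DeepSeek-R1, while TinyZero spends the same amount once on training.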
Availability
TinyZero is open-source and available on GitHub, so anyone can tinker with it. So far it has been tested only on the Countdown game, but Pan hopes the project will make reinforcement learning research more accessible.
Of course, it’s still early days. “One caveat, of course, is that it’s validated only in the Countdown task but not the general reasoning domain,” Pan admitted. But even with that limitation, the impact is clear: AI development doesn’t have to be expensive. With projects like TinyZero, affordable, open-source AI could be the future.