Artificial Intelligence is constantly evolving. New LLMs are released almost daily, but only a few represent genuine breakthroughs. DeepSeek is one of them.
DeepSeek represents a notable step forward in this field, offering a series of models designed to enhance reasoning capabilities in large language models (LLMs).
The foundation of this work is DeepSeek-R1-Zero, the initial model in the series. It was trained using large-scale reinforcement learning (RL) without any supervised fine-tuning (SFT). This direct application of RL enabled the model to develop reasoning behaviors such as self-verification, reflection, and chain-of-thought (CoT) problem-solving. However, limitations such as repetitive outputs and occasional language inconsistencies highlighted areas for improvement.
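To make the RL setup more concrete: the DeepSeek-R1 report describes rule-based rewards (an accuracy reward for correct final answers and a format reward encouraging the model to wrap its reasoning in think tags) rather than a learned reward model. The sketch below is purely illustrative, not DeepSeek's actual implementation; the function names, the `\boxed{}` answer convention, and the `<think>` tag convention are assumptions for the example.

```python
import re

def accuracy_reward(completion: str, reference: str) -> float:
    """Toy rule-based accuracy reward: 1.0 if the final boxed
    answer in the completion matches the reference answer."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == reference.strip() else 0.0

def format_reward(completion: str) -> float:
    """Toy format reward: a small bonus when the completion
    contains a <think>...</think> chain-of-thought block."""
    return 0.5 if re.search(r"<think>.*?</think>", completion, re.DOTALL) else 0.0
```

During RL training, rewards like these would score each sampled completion, and the policy would be updated to make high-reward outputs more likely; because the rewards are simple rules rather than a neural model, they are cheap to compute and hard to reward-hack.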
The next iteration, DeepSeek-R1, addresses these challenges by incorporating a cold-start data phase before RL training. This adjustment improved performance across tasks like math, coding, and reasoning, bringing the model closer to the benchmarks set by OpenAI-o1.
From my own early testing, the model performs well on specialized tasks (like math) but noticeably worse on more general-purpose tasks.
Importantly, the DeepSeek series has been open-sourced, along with several distilled versions, making these advancements more accessible to the…