SambaNova | The Fastest AI Inference Platform & Hardware

DeepSeek

We support the groundbreaking DeepSeek models, including the 671-billion-parameter DeepSeek-R1, which excels in coding, reasoning, and mathematics at a fraction of the cost of other models.

On our SambaNova RDU, DeepSeek-R1 achieves remarkable speeds of up to 200 tokens / second, as measured independently by Artificial Analysis.

Llama

As a launch partner for Meta's Llama 4 series, we've been at the forefront of open-source AI innovation. SambaCloud was the first platform to support all three variants of Llama 3.1 (8B, 70B, and 405B) with fast inference.

We are excited to work with Meta to deliver fast inference on both Scout and Maverick models.

OpenAI gpt-oss-120b

OpenAI recently released gpt-oss-120b, a model that delivers high accuracy in just 120-billion parameter with a Mixture of Experts (MoE) architecture.

As a small but efficient model, it runs extremely fast on SambaNova RDUs at over 600 tokens per second, making it a great choice for near real-time agentic AI.