Gemini 3 Flash: frontier intelligence built for speed

Gemini 3 Flash is our latest model with frontier intelligence built for speed that helps everyone learn, build, and plan anything — faster.

Today, we're expanding the Gemini 3 model family with the release of Gemini 3 Flash, which offers frontier intelligence built for speed at a fraction of the cost. With this release, we’re making Gemini 3’s next-generation intelligence accessible to everyone across Google products.

Last month, we kicked off Gemini 3 with Gemini 3 Pro and Gemini 3 Deep Think mode, and the response has been incredible. Since launch day, we have been processing over 1 trillion tokens per day on our API. We’ve seen you use Gemini 3 to vibe code simulations to learn about complex topics, build and design interactive games, and understand all types of multimodal content.

With Gemini 3, we introduced frontier performance across complex reasoning, multimodal and vision understanding, and agentic and vibe coding tasks. Gemini 3 Flash retains this foundation, combining Gemini 3's Pro-grade reasoning with Flash-level latency, efficiency and cost. It not only brings improved reasoning to everyday tasks, but is also our most impressive model for agentic workflows.

Starting today, Gemini 3 Flash is rolling out to millions of people globally.

Gemini 3 Flash: frontier intelligence at scale

Gemini 3 Flash demonstrates that speed and scale don’t have to come at the cost of intelligence. It delivers frontier performance on PhD-level reasoning and knowledge benchmarks like GPQA Diamond (90.4%) and Humanity’s Last Exam (33.7% without tools), rivaling larger frontier models, and significantly outperforming even the best 2.5 model, Gemini 2.5 Pro, across a number of benchmarks. It also reaches state-of-the-art performance with an impressive score of 81.2% on MMMU Pro, comparable to Gemini 3 Pro.

Figure: A benchmark comparison table showing performance scores and prices for several language models, including Gemini 3 Flash, Gemini 3 Pro Thinking, Gemini 2.5 Flash Thinking, Gemini 2.5 Pro Thinking, Claude Sonnet 4.5, GPT-5.2 Extra high, and Grok 4.1 Fast, across tasks such as academic reasoning, scientific knowledge, math, multimodal understanding, coding, and long-context performance.

In addition to its frontier-level reasoning and multimodal capabilities, Gemini 3 Flash was built to be highly efficient, pushing the Pareto frontier of quality vs. cost and speed. When processing at the highest thinking level, Gemini 3 Flash is able to modulate how much it thinks. It may think longer for more complex use cases, but it also uses 30% fewer tokens on average than 2.5 Pro, as measured on typical traffic, to accurately complete everyday tasks with higher performance.

Gemini 3 Flash pushes the Pareto frontier on performance vs. cost and speed.

Performance here is measured by LMArena Elo score.

Figure: A scatter plot showing LMArena Elo Score versus price per million tokens for various language models, with a line highlighting the Pareto frontier through 'gemini-3-pro', 'gemini-3-flash', and 'gemini-3-flash-lite'.

Gemini 3 Flash’s strength lies in its raw speed, building on the Flash series that developers and consumers already love. It outperforms 2.5 Pro while being 3x faster (based on Artificial Analysis benchmarking) at a fraction of the cost. Gemini 3 Flash is priced at $0.50/1M input tokens and $3/1M output tokens (audio input remains at $1/1M input tokens).
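To make the pricing above concrete, here is a minimal Python sketch that estimates the cost of a request from its token counts. It uses only the three prices quoted in this post. The function name `estimate_cost` and the simple linear billing model are assumptions for illustration, and the sketch ignores any discounts or billing details not described here.

```python
# Prices quoted in this post, in US dollars per 1M tokens.
INPUT_PRICE_PER_M = 0.50        # text/image/video input
OUTPUT_PRICE_PER_M = 3.00       # output
AUDIO_INPUT_PRICE_PER_M = 1.00  # audio input

TOKENS_PER_M = 1_000_000


def estimate_cost(input_tokens, output_tokens, audio_input_tokens=0):
    """Estimate request cost in USD, assuming purely linear per-token billing.

    Hypothetical helper: this is an illustration, not an official billing formula.
    """
    total = (
        input_tokens * INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
        + audio_input_tokens * AUDIO_INPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # 200,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost(200_000, 50_000):.2f}")
```

Under these assumptions, a workload of 200,000 input tokens and 50,000 output tokens comes to $0.10 for input plus $0.15 for output, or $0.25 in total.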

For developers: intelligence that keeps up

Gemini 3 Flash is made for iterative development, offering Gemini 3’s Pro-grade coding performance with low latency — it’s able to reason and solve tasks quickly in high-frequency workflows. On SWE-bench Verified, a benchmark for evaluating coding agent capabilities, Gemini 3 Flash achieves a score of 78%, outperforming not only the 2.5 series, but also Gemini 3 Pro. It strikes an ideal balance for agentic coding, production-ready systems and responsive interactive applications.

Gemini 3 Flash’s strong performance in reasoning, tool use and multimodal capabilities is ideal for developers looking to do more complex video analysis, data extraction and visual Q&A. It can enable more intelligent applications, such as in-game assistants or A/B test experiments, that demand both quick answers and deep reasoning.

We’ve received a tremendous response from companies using Gemini 3 Flash. Companies like JetBrains, Bridgewater Associates, and Figma are already using it to transform their businesses, recognizing how its inference speed, efficiency and reasoning capabilities perform on par with larger models. Gemini 3 Flash is available today to enterprises via Vertex AI and Gemini Enterprise.

For everyone: Gemini 3 Flash is rolling out globally

Gemini 3 Flash is now the default model in the Gemini app, replacing 2.5 Flash. That means all of our Gemini users globally will get access to the Gemini 3 experience at no cost, giving their everyday tasks a major upgrade.

Because of Gemini 3 Flash’s incredible multimodal reasoning capabilities, you can use it to help you see, hear and understand any type of information faster. For example, you can ask Gemini to understand your videos and images and turn that content into a helpful and actionable plan in just a few seconds.

Or you can quickly build fun, useful apps from scratch using your voice without prior coding knowledge. Just dictate to Gemini on the go, and it can transform your unstructured thoughts into a functioning app in minutes.

Gemini 3 Flash is also starting to roll out as the default model for AI Mode in Search, with access for everyone around the world.

Building on the reasoning capabilities of Gemini 3 Pro, AI Mode with Gemini 3 Flash is more powerful at parsing the nuances of your question. It considers each aspect of your query to serve thoughtful, comprehensive responses that are visually digestible — pulling real-time local information and helpful links from across the web. The result effectively combines research with immediate action: you get an intelligently organized breakdown alongside specific recommendations — at the speed of Search.

This shines when tackling complex goals with multiple considerations, like planning a last-minute trip or quickly learning complex educational concepts.

Try Gemini 3 Flash today

Gemini 3 Flash is available now in preview via the Gemini API in Google AI Studio, Google Antigravity, Vertex AI and Gemini Enterprise. You can also access it through other developer tools like Gemini CLI and Android Studio. It’s also starting to roll out to everyone in the Gemini app and AI Mode in Search, bringing fast access to next-generation intelligence at no cost.
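For developers who want to try the Gemini API directly, the following Python sketch shows one way a request might be built and sent using only the standard library. This post does not name a model ID, so `MODEL_ID` below is an assumed placeholder; check the model list in Google AI Studio for the current identifier. The helper names `build_request` and `generate` are hypothetical, and the response parsing assumes the first candidate's first part contains text.

```python
import json
import urllib.request

# Assumption: placeholder model ID; this post does not state the identifier.
MODEL_ID = "gemini-3-flash-preview"
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_request(prompt, api_key, model=MODEL_ID):
    """Build (but do not send) a generateContent POST request."""
    url = f"{BASE_URL}/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


def generate(prompt, api_key):
    """Send the request and return the text of the first candidate.

    Assumes the first part of the first candidate is a text part.
    """
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        reply = json.load(resp)
    return reply["candidates"][0]["content"]["parts"][0]["text"]
```

Calling `generate("Explain the Pareto frontier in one sentence.", api_key)` with a valid key from Google AI Studio would send the request. Keeping `build_request` separate makes the payload easy to inspect without any network access.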

We’re looking forward to seeing what you bring to life with this expanded family of models: Gemini 3 Pro, Gemini 3 Deep Think and now, Gemini 3 Flash.