Google launches Gemini 3.1 Pro, retaking AI crown with 2X+ reasoning performance boost

Late last year, Google briefly took the crown for most powerful AI model in the world with the launch of Gemini 3 Pro — only to be surpassed within weeks by OpenAI and Anthropic releasing new models, s is common in the fiercely competitive AI race.Now Google is back to retake the throne with an updated version of that flagship model: Gemini 3.1 Pro, positioned as a smarter baseline for tasks where a simple response is insufficient—targeting science, research, and engineering workflows that demand deep planning and synthesis.Already, evaluations by third-party firm Artificial Analysis show that Google's Gemini 3.1 Pro has leapt to the front of the pack and is once more the most powerful and performant AI model in the world. A big leap in core reasoningThe most significant advancement in Gemini 3.1 Pro lies in its performance on rigorous logic benchmarks. Most notably, the model achieved a verified score of 77.1% on ARC-AGI-2. This specific benchmark is designed to evaluate a model's ab

Google launches Gemini 3.1 Pro, retaking AI crown with 2X+ reasoning performance boost

Late last year, Google briefly took the crown for most powerful AI model in the world with the launch of Gemini 3 Pro — only to be surpassed within weeks by OpenAI and Anthropic releasing new models, s is common in the fiercely competitive AI race.

Now Google is back to retake the throne with an updated version of that flagship model: Gemini 3.1 Pro, positioned as a smarter baseline for tasks where a simple response is insufficient—targeting science, research, and engineering workflows that demand deep planning and synthesis.

Already, evaluations by third-party firm Artificial Analysis show that Google's Gemini 3.1 Pro has leapt to the front of the pack and is once more the most powerful and performant AI model in the world.

A big leap in core reasoning

The most significant advancement in Gemini 3.1 Pro lies in its performance on rigorous logic benchmarks. Most notably, the model achieved a verified score of 77.1% on ARC-AGI-2.

This specific benchmark is designed to evaluate a model's ability to solve entirely new logic patterns it has not encountered during training.

This result represents more than double the reasoning performance of the previous Gemini 3 Pro model.

Beyond abstract logic, internal benchmarks indicate that 3.1 Pro is highly competitive across specialized domains:

  • Scientific Knowledge: It scored 94.3% on GPQA Diamond.

  • Coding: It reached an Elo of 2887 on LiveCodeBench Pro and scored 80.6% on SWE-Bench Verified.

  • Multimodal Understanding: It achieved 92.6% on MMMLU.

These technical gains are not just incremental; they represent a refinement in how the model handles "thinking" tokens and long-horizon tasks, providing a more reliable foundation for developers building autonomous agents.

Improved vibe coding and 3D synthesis

Google is demonstrating the model’s utility through "intelligence applied"—shifting the focus from chat interfaces to functional outputs.

One of the most prominent features is the model's ability to generate "vibe-coded" animated SVGs directly from text prompts. Because these are code-based rather than pixel-based, they remain scalable and maintain tiny file sizes compared to traditional video, boasting far more detailed, presentable and professional visuals for websites and presentations and other enterprise applications.

Other showcased applications include:

  • Complex System Synthesis: The model successfully configured a public telemetry stream to build a live aerospace dashboard visualizing the International Space Station’s orbit.

  • Interactive Design: In one demo, 3.1 Pro coded a complex 3D starling murmuration that users can manipulate via hand-tracking, accompanied by a generative audio score.

  • Creative Coding: The model translated the atmospheric themes of Emily Brontë’s Wuthering Heights into a functional, modern web design, demonstrating an ability to reason through tone and style rather than just literal text.

Business impact and community reactions

Enterprise partners have already begun integrating the preview version of 3.1 Pro, reporting noticeable improvements in reliability and efficiency.

Vladislav Tankov, Director of AI at JetBrains, noted a 15% quality improvement over previous versions, stating the model is "stronger, faster... and more efficient, requiring fewer output tokens". Other industry reactions include:

  • Databricks: CTO Hanlin Tang reported that the model achieved "best-in-class results" on OfficeQA, a benchmark for grounded reasoning across tabular and unstructured data.

  • Cartwheel: Co-founder Andrew Carr highlighted the model's "substantially improved understanding of 3D transformations," noting it resolved long-standing rotation order bugs in 3D animation pipelines.

  • Hostinger Horizons: Head of Product Dainius Kavoliunas observed that the model understands the "vibe" behind a prompt, translating intent into style-accurate code for non-developers.

Pricing, licensing, and availability

For developers, the most striking aspect of the 3.1 Pro release is the "reasoning-to-dollar" ratio. When Gemini 3 Pro launched, it was positioned in the mid-high price range at $2.00 per million input tokens for standard prompts. Gemini 3.1 Pro maintains this exact pricing structure, effectively offering a massive performance upgrade at no additional cost to API users.

  • Input Price: $2.00 per 1M tokens for prompts up to 200k; $4.00 per 1M tokens for prompts over 200k.

  • Output Price: $12.00 per 1M tokens for prompts up to 200k; $18.00 per 1M tokens for prompts over 200k.

  • Context Caching: Billed at $0.20 to $0.40 per 1M tokens depending on prompt size, plus a storage fee of $4.50 per 1M tokens per hour.

  • Search Grounding: 5,000 prompts per month are free, followed by a charge of $14 per 1,000 search queries.

For consumers, the model is rolling out in the Gemini app and NotebookLM with higher limits for Google AI Pro and Ultra subscribers.

Licensing implications

As a proprietary model offered through Vertex Studio in Google Cloud and the Gemini API, 3.1 Pro follows a standard commercial SaaS (Software as a Service) model rather than an open-source license.

For enterprise users, this provides "grounded reasoning" within the security perimeter of Vertex AI, allowing businesses to operate on their own data with confidence.

The "Preview" status allows Google to refine the model's safety and performance before general availability, a common practice in high-stakes AI deployment.

By doubling down on core reasoning and specialized benchmarks like ARC-AGI-2, Google is signaling that the next phase of the AI race will be won by models that can think through a problem, not just predict the next word.

Share

What's Your Reaction?

Like Like 0
Dislike Dislike 0
Love Love 0
Funny Funny 0
Angry Angry 0
Sad Sad 0
Wow Wow 0