Mar 25, 2025

Google unveils a next-gen family of AI reasoning models

On Tuesday, Google unveiled Gemini 2.5, a new family of AI reasoning models that pause to “think” before answering a question.

To kick off the new family of models, Google is launching Gemini 2.5 Pro Experimental, a multimodal, reasoning AI model that the company claims is its most intelligent model yet. This model will be available on Tuesday in the company’s developer platform, Google AI Studio, as well as in the Gemini app for subscribers to the company’s $20-a-month AI plan, Gemini Advanced.

Moving forward, Google says all of its new AI models will have reasoning capabilities baked in.

The AI Reasoning Model Race

Since OpenAI launched the first AI reasoning model, o1, in September 2024, the tech industry has raced to match or exceed that model’s capabilities. Today, major AI players such as Anthropic, DeepSeek, Google, and xAI all have AI reasoning models that use extra computing power and time to fact-check and reason through problems before delivering an answer.

Reasoning techniques have helped AI models achieve new heights in math and coding tasks. Many in the tech world believe reasoning models will be a key component of AI agents, autonomous systems that can perform tasks largely without human intervention. However, these models are also more expensive to run, since they spend extra computing power and time on each answer.

Google’s Competitive Edge

Google has experimented with AI reasoning models before, releasing a “thinking” version of Gemini in December 2024. But Gemini 2.5 represents the company’s most serious attempt yet at besting OpenAI’s o series of models.

Google claims that Gemini 2.5 Pro outperforms its previous frontier AI models and some of the leading competing AI models on several benchmarks:

Code Editing (Aider Polyglot) – Gemini 2.5 Pro scores 68.6%, outperforming models from OpenAI, Anthropic, and DeepSeek.
Software Development Abilities (SWE-bench Verified) – Gemini 2.5 Pro scores 63.8%, beating OpenAI’s o3-mini and DeepSeek’s R1 but lagging behind Anthropic’s Claude 3.7 Sonnet (70.3%).
Humanity’s Last Exam – A test measuring knowledge in mathematics, humanities, and natural sciences, where Gemini 2.5 Pro scores 18.8%, performing better than most flagship models.

Massive Context Window & Future Enhancements

To start, Google says Gemini 2.5 Pro ships with a 1 million token context window, allowing it to process roughly 750,000 words in a single go, which is longer than the entire “The Lord of the Rings” book series. Soon, Google says, Gemini 2.5 Pro will support 2 million tokens, doubling its input length.
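For a rough sense of what those numbers mean in practice, the figures above imply about 0.75 words per token. The short Python sketch below applies that ratio to both context window sizes. The ratio is only a rule of thumb derived from the 1 million token and 750,000 word figures quoted here, and the function names are hypothetical helpers for illustration, not part of any Google API. Actual token counts depend on the tokenizer and the text.

```python
# Rough ratio implied by the article's figures:
# 1,000,000 tokens is about 750,000 words. This is an assumption,
# not a property of any specific tokenizer.
WORDS_PER_TOKEN = 750_000 / 1_000_000


def approx_words(tokens: int) -> int:
    """Estimate how many English words fit in a given token budget."""
    return round(tokens * WORDS_PER_TOKEN)


def fits_in_context(word_count: int, context_tokens: int) -> bool:
    """Check whether a text of word_count words likely fits in the window."""
    return word_count <= approx_words(context_tokens)


if __name__ == "__main__":
    print(approx_words(1_000_000))  # current window
    print(approx_words(2_000_000))  # planned window
```

Under this assumption, the planned 2 million token window would hold roughly 1.5 million words.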

Google has not yet published API pricing for Gemini 2.5 Pro but promises to share more details in the coming weeks.
