ChatGPT/Every illustration.

Vibe Check: Gemini 2.5 Pro and Gemini 2.5 Flash

Why Google might quietly win the race to be AI’s top backend provider




Google's Gemini models may not dominate conversations (or searches) like OpenAI’s—but they’re starting to dominate something more important: the developer software stack.

Inside Every, Gemini 2.5 Pro and Gemini 2.5 Flash are already powering production workflows, and Flash runs quietly in the background of products like Cora and Sparkle. Our team is hardly the only one getting mileage out of these models, though; Pro has become the default brain inside go-to AI-powered developer tools Cursor and Windsurf. And according to Google Cloud CEO Thomas Kurian, more than four million developers are building with Gemini.

With a fresh update to Gemini 2.5 Pro landing this week that promises stronger coding support and wider developer access, Google’s bid to win the hearts and minds of developers is getting harder to ignore.

Let’s dig into what Pro and Flash do best, how the Every team is putting them to work, and why Gemini could be the backend stack’s dark horse.

Gemini 2.5 Pro: The quietly powerful workhorse

Gemini 2.5 Pro debuted in March 2025 as Google’s first “thinking model,” a territory previously mapped by OpenAI with its o1 release in September 2024 and Anthropic with the release of Claude Sonnet 3.7 in February 2025. A thinking model, also known as a reasoning model, is an LLM that pauses to plan a step-by-step solution before answering. That extra planning, plus a 1 million-token context window—enough to read an entire codebase, a full research report, or about an hour of video—lets it handle problems other models have to tackle in bite-sized chunks. 

On May 6, ahead of its annual I/O conference later this month, Google announced an update to 2.5 Pro (you may see it referred to as “Gemini-2.5-Pro-05-06,” in case you thought OpenAI was the only one with naming challenges). This launch touts sharper coding skills, richer web-app demos (click-to-try sample sites that let anyone play around with the model inside a browser), and, crucially, general access in AI Studio (Google’s free playground for quick experiments) and Vertex AI (its managed cloud service for production workloads). In other words: It’s easier to try, and far simpler for companies to roll straight into their apps. 

What it’s great at:
  1. Coding and debugging at scale: Pro remembers details from massive context dumps and often catches and corrects its own logic.
  2. Long-context planning: Handles multi-turn planning, where each new prompt builds on the last, and can steer big engineering jobs, such as rewriting or reorganizing an entire codebase.
  3. Multimodal reasoning: Solid performance across text, code, and images in the same prompt thread.

Gemini 2.5 Flash: The glue model with speed control

Gemini 2.5 Flash landed in mid-April as Google’s first hybrid-reasoning model—designed to be fast by default, but able to “pause and think” when a task gets tricky. Developers get a thinking-budget knob (0–24,000 tokens) that trades cost and latency for extra brainpower: Leave it at 0 for 2.0-level speed, or bump it up when problems need multi-step logic.
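The knob described above can be sketched as a tiny helper. Everything here is illustrative, not Google's actual SDK: `flash_thinking_config` is a hypothetical function, and only the 0–24,000-token range comes from the article (the real `google-genai` library exposes a similar thinking-budget setting on its generation config).

```python
# Sketch of Flash's thinking-budget trade-off, as described above.
# `flash_thinking_config` is a hypothetical helper for illustration only,
# not part of any Google SDK.

MAX_THINKING_BUDGET = 24_000  # Flash's upper bound, in tokens

def flash_thinking_config(budget_tokens: int) -> dict:
    """Clamp a requested thinking budget into Flash's 0-24,000 token range."""
    budget = max(0, min(budget_tokens, MAX_THINKING_BUDGET))
    return {
        "model": "gemini-2.5-flash",  # model name shortened for the sketch
        "thinking_budget": budget,    # 0 => fast, 2.0-level responses
    }

# Budget 0 keeps cost and latency at their floor for lightweight calls...
fast = flash_thinking_config(0)
# ...while a bigger budget buys multi-step reasoning on hard problems,
# at the cost of a few extra tokens and milliseconds.
deep = flash_thinking_config(8_000)
```

The point of the clamp is that the budget is a bounded dial, not an open-ended setting: anything above the cap buys no additional reasoning.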

Google calls this its best price-to-performance option. Because most requests (known as “calls”) a developer sends to the model are lightweight (for tasks like routing, re-formatting, or quick look-ups), teams can keep 90 percent of their work with Flash at rock-bottom prices and reserve extra reasoning—and cost—for the rare tasks that require it, like drafting a complete product-feature spec from scattered meeting notes.
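That 90/10 split can be pictured as a small dispatcher. This is a sketch under assumptions: the task labels, thresholds, and the `route` function are all invented for illustration, not anything from Google's tooling.

```python
# Illustrative dispatcher for the split described above: lightweight calls
# stay on Flash with no thinking budget, the rare heavy task gets a budget,
# and anything unrecognized escalates to Pro. All names are hypothetical.

def route(task_kind: str) -> tuple[str, int]:
    """Return (model, thinking_budget_tokens) for an incoming request."""
    lightweight = {"routing", "reformatting", "lookup", "tagging"}
    if task_kind in lightweight:
        return ("gemini-2.5-flash", 0)       # rock-bottom price, lowest latency
    if task_kind == "spec_drafting":
        return ("gemini-2.5-flash", 12_000)  # pay for extra reasoning only here
    return ("gemini-2.5-pro", 0)             # escalate rare heavy jobs
```

Keeping the routing decision this cheap is the whole trick: the dispatcher itself must cost less than the calls it is sorting.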

What it’s great at:
  1. Low-latency orchestration: Flash acts like a real-time dispatcher—cleaning up AI responses, deciding where each request should go (like a heavier-duty reasoning model for complex questions or an external API for fresh data), and tagging or sorting huge streams of data without slowing anything down. 
  2. Programmable reasoning control: Teams can dial up or down how much “thinking time” the model spends on each request—aka its inference depth. Less thinking time means fast, cheap answers for simple tasks; more thinking time lets the model pause, reason through several steps, and return a more thorough solution (at the cost of a few extra tokens and milliseconds). 
  3. Multimodal on a budget: Handles image input with surprisingly strong results—at a fraction of what Claude or GPT-4 cost.

What everyone at Every thinks

… about Gemini 2.5 Pro

It’s the new default inside Cursor—because it just works

“I love how powerful it is. Its large context window and reasoning capabilities mean I can dump in a lot of context, and I feel pretty confident in using it well (remembering certain implementation details from some random file I gave it several messages ago, etc.)”—Danny Aziz, general manager of Spiral

Tool calls and comment spam are a problem

“Tool calls, at least within Cursor, are still a bit problematic—sometimes it generates code inline instead of applying changes, so you have to do it manually. That’s kinda annoying. Another annoying thing is the comments [explanatory lines throughout the file, which add clutter and degrade performance]—it adds a lot, and I always have to ask it to remove them. But other than those two things, it’s really great.”—Naveen Naidu, entrepreneur in residence
