
Google Goes All In on AI

The company has every possible advantage, but can it use them?


By my count, Google launched 23 products at yesterday’s developer conference. 

Well, launched isn’t the right word. It was more akin to when your cousin starts posting the back of some dude’s head in her Instagram Stories and you realize she has a boyfriend. In that spirit, let’s say Google soft-launched 23 products, even though it only actually released two. One was an open-source large language model, and the other was the integration of its Gemini 1.5 Pro model into its chatbot mobile app. Everything was about artificial intelligence: Executives said “AI” 120 times in the space of two hours. If AI was the most common term, I would venture to guess that the second most common was “coming soon.”

What yesterday’s presentation did demonstrate was power. Google’s search engine, which is arguably the best business model ever invented, throws off an incredible amount of excess capital. That cash cannon is now firmly pointed at AI. The company previewed new hardware, new AI integrations into consumer products, new chips, and new models.

When a technology paradigm shift is underway, it is not yet clear what will work and where the profit will pool. Startups have to pick their bets, while big companies have the benefit of being able to bet on everything at once. Google is in this happy latter position. CEO Sundar Pichai will invest in as much as he can and hope that it doesn’t disrupt the search engine cash cow too much.

The company is throwing a bunch of stuff at the wall, but these are the things I think will matter the most:

  1. Agentic search: Google is aiming to move from normal search to a world where “Google does the googling for you.” 
  2. Google in Android: Google is creating a virtual assistant that will come bundled at the operating system level.
  3. Gemini 1.5 everywhere: Across Google Sheets, Google Photos, and Gmail, Google is integrating the ability to query, search, and summarize your data. 

Let’s go through each of these in more depth, talk about where power will accumulate, and look at what this all means for Google’s competition.

Google does the googling

Google’s vision is to transform its search engine into what I call an answer-and-action engine. When you search for “Thai food near me,” you want this information, but you really just want to stuff your face with pad thai as quickly as possible. Until now, you might have Googled the question, scanned multiple restaurants, clicked through some sites, read menus, then finally pulled up DoorDash to order the food. The vision, in effect, is that every workflow that starts at the search bar ends at the search bar. “Google does the googling” means that a search agent understands the context of your questions, answers them, and then takes the actions for you, removing the middle steps in the process. It’s your personal agent, not just your guide to what’s on the internet.
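To make the shape of that collapse concrete, here is a toy sketch of an answer-and-action flow. Every function and data structure in it is invented for illustration; this is not how Google's agent works, just the same workflow compressed into a single call.

```python
from dataclasses import dataclass

# Toy, self-contained sketch of an "answer-and-action" flow.
# All names here are hypothetical stand-ins, not Google APIs.

@dataclass
class Restaurant:
    name: str
    rating: float
    menu: list[str]

def search_nearby(query: str) -> list[Restaurant]:
    # Stand-in for the search-and-scan step the user used to do by hand.
    return [
        Restaurant("Thai Basil", 4.6, ["pad thai", "green curry"]),
        Restaurant("Bangkok Bites", 4.2, ["pad see ew", "pad thai"]),
    ]

def place_order(restaurant: Restaurant, dish: str) -> str:
    # Stand-in for handing the order off to a delivery service.
    return f"Ordered {dish} from {restaurant.name}."

def answer_and_act(query: str, craving: str) -> str:
    # The agent collapses search -> compare -> read menus -> order
    # into one request that starts and ends at the search bar.
    options = search_nearby(query)
    best = max((r for r in options if craving in r.menu),
               key=lambda r: r.rating)
    return place_order(best, craving)

print(answer_and_act("Thai food near me", "pad thai"))
```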

The search query can be text-first or vision-first, using the multi-modal models Google announced at the conference. Just point the camera at something and ask a question.


Google’s agent engine could automate most of the ways that we currently use the internet. While Google started hinting at this idea last year, I am surprised at how quickly it is going all in. Yesterday, Google announced that everyone will be able to get results like these “soon.” Investors should be worried: If Google pulls this off for users, it might be terrible for its existing revenue. Will Google be able to stuff the same amount of ads into this type of software? I doubt it. Maybe the hope is that by moving from search to answer/action, it will increase the overall usage of its products and make up the difference that way. If Google is cannibalizing its existing business to plow forward on AI, that might hurt investors in the short term until a new business materializes. 

More broadly, it could be that human beings moving pixels from one box to another was a blip in the long arc of technological progress. We spend an inordinate amount of time transcribing data and then using that data to perform knowledge labor—work served by a category of products known in the industry as workflow software. Up to this point, workflow software has been built by coding specialized flows of user actions. The magic of LLMs is that they are generalizable—there’s no need for a new piece of software for each workflow. Instead, an action engine can just do the work for you. As one founder who has raised $50 million for their AI startup put it to me, “LLMs give you features for free.” 
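Here is a minimal sketch of what that founder means, assuming a placeholder `call_llm` function that stands in for any hosted model. The specialized function has to be written and maintained per workflow; the generic one only needs a new instruction.

```python
# Hypothetical contrast between hand-coded workflow software and a
# generalizable LLM "action engine." `call_llm` is a placeholder,
# not a real API; swap in an actual model client to use it.

def summarize_expense_report(rows: list[dict]) -> str:
    # Traditional workflow software: logic coded for exactly one task.
    total = sum(row["amount"] for row in rows)
    flagged = [row for row in rows if row["amount"] > 500]
    return f"Total: ${total:.2f}; {len(flagged)} items need review."

def call_llm(prompt: str) -> str:
    # Placeholder for any hosted model.
    return f"[model response to: {prompt!r}]"

def do_work(instruction: str, data: list[dict]) -> str:
    # Generalizable version: the same function handles expense reports,
    # flight lookups, or meeting summaries. The "feature" is the prompt.
    return call_llm(f"{instruction}\n\nData: {data}")

rows = [{"vendor": "Hotel", "amount": 620.0}, {"vendor": "Taxi", "amount": 38.5}]
print(summarize_expense_report(rows))
print(do_work("Summarize this expense report and flag anything over $500.", rows))
```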

Do Androids dream of Sundar’s sheep?

There are three ways for an LLM to learn workflows. 

First, it can use a multi-modal model to watch you, FBI-style, by snooping in your browser. It’ll learn what to do from videos or screenshots of the actions you take in Chrome. 

Second, it can learn from you telling it what to do (this is sort of what a prompt is). 

Finally, and most interestingly, the LLM can receive permission to access your device and learn everything you do in context. The AI app transforms into a meta-layer that learns and automates away all the stuff you used to do in individual apps. Google made this a big focus in its event yesterday: “Soon, you’ll be able to bring up Gemini's overlay on top of the app you're in to easily use Gemini in more ways.”

Said in non-corporate speak: Google is integrating LLMs into the Android operating system. 
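A toy model of that third mode might look like the sketch below. The `AppContext` structure and `overlay_assistant` function are invented for illustration; Android’s actual Gemini overlay doesn’t expose an API in this form, so treat this as the idea, not the product.

```python
from dataclasses import dataclass

# Hypothetical sketch of an OS-level assistant overlay: because it sits
# above every app, one agent can see the foreground context and act,
# instead of each app shipping its own automation features.

@dataclass
class AppContext:
    app: str
    screen_text: str

def overlay_assistant(context: AppContext, request: str) -> str:
    # The same assistant works across Gmail, Maps, or a banking app,
    # using whatever is on screen as context for the action.
    return (f"In {context.app}, acting on {request!r} "
            f"using on-screen context {context.screen_text[:45]!r}")

ctx = AppContext(app="Gmail", screen_text="Your flight UA 482 departs Friday at 9:05 a.m.")
print(overlay_assistant(ctx, "Add this flight to my calendar"))
```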

This third use case is a disaster for many software applications. During Apple’s WWDC conferences in the early years of the iPhone, each yearly software release would destroy a huge number of startups as Apple integrated their utility directly into the phone itself. The end state of today’s AI shift might mimic what happened in the 2010s tech boom, when the startups that produced the largest amounts of value used workflows to build network effects. The value of Airbnb isn’t the software, it’s the network of homeowners and guests. AI’s ability to create workflow automations on the fly only increases the importance of distribution advantages that can be turned into network effects.

Unlike OpenAI, Google and Apple each have their own operating systems and web browsers. They can watch people’s work in their browsers and what they do in their applications via the OS. It’s likely no accident that OpenAI released its most powerful model the day before Google announced its own intentions. The ChatGPT maker desperately needs both network effects from consumers relying on its chatbot and a way to learn user actions on its desktop app; otherwise, power will accumulate higher up in the tech stack, at the operating system or browser level.

Gemini everywhere

The last, most important update is that Google is integrating LLMs into all of its most popular applications. This should surprise no one. Sprinkling delightful bits of intelligence into stuff people pay for is AI 101.

There is one slight issue—it doesn’t appear to work. Yesterday, I posed questions to a Gemini chatbot embedded into my Gmail account and received error after error. Here’s one quick example: I get too many emails and always lose track of my flights. I have one coming up this week and hoped Gemini could help.
