Gemini exists and Google is finally serious

Google DeepMind launched Gemini. Three versions: Ultra (the largest), Pro (the workhorse), Nano (for on-device). The pitch: multimodal from the ground up. Not a text model with vision bolted on. A model that understands text, images, video, and code natively.

The demo video was impressive. A person showing objects to a camera, and Gemini responding in real time. Identifying drawings. Playing games with visual input. Reasoning about images as they changed.

Then people discovered the demo was edited. Not real-time. The interactions were assembled from still frames and text prompts, with latency cut and responses shortened. The “live” feel was produced in post-production.

That’s disappointing. But the underlying capability is real, even if the presentation was misleading.

The Google problem

Google has had a weird year. They built the transformer architecture that powers every modern language model (the “Attention Is All You Need” paper came from Google Brain). They built BERT. They built PaLM. They have DeepMind, which created AlphaFold and AlphaGo. By most measures, Google has as much AI talent and research output as any company in the field.

And yet ChatGPT ate their lunch. OpenAI built a product on Google’s research and captured the public imagination in a way Google never did. When your mom is using ChatGPT and has never heard of PaLM, you’ve lost the narrative.

Gemini is Google’s response. Not just technically (the model is genuinely capable) but narratively. Google is saying: we’re here. We’re competing. We’re the company that invented the transformer and we’re not ceding this market.

The competition

The real competition starts now. For the past year, OpenAI has had the field mostly to itself. ChatGPT and GPT-4 defined the category. Anthropic provided an alternative but didn’t challenge for the lead. Meta released open-source models that influenced the market but didn’t produce a flagship consumer product.

Gemini changes the dynamic because Google has distribution. Android. Chrome. Search. Gmail. YouTube. Workspace. Google can put Gemini into products used by billions of people. OpenAI has to convince people to come to them. Google can bring AI to where people already are.

That distribution advantage is enormous. If Gemini Pro gets embedded in Gmail and Google Docs and Search, a billion people will interact with it without choosing to. That’s a different adoption model than “go to chat.openai.com and sign up.”

The misleading demo problem

The edited demo video is a problem because it undermines trust at exactly the moment Google needs to build it. People who saw the demo and then learned it was misleading are now skeptical of Google’s AI claims.

DeepMind does incredible research. Google has incredible engineering. But the marketing created a gap between expectation and reality that didn’t need to exist. The model is good. Show it being good. Don’t manufacture a “live” interaction that wasn’t live.

Trust compounds in both directions. Build trust slowly and it accumulates. Break it once and the recovery is expensive.

What this means

The AI field now has three major players: OpenAI, Google, and Anthropic. Meta is a significant force in open-source. Mistral is the European contender. The competition is real, well-funded, and accelerating.

For users, competition is good. Better models. Lower prices. More options. The monopoly risk that existed when ChatGPT was the only option most people had heard of is fading.

For the industry, competition introduces new risks. Speed over safety. Demo polish over honest capability. The pressure to ship before it’s ready.

Gemini is good. Google is serious. The race is on. I just wish they hadn’t faked the demo.
