i-am-ai

Two models, two philosophies

Google dropped nine demo videos at I/O 2026 showcasing Gemini Omni and Gemini 3.5 Flash, and the most interesting thing isn't the capabilities—it's the strategic split.

Gemini Omni is Google's answer to "what if reasoning met creation?" It's a multimodal model that takes images, audio, video, and text as input and generates high-quality video grounded in Gemini's knowledge graph. You edit through conversation, and each instruction builds on the last while maintaining character consistency and physics.

Gemini 3.5 Flash, meanwhile, is built for agentic workflows—complex, multi-step tasks that run under supervision. It's the model powering Google's new Antigravity harness and the foundation for persistent agents that operate 24/7.

This is a deliberate portfolio play. Google is betting that the frontier splits into at least two distinct workloads: generative creation (where latency and iteration matter more than raw intelligence) and long-horizon execution (where reliability and tool-use dominate).

Omni: conversational video editing that actually persists state

The Gemini Omni demos lean hard into iterative editing. You start with a video—say, a hand holding a sculpture—and prompt: "Make the sculpture out of bubbles." The model regenerates the scene with bubble physics.

Then you iterate: "Dim the lights. Put a black and white checkerboard room inside a glass sphere that floats above the hand, inside it contains a recursive representation of the same hand holding the sphere."

What's noteworthy here is state persistence across turns. This isn't just prompt-to-video generation; it's a multi-turn dialogue where the model remembers what came before, maintains scene coherence, and layers edits without manual masking or keyframing.
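To make the state-persistence point concrete, here is a minimal sketch of how a client could carry edit history across turns. It is an illustration only: `EditSession`, its fields, and the clip filename are hypothetical names, not part of any Gemini API, and the real system presumably tracks far richer scene state than a list of prompts.

```python
from dataclasses import dataclass, field


@dataclass
class EditSession:
    """Hypothetical multi-turn editing session; not a real Gemini API."""

    source_clip: str
    turns: list[str] = field(default_factory=list)

    def instruct(self, prompt: str) -> dict:
        # Each instruction is appended rather than substituted, so the
        # request for turn N carries turns 1..N-1 as context.
        self.turns.append(prompt)
        return {
            "clip": self.source_clip,
            "history": self.turns[:-1],
            "instruction": prompt,
        }


session = EditSession("hand_with_sculpture.mp4")
first = session.instruct("Make the sculpture out of bubbles.")
second = session.instruct("Dim the lights.")
print(second["history"])  # ['Make the sculpture out of bubbles.']
```

The second request still knows about the bubbles, which is the difference between a dialogue and two unrelated one-shot generations.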

The demos show:

Action reimagining: Change what's happening in a video you shot. Add characters, transform environments, or make a violinist's performance happen in a different setting entirely.
Recursive edits: Build complexity across multiple conversational turns without losing the thread.
Physics and consistency: The model holds up physics (bubbles behave like bubbles) and character identity (your violinist stays the same person even as the world changes).

This is closer to a creative assistant than a one-shot generator. The workflow isn't "prompt → wait → accept." It's "prompt → refine → redirect → refine again." That's a different UX contract, and it hints at Google's belief that generative video tools will look more like Figma than Midjourney.

3.5 Flash: speed meets frontier intelligence for agents

The 3.5 Flash demos are less flashy but arguably more strategically important. Google describes it as delivering "intelligence that rivals large flagship models on multiple dimensions, at the speeds you have come to expect from the Flash series."

That balance—flagship-level reasoning at Flash-level latency—is what makes it suitable for agentic workflows. The demos show:

Agentic task execution at scale

Powered by Antigravity (Google's agent harness), 3.5 Flash executes multi-step workflows like automatically renaming and categorizing unstructured assets based on dynamic criteria. The model spawns collaborative subagents, each handling a piece of the pipeline, while the overall run stays under supervision.

This is the industrialization of agents: not one monolithic model trying to do everything, but a coordinated swarm of specialized subagents orchestrated by a harness.
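The fan-out pattern described here can be sketched in a few lines. This is a toy under stated assumptions, not Antigravity's actual design: the `RULES` table stands in for the "dynamic criteria," plain functions stand in for model-backed subagents, and anything the rules cannot place is routed to a `needs_review` bucket as a stand-in for supervision.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rules standing in for the "dynamic criteria" in the demo.
RULES = {".png": "images", ".jpg": "images", ".mp4": "video", ".wav": "audio"}


def categorize(filename: str) -> tuple[str, str]:
    """One 'subagent': assigns a single asset to a category."""
    for suffix, category in RULES.items():
        if filename.lower().endswith(suffix):
            return filename, category
    # Anything the rules cannot place is flagged for a human.
    return filename, "needs_review"


def orchestrate(filenames: list[str], workers: int = 4) -> dict[str, list[str]]:
    """The 'harness': fans work out to subagents, then gathers the results."""
    grouped: dict[str, list[str]] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order.
        for name, category in pool.map(categorize, filenames):
            grouped.setdefault(category, []).append(name)
    return grouped


result = orchestrate(["IMG_001.PNG", "take2.mp4", "notes.txt"])
print(result)
```

The interesting design question is what lands in `needs_review`: a harness that never escalates is not really supervised.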

Generative UI and interactive graphics

One demo shows 3.5 Flash generating different UX approaches for a checkout flow in 60 seconds on AI Studio. Another shows Search building an interactive visual explanation of gyroid patterns on the fly, complete with manipulable 3D graphics.

Google is using 3.5 Flash to build custom interfaces per query. Instead of returning a static SERP, Search constructs dashboards, trackers, mini-apps, and visual simulations tailored to your question. This is generative UI in production, and it's free for everyone this summer.

Information agents and persistent tasks

The new "information agents" run 24/7 in the background, reasoning across sources to surface updates at the right moment. Example: monitoring whether your favorite athletes announce sneaker collabs or signature drops, then sending a digest with links.
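Stripped of the reasoning layer, the core loop of such an agent is "fetch, filter, deduplicate, digest." The sketch below shows only that skeleton, assuming simple keyword matching and an in-memory set of seen URLs; `Item`, `build_digest`, and the example feed are hypothetical, and nothing here reflects how Google actually implements information agents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    source: str
    title: str
    url: str


def build_digest(items: list[Item], keywords: list[str], seen_urls: set[str]) -> list[Item]:
    """Return unseen items matching any keyword, and mark them as seen."""
    fresh = []
    for item in items:
        matches = any(k.lower() in item.title.lower() for k in keywords)
        if matches and item.url not in seen_urls:
            seen_urls.add(item.url)
            fresh.append(item)
    return fresh


seen: set[str] = set()
feed = [
    Item("news", "Athlete A announces sneaker collab", "https://example.com/a"),
    Item("blog", "Match report", "https://example.com/b"),
]
first_run = build_digest(feed, ["sneaker", "signature"], seen)
second_run = build_digest(feed, ["sneaker", "signature"], seen)
print(len(first_run), len(second_run))  # 1 0
```

The second run returns nothing because the matching item was already delivered, which is the property that keeps a 24/7 agent from spamming the same update.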

For ongoing tasks such as wedding planning or fitness routines, Search builds custom experiences you can return to. Google AI Pro and Ultra subscribers will be able to create their own with Antigravity, starting in the U.S. in the coming months.

Gemini Spark: the personal agent layer

Gemini Spark is Google's answer to the "personal AI agent" category. It runs on Gemini 3.5, uses the Antigravity harness, and is deeply integrated with Workspace (Gmail, Docs, Slides).

The demo: create a list of nut-free snacks, then add them to Instacart. Simple task, but it shows the integration depth. Spark isn't just answering questions—it's taking action across third-party services under your direction.

It's available now to Google AI Ultra subscribers in the U.S. The positioning is clear: this is Google's Copilot competitor, but with tighter ecosystem lock-in and a 24/7 operational model.

What the demos don't show

A few things are conspicuously absent:

Error recovery: What happens when an agentic workflow fails midway? Do subagents retry? Roll back? Escalate to the user?
Cost and latency: 3.5 Flash is positioned as fast, but what's the actual token cost for spawning multiple subagents via Antigravity? What's the cold-start time for an information agent?
Safety guardrails: How does Google prevent Spark from taking unintended actions when integrated with Gmail and third-party APIs? What's the approval flow?
Model size and deployment: Is Omni running client-side for Plus subscribers or entirely server-side? What's the infrastructure footprint for 24/7 agents?

These aren't dealbreakers, but they're the questions that separate demos from production-ready systems.
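For the error-recovery question in particular, one plausible policy is "retry a bounded number of times, undo partial work between attempts, then hand off to a human." The sketch below is purely illustrative of that policy; `run_step`, `Escalation`, and the flaky step are hypothetical, and the demos give no indication of what Antigravity actually does on failure.

```python
from typing import Callable


class Escalation(Exception):
    """Raised when a step keeps failing and a human should decide."""


def run_step(step: Callable[[], str], undo: Callable[[], None], max_attempts: int = 3) -> str:
    last_error = None
    for _ in range(max_attempts):
        try:
            return step()
        except RuntimeError as err:
            last_error = err
            undo()  # roll back partial side effects before the next attempt
    raise Escalation(f"gave up after {max_attempts} attempts") from last_error


attempts = []
rollbacks = []


def flaky_rename() -> str:
    # Fails twice, then succeeds, to exercise the retry path.
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient failure")
    return "renamed"


outcome = run_step(flaky_rename, undo=lambda: rollbacks.append(1))
print(outcome, len(attempts), len(rollbacks))  # renamed 3 2
```

Even this toy surfaces the hard parts: which errors count as retryable, whether every step has a meaningful undo, and who receives the escalation.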

The multi-model strategy is deliberate

Google could have built one do-everything model. Instead, they shipped two models with distinct design centers:

Omni optimizes for creative iteration and multimodal generation.
3.5 Flash optimizes for agentic reliability and long-horizon execution.

This mirrors the broader industry trend: there is no single "best" model, only models optimized for specific workloads.

Anthropic has Claude 3.5 Sonnet (general intelligence) and Claude 3 Haiku (speed). OpenAI has GPT-4 variants and likely more specialization coming. Google is now explicitly segmenting by use case rather than trying to win every benchmark with one flagship.

The risk? Fragmentation. Developers now have to choose which model to use for which task, and Google has to maintain multiple training pipelines, inference stacks, and API surfaces.

The upside? Better price-performance ratios and faster iteration. If you only need agentic execution, you don't pay for video generation overhead. If you only need creative tools, you don't pay for tool-use infrastructure.
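In practice, the fragmentation cost shows up as routing logic that developers have to write and maintain. A minimal sketch, assuming two workload types and using placeholder labels rather than real API model identifiers:

```python
from enum import Enum


class Workload(Enum):
    CREATIVE_VIDEO = "creative_video"
    AGENTIC_TASK = "agentic_task"


# Placeholder labels, not real API model identifiers.
MODEL_FOR = {
    Workload.CREATIVE_VIDEO: "omni-placeholder",
    Workload.AGENTIC_TASK: "flash-placeholder",
}


def pick_model(workload: Workload) -> str:
    """Map a workload type to the model family suited to it."""
    return MODEL_FOR[workload]


print(pick_model(Workload.AGENTIC_TASK))  # flash-placeholder
```

Two entries are easy to manage. The table grows with every specialized model a vendor ships, and that growth is the maintenance burden the fragmentation risk refers to.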

Availability and rollout

Gemini Omni Flash is rolling out to Google AI Plus, Pro, and Ultra subscribers globally via the Gemini app and Google Flow. It's also available at no cost on YouTube Shorts and the YouTube Create App. Developer and enterprise API access is coming in the next few weeks.

Gemini 3.5 Flash is generally available now via Google Antigravity, AI Studio, Android Studio, and Gemini Enterprise. It's the default model in the Gemini app and AI Mode in Search globally.

The free tier for generative UI in Search is notable—Google is giving away compute-intensive features (custom dashboards, interactive graphics) to drive Search engagement. That's a bet that the moat is in integration and ecosystem, not in model access.

What this means for the agent race

The agent wars are heating up. Google's Antigravity + 3.5 Flash stack is now competing with:

Anthropic's tool use and computer use demos
OpenAI's rumored agent-focused releases
Smaller players like Adept and MultiOn, plus open-source harnesses like LangGraph and AutoGPT

Google's advantage: vertical integration. Spark isn't just an agent—it's an agent with native Workspace access, Search integration, and eventually Android system-level hooks. That's hard to replicate.

The question is whether developers trust Google to maintain stable APIs and not deprecate agent primitives when the next shiny model drops. Google's developer platform track record is... mixed.

But if 3.5 Flash + Antigravity delivers on the promise of reliable, supervised agentic workflows at scale, this could be the stack that finally makes "AI agents" more than vaporware.
