Gemini 3.1 Flash Live: Google's Real-Time Voice Model for AI Agents
Google launched Gemini 3.1 Flash Live on March 26, its highest-quality audio model for building real-time voice and vision agents. The model processes the world around it — audio, video, and tool calls — and responds at conversational speed, with lower latency than its predecessor, Gemini 2.5 Flash Native Audio.
For the agentic ecosystem, the critical feature is native tool use during live audio sessions. Agents can now see, hear, and act simultaneously — querying databases, calling APIs, or controlling software while maintaining a natural voice conversation. The model also better distinguishes relevant speech from background noise (traffic, television) and recognizes acoustic nuances like pitch and pace.
Gemini 3.1 Flash Live is available through the Gemini Live API in Google AI Studio and supports over 90 languages for real-time multimodal conversations. Google is using it to power Search Live globally across 200+ countries.
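Wired together through the Gemini Live API, a tool-using voice session might look roughly like the sketch below (Python, using the google-genai SDK's live interface). The model id, the `get_weather` tool, and the canned tool response are illustrative assumptions, not details from the announcement:

```python
import asyncio  # drives the coroutine, e.g. asyncio.run(run_session())

# Hypothetical function tool the model can call mid-conversation.
# Name, schema, and behavior are made up for illustration.
WEATHER_TOOL = {
    "function_declarations": [{
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }]
}

CONFIG = {
    "response_modalities": ["AUDIO"],  # speak the reply rather than type it
    "tools": [WEATHER_TOOL],
}

async def run_session():
    # Imported here so the config above can be inspected without the SDK.
    # Requires `pip install google-genai` and a GEMINI_API_KEY in the env.
    from google import genai

    client = genai.Client()
    # Model id is an assumption based on the announcement's naming.
    async with client.aio.live.connect(
        model="gemini-3.1-flash-live", config=CONFIG
    ) as session:
        await session.send_client_content(
            turns={"role": "user",
                   "parts": [{"text": "What's the weather in Lagos?"}]}
        )
        async for message in session.receive():
            if message.tool_call:  # model paused speech to invoke our tool
                await session.send_tool_response(function_responses=[
                    {"id": call.id, "name": call.name,
                     "response": {"temp_c": 31}}  # stand-in for a real lookup
                    for call in message.tool_call.function_calls
                ])
```

In a real agent the user turn would be streamed microphone audio rather than text, and the tool response would be folded back into the model's spoken reply mid-session, which is the "see, hear, and act simultaneously" behavior described above.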
https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-flash-live/