June 24, 2026AgentsAgent-Operable

Gemini 3.5 Flash Can Now Use Your Computer

Google just folded computer use straight into Gemini 3.5 Flash. Not a separate model, not a research preview, the actual fast, cheap model you'd already reach for can now see a screen, reason about it, and click. Browser, mobile, desktop. It used to be a standalone Gemini 2.5 thing; now it's native to the workhorse.

That the model matters here is the cheap part. Computer use is the messiest, most token-hungry kind of agent work, you're looping over screenshots, deciding the next action, checking if it worked, over and over. Putting it inside Flash instead of a flagship is Google saying this should be a commodity capability, not a premium one. The blog points at OSWorld gains, the standard benchmark for agents driving real software.

Google clearly knows the obvious objection, which is that an agent clicking around your machine is a security nightmare. So they did targeted adversarial training against prompt injection, and shipped optional enterprise safeguards: ask the human before sensitive actions, kill the task automatically if it smells like an injection attack. Whether that holds up under real adversaries is the open question, but at least they're not pretending the risk isn't there.

You get it through the Gemini API and the Enterprise Agent Platform, with a demo running on Browserbase. The bigger picture: computer use is quietly going from special trick to default tool, the same way function calling did a couple years back. When the boring fast model can drive any app on screen, the list of things that still need a human in the loop gets shorter. https://blog.google/innovation-and-ai/models-and-research/gemini-models/introducing-computer-use-gemini-3-5-flash/
← Previous
Qualcomm Buys Modular: The Punch Aimed at CUDA
Next β†’
Qwen-AgentWorld: Train the Agent in a Dream
← Back to all articles

Comments

Loading...
>_