AlphaEvolve, one year later, looks a lot less like a paper trick
A year ago AlphaEvolve was the kind of DeepMind announcement you'd file under "neat science demo": a Gemini-powered evolutionary coding agent that beat Strassen's algorithm on 4x4 matrix multiplication and advanced the kissing number problem in 11 dimensions. Cool, but most lab demos die in the press release. On May 7, 2026, DeepMind dropped the impact follow-up, and the receipts are something else.
Healthcare: 30% reduction in variant detection errors via DeepConsensus. Energy: AC Optimal Power Flow feasible-solution rate jumped from 14% to 88%. Earth science: 5% accuracy gain across 20 natural disaster categories. Quantum: circuits with 10x lower error, enabling Willow processor demos. And the part Google won't shut up about internally: a 20% write amplification reduction in Spanner, a 9% storage footprint cut via compiler optimization, 23% faster Gemini training kernels, and a 32.5% speedup on FlashAttention.
The customer logos are the part that finally makes this real. Klarna doubled transformer training speed. FM Logistic saved 15,000+ km annually with 10.4% routing efficiency gains. Schrödinger got ~4x speedup on machine-learning force fields, the kind of thing that turns a six-month drug screening run into six weeks. WPP added 10% accuracy to campaign optimization. These aren't research benchmarks — these are P&L line items.
The thesis worth carrying around: a coding agent with an evaluator, run for a year, beats a hundred PhDs at finding the inner-loop optimization. The bottleneck wasn't intelligence, it was patience. Tell the agent the metric, give it compute, walk away. AlphaEvolve isn't proof that AI does science; it's proof that any problem with an editable file plus a measurable number can become an automated search.
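To make that concrete, here's a minimal sketch of the pattern in Python. Nothing below is AlphaEvolve's actual machinery: `evaluate` and `mutate` are toy stand-ins (the real system plugs in Gemini as the proposer and a domain benchmark as the scorer, plus a population database), but the loop is the whole idea: propose a variant of the file, score it, keep the winner.

```python
import random

# A toy "editable file": a Python arithmetic expression stored as a string.
TARGET = 42

def evaluate(program: str) -> float:
    """The measurable number. Here: negative distance from TARGET, so
    higher is better. In a real deployment this would run a benchmark
    (kernel latency, solver feasibility rate, storage footprint...)."""
    try:
        return -abs(eval(program) - TARGET)  # eval is fine only for this toy
    except Exception:
        return float("-inf")                 # broken candidates rank last

def mutate(program: str) -> str:
    """Stand-in for the LLM proposal step (AlphaEvolve prompts Gemini to
    rewrite the file); this toy just appends a random arithmetic edit."""
    return f"({program}){random.choice(['+1', '-1', '*2', '//2'])}"

def evolve(seed: str, generations: int = 200, children: int = 8) -> str:
    """The whole trick: propose candidates, score them, keep the winner."""
    best, best_score = seed, evaluate(seed)
    for _ in range(generations):
        for _ in range(children):
            candidate = mutate(best)
            if (score := evaluate(candidate)) > best_score:
                best, best_score = candidate, score
    return best

if __name__ == "__main__":
    winner = evolve("1")
    print(f"{winner} = {eval(winner)}")  # converges on an expression for 42
```

Swap `mutate` for a model call and `evaluate` for your production metric and you have the shape of the thing. The year of patience is just `generations` set very high.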
Read it: https://deepmind.google/blog/alphaevolve-impact/