Microsoft Built a Coding Model That Uses Up to 60% Fewer Tokens and Still Beats Claude Haiku 4.5
Microsoft does not want to depend on OpenAI forever. At Build 2026, it dropped MAI-Code-1-Flash, its first fully in-house coding model built on clean and appropriately licensed data. The headline number: 51.2% on SWE-Bench Pro, versus Claude Haiku 4.5's 35.2%. That is a 16-point gap on the benchmark that matters most for agentic coding.
The efficiency story is arguably more interesting than the raw performance. MAI-Code-1-Flash uses adaptive reasoning: it stays concise on easy tasks and spends an extended thinking budget only when a problem warrants it. The end result is that it resolves coding problems using up to 60% fewer tokens than comparable models. At enterprise scale, that is the difference between a Copilot bill that is manageable and one that is not.
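To put the token claim in billing terms, here is a rough back-of-the-envelope sketch. Every input is a hypothetical placeholder: the task volume, the baseline tokens per task, and the price per million tokens are assumptions chosen for illustration, not figures from Microsoft or GitHub. The only number taken from the announcement is the 60% reduction, which is an upper bound ("up to").

```python
# Back-of-the-envelope estimate of how a token reduction changes a monthly bill.
# All inputs except TOKEN_REDUCTION are hypothetical placeholders.

def monthly_token_cost(tasks_per_month: int,
                       tokens_per_task: float,
                       usd_per_million_tokens: float) -> float:
    """Return the monthly cost in USD for a given token volume and price."""
    total_tokens = tasks_per_month * tokens_per_task
    return total_tokens * usd_per_million_tokens / 1_000_000


TASKS_PER_MONTH = 200_000          # hypothetical: coding tasks across an organization
BASELINE_TOKENS_PER_TASK = 50_000  # hypothetical: tokens a comparable model spends per task
USD_PER_MILLION_TOKENS = 2.00      # hypothetical: blended price, not a published rate
TOKEN_REDUCTION = 0.60             # best case cited in the announcement ("up to 60%")

baseline_cost = monthly_token_cost(
    TASKS_PER_MONTH, BASELINE_TOKENS_PER_TASK, USD_PER_MILLION_TOKENS
)
reduced_cost = monthly_token_cost(
    TASKS_PER_MONTH,
    BASELINE_TOKENS_PER_TASK * (1 - TOKEN_REDUCTION),
    USD_PER_MILLION_TOKENS,
)
savings = baseline_cost - reduced_cost

print(f"Baseline: ${baseline_cost:,.2f}")
print(f"Reduced:  ${reduced_cost:,.2f}")
print(f"Savings:  ${savings:,.2f}")
```

With these placeholder inputs, the monthly bill falls from $20,000 to $8,000. Because the cost is linear in tokens, the percentage saved equals the token reduction whatever the price or volume, so the real-world figure depends entirely on how often the best-case 60% is actually reached.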
The model is rolling out now to GitHub Copilot individual users in VS Code through the model picker, with no extra setup. You can also try it directly at https://playground.microsoft.ai/chat. Whether Microsoft can keep feeding this model with enough training data to stay competitive is the real long-term question, but the first version clears the bar.