June 3, 2026 · API Benchmark

MAI-Thinking-1: 1 Trillion Parameters, 35B Active, Beats Claude Sonnet 4.6 Head-to-Head

The second model Microsoft announced at Build 2026 is MAI-Thinking-1, a sparse Mixture-of-Experts architecture with roughly 1 trillion total parameters but only 35 billion active on any given query. Because only the active parameters do work on each query, the design aims for frontier performance while keeping inference costs from going vertical.
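To make the sparse-activation idea concrete, here is a minimal sketch of the arithmetic and of top-k expert routing, the usual mechanism behind Mixture-of-Experts layers. Microsoft's actual router design is not described in this article, so the expert count, the value of k, and the function names below are hypothetical, chosen only for illustration. The parameter figures are the article's: 35 billion of roughly 1 trillion works out to about 3.5% of the weights active per query.

```python
import math
import random

# Figures quoted in the article.
TOTAL_PARAMS = 1_000_000_000_000   # "roughly 1 trillion" total parameters
ACTIVE_PARAMS = 35_000_000_000     # "35 billion active"

# Hypothetical values for illustration only; the article gives neither.
NUM_EXPERTS = 64
TOP_K = 2


def active_fraction(active, total):
    """Share of the model's parameters that take part in one forward pass."""
    return active / total


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their gate weights.

    Returns a dict mapping expert index -> gate weight; the weights sum to 1.
    Experts outside the top k are skipped entirely, which is where the
    compute savings of a sparse model come from.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active fraction: {frac:.1%}")

    rng = random.Random(0)  # stand-in for a learned router's output
    scores = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]
    print(route_top_k(scores, TOP_K))
```

Note that the active fraction is not simply k divided by the number of experts: in typical MoE models the attention layers and embeddings are shared and always active, so the 35B figure would include them alongside the selected experts.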

The reasoning benchmarks are strong: 97.0% on AIME 2025 and 94.5% on AIME 2026. Microsoft also ran 1,276 blind human head-to-head evaluations against Claude Sonnet 4.6 and claimed a win across the board. The 256k-token context window means it can hold large codebases, long contracts, or extended research documents in a single pass, which matters for any agent doing sustained work on complex documents.
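As a rough way to judge whether a set of documents would fit in a window of that size, the sketch below estimates token counts from character counts. It rests on two labeled assumptions: that "256k" means 256,000 tokens, and that English text averages about 4 characters per token. Neither is confirmed by the article, and the real tokenizer for MAI-Thinking-1 may count differently, so treat the result as a ballpark. The function names and the output reserve are hypothetical.

```python
import math

CONTEXT_WINDOW_TOKENS = 256_000  # assumption: "256k" read as 256,000 tokens
CHARS_PER_TOKEN = 4.0            # assumption: rough heuristic for English text


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Approximate the token count of a string from its length."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(texts, reserve_for_output=8_000,
                    window=CONTEXT_WINDOW_TOKENS):
    """Return (fits, estimated_input_tokens) for a list of documents.

    reserve_for_output is a hypothetical budget set aside for the
    model's reply, so the input does not consume the whole window.
    """
    used = sum(estimate_tokens(t) for t in texts)
    return used + reserve_for_output <= window, used


if __name__ == "__main__":
    contract = "x" * 400_000      # about 100k tokens under the heuristic
    ok, used = fits_in_context([contract])
    print(ok, used)
```

For real budgeting, the estimate should be replaced with counts from the model's own tokenizer once one is available.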

MAI-Thinking-1 is currently in private preview on Microsoft Foundry, with public preview coming to the MAI Playground at https://playground.microsoft.ai/chat. Microsoft released both models simultaneously: MAI-Code-1-Flash for the coding-specific use case and MAI-Thinking-1 for deep reasoning. The more interesting implication is that Microsoft now has frontier models it does not have to license from OpenAI.
