April 23, 2026 · Agents · Coding · Infrastructure

Anthropic admits Claude Code got dumber

Anthropic just published a postmortem explaining why Claude Code felt broken for most of March and April. The answer is three separate bugs stacked on top of each other, and the transparency of the writeup is the most interesting part.

Bug one: they quietly switched the default reasoning effort from high to medium in early March to reduce latency. Users immediately noticed Claude felt dumber. They’ve now reverted to xhigh on Opus 4.7 and high on other models.

Bug two: a prompt caching optimization meant to clear old reasoning once per idle hour instead kept clearing it every turn, which is why Claude seemed to forget mid-conversation and burned through usage limits twice as fast. Fixed on April 10.

Bug three: a system prompt telling Claude to keep replies under 25-100 words tanked coding quality by 3 percent across models. Reverted April 20.
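The second bug is a classic idle-timeout inversion: an eviction meant to fire only after an hour of inactivity instead fired on every turn. A minimal sketch of that failure mode, with entirely hypothetical names (this is not Anthropic's code, just the shape of the bug as the postmortem describes it):

```python
import time

class ReasoningCache:
    """Illustrative sketch of an idle-hour eviction bug: clearing cached
    reasoning unconditionally per turn instead of after real idle time."""

    IDLE_SECONDS = 3600  # intended policy: clear after an hour of inactivity

    def __init__(self):
        self.reasoning = []
        self.last_activity = time.monotonic()

    def on_turn_buggy(self, now=None):
        # The bug: the clear runs unconditionally on every turn, so the
        # model "forgets" mid-conversation and re-derives context each
        # time, burning extra usage.
        self.reasoning.clear()
        self.last_activity = now if now is not None else time.monotonic()

    def on_turn_fixed(self, now=None):
        now = now if now is not None else time.monotonic()
        # The fix: only evict when the gap since the last turn actually
        # exceeds the idle threshold; otherwise keep accumulated reasoning.
        if now - self.last_activity > self.IDLE_SECONDS:
            self.reasoning.clear()
        self.last_activity = now
```

The fixed path preserves reasoning across back-to-back turns and still clears it after a genuine idle hour, which matches the intended once-per-idle-hour behavior.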

Everything was fixed by v2.1.116 on April 20, and Anthropic reset usage limits for all paid subscribers on April 23 as a make-good. They’re also committing to broader evals on system prompt changes and to gradual rollouts. That last part matters: the 25-100 word cap shipped because nobody ran the full coding benchmark before deployment.

The meta story is harder to ignore. Anthropic publishing this the same day OpenAI drops GPT-5.5 topping every benchmark, including the coding ones Claude used to own, is not a coincidence. When your core users are screaming that your product got worse and your competitor ships a model that matches your premium tier on every axis, transparency is no longer optional. Anthropic had to get ahead of the narrative or lose the coding-agent category.

https://www.anthropic.com/engineering/april-23-postmortem
