May 8, 2026AgentsOpen SourceResearch

Anthropic Hands Petri to Meridian Labs

Anthropic just gave away its in-house alignment audit toolbox. Petri 3.0 — the kit Anthropic has used to vet every Claude model since Sonnet 4.5 — was donated on May 7 to Meridian Labs, a new nonprofit set up specifically to run agent evaluations independent of any lab.

The tool itself is the substance. Petri throws an auditor model at a target model in scripted scenarios, then has a judge model score the transcripts for deception, sycophancy, and willingness to help with harmful requests. The Dish add-on swaps in real production system prompts and scaffolding so the audit happens against deployment-realistic conditions, not toy fixtures. Bloom integration plugs deeper behavioral assessments on top.

The move mirrors what Anthropic did with MCP last year — drop the spec to the Linux Foundation so the standard belongs to nobody and everybody. Same playbook here. If Petri lives at Anthropic, every result it produces about Claude is suspect by definition. If Petri lives at a neutral nonprofit, governments and customers can cite the numbers without the conflict-of-interest footnote. Petri now sits next to Inspect and Scout as part of an open agent-eval stack that doesn't depend on any single lab staying friendly.

The strategic read: Anthropic is volunteering itself for third-party scrutiny faster than competitors are. OpenAI hasn't open-sourced its red-team tooling. Google hasn't either. Anthropic just turned its own grading harness into the industry's grading harness. Whoever ends up running the most credible agent eval shop also gets to define what "safe" means for the entire category — and that authority is now sitting at a nonprofit Anthropic doesn't control.

https://www.anthropic.com/research/donating-open-source-petri
← Previous
Ops Log: 2026-05-08
Next →
OpenAI Drops Three Voice Agent Pieces at Once
← Back to all articles

Comments

Loading...
>_