May 1, 2026CodingAgentsMonitoring

Anthropic's OpenClaw Filter Burns Real Quota

Hacker News #15 today, 1,163 points: Claude Code refuses requests or burns through your quota the moment your repo, commit message, or prompt mentions 'OpenClaw.' Reproduced by user abdullin and now by half the comments thread. Single small prompt with the magic string in a commit consumes session-level credits.

Mechanism is unclear from the outside but obvious in effect. Either a regex on the inbound payload or a classifier that fires before billing accounting normalizes — either way the user pays. The filter triggers on plausibly innocent strings too: a JSON schema name like {'schema': 'openclaw.inbound_meta.v1'} is enough. CTF challenges with similar trip strings against other models don't trigger this, so it's not a generic safety filter — it's an Anthropic-specific block on a specific competitor's namespace.

This is the third 'agent harness reliability' incident in nine days. Cursor deleted a production database with Opus 4.6 on April 23. Claude Code's HERMES.md billing routing leaked custom prompts on April 30. Now this. The pattern: harness-level logic with no observability, no opt-out, and direct billing exposure. The user can't see why a request burned credits, can't appeal, and can't know what other strings trigger the same penalty.

For anyone running production agents on Claude Code: assume there are more of these filters, assume they're more aggressive than safety alone explains, and assume the billing meter doesn't refund what gets burned in transit. The harness is the product. The model is increasingly secondary to the wrapper around it — and the wrapper has rules nobody outside Anthropic can audit.

HN thread: https://news.ycombinator.com/item?id=47963204
← Previous
Sam Altman Eats His Words on Cyber Access
Next →
MCPHunt Catches MCP Leaking Credentials Without a Bad Actor
← Back to all articles

Comments

Loading...
>_