Researchers Dissected Claude Code Down to the queryLoop, Found 7 Safety Layers
A team from MBZUAI VILA Lab and University College London dropped a paper called "Dive into Claude Code: The Design Space of Today's and Future AI Agent Systems". 177 upvotes on HuggingFace, top of the agent papers today. They reverse-engineered the publicly available TypeScript source and built a full architectural map.
Headline finding: only 1.6% of the codebase is the actual reasoning loop. The other 98.4% is operational infrastructure. Permission system, context compaction, tool routing, recovery, hooks. The lesson is uncomfortable for framework builders. The model already knows how to think. What it needs is not more scaffolding, it's more plumbing.
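To make the ratio concrete, here is a minimal sketch of the kind of reasoning loop the paper measures at roughly 1.6% of the codebase. Everything here (the `queryLoop` shape, the model and tool callbacks) is hypothetical, not Claude Code's actual code; the point is how little logic the loop itself needs compared with the machinery around it.

```typescript
type Message = { role: "user" | "assistant" | "tool"; content: string };
type ToolCall = { name: string; input: string } | null;

// The whole "reasoning loop": call the model, run whatever tool it asks
// for, feed the result back, repeat until it answers without a tool call.
async function queryLoop(
  messages: Message[],
  callModel: (m: Message[]) => Promise<{ text: string; toolCall: ToolCall }>,
  runTool: (c: { name: string; input: string }) => Promise<string>,
): Promise<string> {
  while (true) {
    const reply = await callModel(messages);
    messages.push({ role: "assistant", content: reply.text });
    if (!reply.toolCall) return reply.text; // no tool requested: final answer
    // In the real system, this single line is wrapped by most of the other
    // 98.4%: permission checks, hooks, sandboxing, compaction, recovery.
    const result = await runTool(reply.toolCall);
    messages.push({ role: "tool", content: result });
  }
}
```

The loop is trivially small; everything that makes it safe to run unattended lives outside it.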
Seven safety layers stacked. Tool prefiltering, deny-first rule evaluation, seven permission modes, an ML classifier for auto-mode, shell sandboxing, no-restoration-on-resume, and 27 hook events. The paper calls it defense in depth. They also catalog four extension mechanisms (hooks, skills, plugins, MCP servers), sorted by how much context budget each costs.
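"Deny-first" is the interesting layer to sketch: deny rules are checked before allow rules, so a matching deny wins even when a broader allow also matches. The rule shape and tool-call strings below are illustrative assumptions, not Claude Code's actual config format.

```typescript
type Rule = { pattern: RegExp; effect: "allow" | "deny" };

function evaluate(rules: Rule[], toolCall: string): "allow" | "deny" | "ask" {
  // Pass 1: deny rules take precedence regardless of config ordering.
  for (const rule of rules) {
    if (rule.effect === "deny" && rule.pattern.test(toolCall)) return "deny";
  }
  // Pass 2: only then consider allows.
  for (const rule of rules) {
    if (rule.effect === "allow" && rule.pattern.test(toolCall)) return "allow";
  }
  // Nothing matched: fall through to asking the user.
  return "ask";
}

const rules: Rule[] = [
  { pattern: /^Bash\(rm .*\)$/, effect: "deny" },
  { pattern: /^Bash\(.*\)$/, effect: "allow" },
];
console.log(evaluate(rules, "Bash(rm -rf /)")); // "deny" beats the broad allow
console.log(evaluate(rules, "Bash(ls)"));       // "allow"
console.log(evaluate(rules, "WebFetch(url)"));  // "ask"
```

The two-pass structure is what makes the layer safe to compose: adding a generous allow rule can never accidentally override an existing deny.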
The really juicy part is the comparison with OpenClaw, an open source agent system. Same questions, completely different architecture. Claude Code does per-action permission checks. OpenClaw does perimeter access control. Same problem, two valid answers, different deployment assumptions.
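The architectural difference is easy to show side by side. In this hedged sketch (all class and method names are made up for illustration, not either project's API), the per-action style checks every tool call at use time, while the perimeter style decides trust once when the session opens and never re-checks inside it.

```typescript
type Call = { tool: string; args: string };

// Claude Code-style: every call passes through a guard at execution time.
class PerActionGuard {
  constructor(private allowedTools: Set<string>) {}
  run(call: Call, exec: (c: Call) => string): string {
    if (!this.allowedTools.has(call.tool)) throw new Error(`denied: ${call.tool}`);
    return exec(call);
  }
}

// Perimeter-style: one trust decision at session creation; calls inside
// the perimeter are not individually re-checked.
class PerimeterGuard {
  private constructor(private exec: (c: Call) => string) {}
  static openSession(trusted: boolean, exec: (c: Call) => string): PerimeterGuard {
    if (!trusted) throw new Error("session rejected at the perimeter");
    return new PerimeterGuard(exec);
  }
  run(call: Call): string {
    return this.exec(call); // no per-call check by design
  }
}
```

Per-action checks suit an agent running on a developer's own machine, where each command is individually risky; perimeter control suits an agent already confined to a sandbox or container, where the boundary itself is the guarantee.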
Paper: https://huggingface.co/papers/2604.14228
Best part: they end with the long-term capability paradox. Anthropic's own research shows developers using AI score 17% lower on comprehension. The architecture optimizes for short-term capability amplification but doesn't have explicit mechanisms for long-term human improvement. That's the open problem nobody is solving and probably the next big design vector for agent systems.