OneManCompany: agents organized like a real company beat SOTA by 15 points
Fresh on arXiv this week: OneManCompany, or OMC, takes the multi-agent system and builds it as an actual organization. Not a swarm, not a fixed pipeline of "researcher → coder → reviewer", but an honest-to-god company structure with portable agent identities, a recruitment market, and a hierarchical Explore-Execute-Review loop. The headline number: 84.67% on PRDBench, beating prior state-of-the-art by 15.48 percentage points, which puts the previous best at about 69.19%.
Three pieces make it work. First, Talents: agents aren't just instances of an LLM, they're identities that travel with their skills, tools, and history. Second, the Talent Market: when the org hits a capability gap, it hires a new Talent instead of overloading an existing agent. Third, the E²R framework: Explore, Execute, Review, organized as a tree search. Planning, execution, and evaluation aren't separate phases; they're the same loop run at different depths of the tree. The org reorganizes itself when the work demands it, the way a real company would.
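To make the three pieces concrete, here is a minimal Python sketch of how they might fit together. It is not the paper's implementation: every name in it (Talent, Task, TalentMarket, Org, owner_for, run) is hypothetical, skills are reduced to plain strings, and the "tree search" is simplified to a single recursive walk over a fixed task tree with no branching or backtracking.

```python
from dataclasses import dataclass, field


@dataclass
class Talent:
    """Hypothetical portable identity: skills and history travel with it."""
    name: str
    skills: set[str]
    history: list[str] = field(default_factory=list)


@dataclass
class Task:
    """Hypothetical task node; subtasks form the tree the loop recurses over."""
    name: str
    skill: str  # capability required to own this task
    subtasks: list["Task"] = field(default_factory=list)


class TalentMarket:
    """Hypothetical pool of hireable Talents outside the org."""

    def __init__(self, candidates):
        self.candidates = list(candidates)

    def hire(self, skill):
        # Return (and remove from the pool) the first candidate with the skill.
        for talent in self.candidates:
            if skill in talent.skills:
                self.candidates.remove(talent)
                return talent
        return None


class Org:
    def __init__(self, staff, market):
        self.staff = list(staff)
        self.market = market

    def owner_for(self, skill):
        # Prefer existing staff; on a capability gap, hire from the market.
        for talent in self.staff:
            if skill in talent.skills:
                return talent
        hired = self.market.hire(skill)
        if hired is not None:
            self.staff.append(hired)
        return hired

    def run(self, task):
        owner = self.owner_for(task.skill)
        if owner is None:
            return False  # gap that neither staff nor market can fill
        # Explore: descend into subtasks, running the same loop one level down.
        children_ok = [self.run(sub) for sub in task.subtasks]
        # Execute: stubbed here as recording the task in the owner's history.
        owner.history.append(task.name)
        # Review: accept this node only if every child was accepted.
        return all(children_ok)


ana = Talent("ana", {"planning"})
market = TalentMarket([Talent("bo", {"coding"}), Talent("cy", {"testing"})])
org = Org([ana], market)
ship = Task("ship", "planning", [Task("impl", "coding"), Task("verify", "testing")])
accepted = org.run(ship)
```

In this toy run the org starts with one planner and hires two more Talents as the subtasks expose gaps, so the staff list grows during execution. A fuller version would let Review reject a node and send Explore down an alternative branch, which is where an actual search would come in.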
Why this paper matters more than the average multi-agent paper: most multi-agent work right now copies one of two templates, either a fixed role assignment (researcher, coder, reviewer) or a debate format (two agents argue, a judge decides). Both have visible ceilings. OMC is doing something different. It treats the org structure itself as part of the search, and bets that a system that can hire, fire, and reshuffle Talents during execution will outperform one with a fixed cast. The 15-point jump on PRDBench is the empirical evidence that the structure-as-search idea is paying off.
Who should read it. Anyone building multi-agent systems for product work — PRD generation, codebase changes, research synthesis — where the work doesn't fit neatly into a fixed role-set. The OMC abstraction also pairs cleanly with the Skills synthesis happening across Anthropic, Google's agents-cli, and EvanFlow: Talents are basically Skills with identity persistence and a market mechanism. If Skills are the unit of capability, Talents are the unit of accountability. No GitHub repo at submission time, but the formalization is detailed enough to reimplement.
Paper: arxiv.org/abs/2604.22446. Authors: Zhengxu Yu, Yu Fu, Zhiyuan He, Yuxuan Huang, Lee Ka Yiu, Meng Fang, Weilin Luo, Jun Wang. Submitted April 24, 2026.