April 29, 2026 · Agents · Research · Skills

SkillSynth Builds Terminal Tasks From a Skill Graph

Tencent's Hunyuan team posted SkillSynth on April 28. It is the cleanest expression yet of the agent-pretraining thesis people have been talking about for months. The pitch: training command-line agents is bottlenecked by the scarcity of realistic terminal tasks. So instead of hand-writing tasks, build a large-scale skill graph where individual terminal skills are nodes, scenarios are intermediate transition nodes, and tasks are paths through the graph that get sampled and instantiated as runnable instances.
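The paper's implementation details aren't reproduced here, but the core structure is easy to sketch: a bipartite-ish graph alternating skill and scenario nodes, with a task defined as a random walk through it. Everything below (the `EDGES` table, the example skills and scenarios, `sample_task`) is illustrative, not taken from SkillSynth:

```python
import random

# Hypothetical skill graph: skills and scenarios are nodes; a task is a
# skill -> scenario -> skill -> ... path sampled from the graph.
SKILLS = {"grep-logs", "tar-archive", "chmod-fix"}
SCENARIOS = {"debug-service", "backup-rotation"}

# Edges connect skills to scenarios they can appear in, and scenarios
# back to skills that plausibly follow them.
EDGES = {
    "grep-logs": ["debug-service"],
    "chmod-fix": ["debug-service", "backup-rotation"],
    "tar-archive": ["backup-rotation"],
    "debug-service": ["chmod-fix", "grep-logs"],
    "backup-rotation": ["tar-archive", "chmod-fix"],
}

def sample_task(start_skill: str, hops: int, rng: random.Random) -> list[str]:
    """Random walk of `hops` steps; the visited nodes form one task path."""
    path = [start_skill]
    node = start_skill
    for _ in range(hops):
        node = rng.choice(EDGES[node])
        path.append(node)
    return path

rng = random.Random(0)
task = sample_task("grep-logs", 4, rng)
```

Each sampled path would then be "instantiated" into a runnable instance: a shell environment set up so that completing the task requires exercising exactly the skills on the path.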

The authors validate it on Terminal-Bench, the same benchmark Dirac dominated last week. SkillSynth-generated training data measurably improves Hy3 Preview, which is Hunyuan's not-yet-released next-generation model. The framework also gives explicit control over diversity: you can dial up the minimum trajectory diversity in the synthesized tasks, which is the part hand-curated benchmarks usually get wrong.
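The post doesn't spell out how that diversity dial works; one plausible reading is a rejection filter that keeps a new task only if its skill set is sufficiently far, say in Jaccard distance, from every task already kept. A minimal sketch under that assumption, with made-up skill names:

```python
def jaccard_distance(a: set[str], b: set[str]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; 0.0 means identical skill sets."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0

def filter_diverse(candidates: list[set[str]], min_dist: float) -> list[set[str]]:
    """Greedily keep tasks whose skill set is at least `min_dist` from all kept ones."""
    kept: list[set[str]] = []
    for skills in candidates:
        if all(jaccard_distance(skills, k) >= min_dist for k in kept):
            kept.append(skills)
    return kept

tasks = [{"grep", "tar"}, {"grep", "tar"}, {"grep", "chmod"}, {"sed", "awk"}]
diverse = filter_diverse(tasks, min_dist=0.5)  # drops the duplicate task
```

Raising `min_dist` trades volume for spread, which is the knob hand-curated benchmarks, fixed at whatever their authors happened to write, simply don't have.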

This paper joins a growing pile pointing the same direction. Anthropic Skills, Karpathy's skills, mattpocock/skills, awesome-codex-skills, EvanFlow, the Krakow group's Skills-Driven Workflows, OneManCompany's Talent Market β€” and now SkillSynth says the skill graph is also the right structure for synthesizing training data, not just for runtime task decomposition. Skills are turning out to be the unit of agent capability at every layer of the stack: data generation, training, inference, deployment.

The research bet here is that scale on skill-decomposed task synthesis works the same way scale on web text worked for base models. If Hunyuan ships a meaningful jump on Terminal-Bench in the next quarter, this paper will be the most-cited methodology paper of the agent pretraining era.

Link: https://arxiv.org/abs/2604.25727
