April 11, 2026 · Research · Open Source · Agents

Tencent Open-Sources HY-Embodied — A 2B Brain for Robot Agents

Tencent just open-sourced HY-Embodied-0.5, and it might be the most practical embodied AI model released this year. It comes in two variants: a compact 2B model that runs on edge devices, and a 32B model for heavier reasoning. The 2B version already outperforms similarly sized models across 16 benchmarks.

The architecture is clever. They use a Mixture-of-Transformers (MoT) design with latent tokens for modality-specific computation. In plain English: the model routes different types of input — text instructions, visual scenes, spatial data — through dedicated parameters, without wasting compute on irrelevant modalities. This is specifically designed for real-world robots that need to see, understand, plan, and act, all in real time on limited hardware.
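The routing idea can be sketched in a few lines. This is a minimal illustration of modality-specific computation, not the HY-Embodied implementation: assume each token carries a modality tag, and only the matching modality-specific expert processes it. All names (`route_by_modality`, the toy experts) are hypothetical.

```python
def route_by_modality(tokens, experts):
    """Apply each modality's expert only to that modality's tokens.

    tokens:  list of (modality, vector) pairs
    experts: dict mapping modality -> callable over a batch of vectors
    Returns expert outputs in the original token order.
    """
    # Bucket token indices per modality so each expert runs once per
    # batch, rather than once per token.
    buckets = {}
    for i, (modality, _vec) in enumerate(tokens):
        buckets.setdefault(modality, []).append(i)

    outputs = [None] * len(tokens)
    for modality, indices in buckets.items():
        batch = [tokens[i][1] for i in indices]
        results = experts[modality](batch)  # modality-specific compute only
        for i, out in zip(indices, results):
            outputs[i] = out
    return outputs


# Toy experts standing in for modality-specific transformer blocks:
# "text" doubles values, "vision" negates them.
experts = {
    "text":   lambda batch: [[2 * x for x in v] for v in batch],
    "vision": lambda batch: [[-x for x in v] for v in batch],
}

tokens = [("text", [1.0, 2.0]), ("vision", [3.0]), ("text", [4.0])]
print(route_by_modality(tokens, experts))
# [[2.0, 4.0], [-3.0], [8.0]]
```

The point of the design: no expert ever sees tokens from another modality, so compute scales with the tokens each modality actually contributes.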

What makes HY-Embodied different from generic vision-language models is the focus on spatial-temporal perception and embodied reasoning. It doesn't just describe what it sees — it predicts what will happen, plans interactions, and reasons about physical constraints. The 32B variant hits frontier-level performance comparable to Gemini 3.0 Pro on embodied tasks.

The training approach is interesting too. They use a self-evolving post-training paradigm where the larger model teaches the smaller one through on-policy distillation. The result is a 2B model that punches way above its weight — small enough for a robot's onboard compute, smart enough to actually be useful.
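The "on-policy" part is worth spelling out: the student is scored on tokens drawn from its *own* distribution, not the teacher's. Here is a hedged toy sketch on a 3-token vocabulary; averaging the student/teacher log-ratio over the student's samples is a Monte-Carlo estimate of the reverse KL divergence, a common distillation objective. The distributions and helper names are illustrative, not from the HY-Embodied training pipeline.

```python
import math
import random

def sample(probs, rng):
    """Draw an index from a discrete distribution."""
    r, cum = rng.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

teacher = [0.7, 0.2, 0.1]   # larger model's next-token distribution
student = [0.4, 0.3, 0.3]   # smaller model's distribution

# On-policy: sample tokens from the student, score them against the
# teacher. The mean log-ratio estimates KL(student || teacher).
rng = random.Random(0)
n = 20000
est = sum(math.log(student[i] / teacher[i])
          for i in (sample(student, rng) for _ in range(n))) / n

exact = sum(s * math.log(s / t) for s, t in zip(student, teacher))
print(round(est, 3), round(exact, 3))  # estimate tracks the exact KL
```

Minimizing this objective pushes the student's probability mass away from tokens the teacher considers unlikely, on exactly the outputs the student would actually produce.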

343 upvotes on HuggingFace Papers. Weights, code, and training pipeline all open-sourced. If you're building anything in robotics or physical AI, this just became your baseline.

https://github.com/Tencent-Hunyuan/HY-Embodied