Online Experiential Learning: Microsoft's Framework for Agents That Improve from Deployment
Microsoft Research has released Online Experiential Learning (OEL), a framework that enables language models to continuously improve from their own deployment experience. The paper appeared on Hugging Face Daily Papers with 35 upvotes, and code is available.
OEL works in two stages. First, transferable experiential knowledge is extracted and accumulated from interaction trajectories collected during real-world use. Second, this knowledge is consolidated into model parameters via on-policy context distillation — importantly, requiring no access to the user-side environment.
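The two-stage loop can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's actual API: the `Trajectory` and `KnowledgeBase` types, and the way insights are extracted, are hypothetical placeholders for whatever extraction procedure the paper uses.

```python
from dataclasses import dataclass, field

@dataclass
class Trajectory:
    # A single deployment interaction: the task, the agent's steps,
    # and whether the attempt succeeded. (Hypothetical schema.)
    task: str
    steps: list[str]
    success: bool

@dataclass
class KnowledgeBase:
    # Accumulated transferable insights extracted from trajectories.
    insights: list[str] = field(default_factory=list)

def extract_knowledge(trajectories: list[Trajectory],
                      kb: KnowledgeBase) -> KnowledgeBase:
    # Stage 1: distill transferable experiential knowledge from raw
    # trajectories -- e.g. summaries of what worked and what failed --
    # rather than keeping the trajectories themselves.
    for t in trajectories:
        tag = "worked" if t.success else "failed"
        kb.insights.append(f"[{tag}] {t.task}: {t.steps[-1]}")
    return kb

def consolidate(update_model, kb: KnowledgeBase):
    # Stage 2: fold the accumulated knowledge into model parameters via
    # on-policy context distillation. Only the knowledge text is needed,
    # so no access to the user-side environment is required.
    context = "\n".join(kb.insights)
    return update_model(context)
```

The point of the sketch is the separation of concerns: stage 1 touches deployment data once to produce compact knowledge, and stage 2 trains only against that knowledge.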
The results show consistent improvements over successive iterations, enhancing both task accuracy and token efficiency while preserving out-of-distribution performance. The key insight: extracted experiential knowledge is significantly more effective than raw trajectories, and on-policy consistency between the knowledge source and the policy model is critical for effective learning.
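Context distillation itself can be illustrated with a minimal sketch, assuming the standard formulation: a teacher distribution (the policy conditioned on the extracted knowledge in its context) is matched by a student (the same policy without that context) by minimizing a KL divergence. The function names and the toy discrete distributions are illustrative, not the paper's implementation.

```python
import math

def kl_divergence(p: list[float], q: list[float]) -> float:
    # KL(p || q) over a discrete next-token distribution.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def context_distillation_loss(teacher_probs: list[list[float]],
                              student_probs: list[list[float]]) -> float:
    # Teacher: policy with extracted knowledge placed in context.
    # Student: the same policy without the knowledge; minimizing this
    # loss absorbs the knowledge into the parameters. "On-policy" means
    # the sequences being scored are sampled from the student itself,
    # keeping the knowledge source consistent with the policy model.
    per_token = [kl_divergence(t, s)
                 for t, s in zip(teacher_probs, student_probs)]
    return sum(per_token) / len(per_token)
```

If the student already matches the teacher, the loss is zero; any divergence between the knowledge-conditioned and bare distributions produces a positive training signal.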
This addresses a fundamental challenge for deployed agents: how to get better over time without retraining on user data. Current agents are static after deployment — OEL provides a mechanism for agents to learn from what works and what doesn't in production, without compromising user privacy.
Paper: https://arxiv.org/abs/2603.16856
Code: https://aka.ms/oel-code