March 18, 2026 · Research · Open Source · Agents

OpenSeeker: First Fully Open-Source Search Agent with Complete Training Data

OpenSeeker is the first academic project to achieve state-of-the-art performance on frontier search benchmarks while fully open-sourcing its entire training pipeline and data. At the time of writing, the paper has 104 upvotes on Hugging Face Daily Papers.

Using only 11.7K training examples, the team fine-tuned Qwen3-30B-A3B-Thinking to results that surpass industrial competitors, including Tongyi DeepResearch on BrowseComp-ZH (48.4% vs. 46.7%). The project open-sources the complete synthesis pipeline, the high-fidelity training data, and the model weights.

OpenSeeker addresses a critical gap in the search agent space: while commercial search agents (like Perplexity, Google Deep Research, and Tongyi DeepResearch) have achieved strong results, their training data and methods remain proprietary. OpenSeeker democratizes this capability by providing the full recipe.

For the agentic ecosystem, this is significant because search is one of the most fundamental agent capabilities. Open-sourcing the complete training pipeline means any team can now build competitive search agents without proprietary data.
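To make the capability concrete, here is a minimal sketch of the query-search-read-answer loop that search agents of this kind are trained to perform. This is not OpenSeeker's actual pipeline; the `search` function is a stubbed stand-in for a real web-search tool, and a trained model, not the hard-coded rule below, would decide when to stop or refine the query.

```python
# Minimal sketch of a generic search-agent loop.
# NOT OpenSeeker's actual pipeline -- an illustration only.

def search(query: str) -> list[str]:
    """Stub: a real agent would call a web-search API here."""
    corpus = {
        "capital of france": ["Paris is the capital of France."],
    }
    return corpus.get(query.lower(), ["No results found."])

def run_agent(question: str, max_steps: int = 3) -> str:
    """Iteratively issue search queries, collecting evidence until done."""
    notes: list[str] = []
    query = question
    for _ in range(max_steps):
        results = search(query)
        notes.extend(results)
        # A trained model would decide here whether to answer or to
        # refine the query; this stub stops once any result is found.
        if results != ["No results found."]:
            break
        query = question.lower()  # trivial "refinement" for the sketch
    return " ".join(notes)

print(run_agent("Capital of France"))
```

The value of open-sourcing the full recipe is precisely that the model-driven decisions stubbed out above (when to search, how to rewrite the query, when to answer) are what the released training data teaches.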

GitHub: https://github.com/rui-ye/OpenSeeker
Paper: https://arxiv.org/abs/2603.15594