Quark Medical Alignment: A Holistic Multi-Dimensional Alignment and Collaborative Optimization Paradigm Paper • 2602.11661 • Published Feb 12 • 2
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published Feb 11 • 195
Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making Paper • 2602.06570 • Published Feb 6 • 61
RE-TRAC: REcursive TRAjectory Compression for Deep Search Agents Paper • 2602.02486 • Published Feb 2 • 20
Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem Paper • 2512.24873 • Published Dec 31, 2025 • 108
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published Dec 31, 2025 • 154
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 195
Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management Paper • 2510.06727 • Published Oct 8, 2025 • 5
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12, 2025 • 215
Reinforcement Learning Foundations for Deep Research Systems: A Survey Paper • 2509.06733 • Published Sep 8, 2025 • 32
DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper • 2510.21618 • Published Oct 24, 2025 • 103
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search Paper • 2510.12801 • Published Oct 14, 2025 • 13
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels Paper • 2510.06499 • Published Oct 7, 2025 • 33
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents Paper • 2509.06501 • Published Sep 8, 2025 • 82