MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification • Paper • 2603.15726 • Published about 1 month ago • 185
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model • Paper • 2510.18855 • Published Oct 21, 2025 • 73
GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning • Paper • 2509.17437 • Published Sep 22, 2025 • 17
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs • Paper • 2506.14731 • Published Jun 17, 2025 • 8