Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6, 2025 • 129
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23, 2025 • 88
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward Paper • 2505.17018 • Published May 22, 2025 • 15
Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning Paper • 2505.14684 • Published May 20, 2025 • 24
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification Paper • 2505.16938 • Published May 22, 2025 • 121
Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning Paper • 2505.16410 • Published May 22, 2025 • 58
MMaDA: Multimodal Large Diffusion Language Models Paper • 2505.15809 • Published May 21, 2025 • 98
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning Paper • 2503.16252 • Published Mar 20, 2025 • 32