Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning Paper • 2509.03646 • Published Sep 3, 2025 • 33
Reverse-Engineered Reasoning for Open-Ended Generation Paper • 2509.06160 • Published Sep 7, 2025 • 151
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26, 2025 • 26
Dr. Bench: A Multidimensional Evaluation for Deep Research Agents, from Answers to Reports Paper • 2510.02190 • Published Jan 29 • 19
Infinity Parser: Layout Aware Reinforcement Learning for Scanned Document Parsing Paper • 2510.15349 • Published Oct 17, 2025
From Illusion to Intention: Visual Rationale Learning for Vision-Language Reasoning Paper • 2511.23031 • Published Nov 28, 2025 • 1
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published Jan 22 • 92
Understanding by Reconstruction: Reversing the Software Development Process for LLM Pretraining Paper • 2603.11103 • Published Mar 11 • 9
SWE-QA-Pro: A Representative Benchmark and Scalable Training Recipe for Repository-Level Code Understanding Paper • 2603.16124 • Published Mar 17 • 3
LongCat-Next: Lexicalizing Modalities as Discrete Tokens Paper • 2603.27538 • Published 20 days ago • 143
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published 5 days ago • 99
MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies Paper • 2603.24649 • Published 24 days ago • 31
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1, 2025 • 81
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning Paper • 2505.15966 • Published May 21, 2025 • 53
StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs Paper • 2505.20139 • Published May 26, 2025 • 19
Beyond Distillation: Pushing the Limits of Medical LLM Reasoning with Minimalist Rule-Based RL Paper • 2505.17952 • Published May 23, 2025 • 20
Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models Paper • 2505.19509 • Published May 26, 2025 • 7