-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 63 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53
Collections
Discover the best community collections!
Collections including paper arxiv:2509.07980
-
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Paper • 2511.16334 • Published • 96 -
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
Paper • 2509.04475 • Published • 3 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 106
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 282 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 265 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 254 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 263
-
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning
Paper • 2509.08755 • Published • 56 -
The Majority is not always right: RL training for solution aggregation
Paper • 2509.06870 • Published • 15 -
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning
Paper • 2509.03646 • Published • 33
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
Tree Search for LLM Agent Reinforcement Learning
Paper • 2509.21240 • Published • 92 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 106 -
How Far Are We from Genuinely Useful Deep Research Agents?
Paper • 2512.01948 • Published • 57
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
Robot Learning from a Physical World Model
Paper • 2511.07416 • Published • 32 -
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
Paper • 2511.06805 • Published • 13 -
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
Paper • 2511.17592 • Published • 121
-
ExGRPO: Learning to Reason from Experience
Paper • 2510.02245 • Published • 83 -
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Paper • 2508.07407 • Published • 99 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 118 -
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
Paper • 2508.19828 • Published • 8
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
Paper • 2509.04475 • Published • 3 -
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning
Paper • 2505.17813 • Published • 58 -
Deep Think with Confidence
Paper • 2508.15260 • Published • 90
-
Visual Representation Alignment for Multimodal Large Language Models
Paper • 2509.07979 • Published • 84 -
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 213 -
Why Language Models Hallucinate
Paper • 2509.04664 • Published • 199
-
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models
Paper • 2410.14059 • Published • 63 -
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
Paper • 2503.05179 • Published • 46 -
Token-Efficient Long Video Understanding for Multimodal LLMs
Paper • 2503.04130 • Published • 96 -
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing
Paper • 2503.10639 • Published • 53
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
Tree Search for LLM Agent Reinforcement Learning
Paper • 2509.21240 • Published • 92 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 106 -
How Far Are We from Genuinely Useful Deep Research Agents?
Paper • 2512.01948 • Published • 57
-
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Paper • 2511.16334 • Published • 96 -
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
Paper • 2509.04475 • Published • 3 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 106
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
Robot Learning from a Physical World Model
Paper • 2511.07416 • Published • 32 -
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
Paper • 2511.06805 • Published • 13 -
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
Paper • 2511.17592 • Published • 121
-
ExGRPO: Learning to Reason from Experience
Paper • 2510.02245 • Published • 83 -
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
Paper • 2508.07407 • Published • 99 -
rStar2-Agent: Agentic Reasoning Technical Report
Paper • 2508.20722 • Published • 118 -
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
Paper • 2508.19828 • Published • 8
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 282 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 265 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 254 -
A Survey of Context Engineering for Large Language Models
Paper • 2507.13334 • Published • 263
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
Paper • 2509.04475 • Published • 3 -
Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning
Paper • 2505.17813 • Published • 58 -
Deep Think with Confidence
Paper • 2508.15260 • Published • 90
-
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning
Paper • 2509.08755 • Published • 56 -
The Majority is not always right: RL training for solution aggregation
Paper • 2509.06870 • Published • 15 -
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
Emergent Hierarchical Reasoning in LLMs through Reinforcement Learning
Paper • 2509.03646 • Published • 33
-
Visual Representation Alignment for Multimodal Large Language Models
Paper • 2509.07979 • Published • 84 -
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 105 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 213 -
Why Language Models Hallucinate
Paper • 2509.04664 • Published • 199