Collections
Collections including paper arxiv:2602.11964
-
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 18 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
Scaling Embeddings Outperforms Scaling Experts in Language Models
Paper • 2601.21204 • Published • 102 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 42
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 104 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
-
MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods
Paper • 2601.21821 • Published • 62 -
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
Paper • 2601.22975 • Published • 110 -
Reinforced Attention Learning
Paper • 2602.04884 • Published • 29 -
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 63
-
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis
Paper • 2505.13227 • Published • 45 -
facebook/natural_reasoning
Viewer • Updated • 1.15M • 1.46k • 561 -
nvidia/OpenMathReasoning
Viewer • Updated • 5.68M • 17.6k • 453 -
Search Arena: Analyzing Search-Augmented LLMs
Paper • 2506.05334 • Published • 18