Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2602.11964

Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments

Paper • 2602.11964 • Published Feb 12 • 13

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 102
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 42

Reinforcement learning

about 9 hours ago

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30, 2024 • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4, 2025 • 104
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25, 2025 • 75

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Paper • 2601.21821 • Published Jan 29 • 62
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 110
Reinforced Attention Learning

Paper • 2602.04884 • Published Feb 4 • 29
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts

Paper • 2510.19363 • Published Oct 22, 2025 • 63

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19, 2025 • 45
facebook/natural_reasoning

Viewer • Updated Feb 21, 2025 • 1.15M • 1.46k • 561
nvidia/OpenMathReasoning

Viewer • Updated May 27, 2025 • 5.68M • 17.6k • 453
Search Arena: Analyzing Search-Augmented LLMs

Paper • 2506.05334 • Published Jun 5, 2025 • 18

Gaia2: Benchmarking LLM Agents on Dynamic and Asynchronous Environments

Paper • 2602.11964 • Published Feb 12 • 13

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Paper • 2601.21821 • Published Jan 29 • 62
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 110
Reinforced Attention Learning

Paper • 2602.04884 • Published Feb 4 • 29
LoongRL:Reinforcement Learning for Advanced Reasoning over Long Contexts

Paper • 2510.19363 • Published Oct 22, 2025 • 63

Endless Terminals: Scaling RL Environments for Terminal Agents

Paper • 2601.16443 • Published Jan 23 • 18
Linear representations in language models can change dramatically over a conversation

Paper • 2601.20834 • Published Jan 28 • 21
Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 102
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability

Paper • 2601.18778 • Published Jan 26 • 42

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Paper • 2505.13227 • Published May 19, 2025 • 45
facebook/natural_reasoning

Viewer • Updated Feb 21, 2025 • 1.15M • 1.46k • 561
nvidia/OpenMathReasoning

Viewer • Updated May 27, 2025 • 5.68M • 17.6k • 453
Search Arena: Analyzing Search-Augmented LLMs

Paper • 2506.05334 • Published Jun 5, 2025 • 18

Reinforcement learning

about 9 hours ago

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30, 2024 • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4, 2025 • 104
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25, 2025 • 75

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs