Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2506.10910

Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 68

Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 68
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute

Paper • 2506.15882 • Published Jun 18, 2025 • 2
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19, 2025 • 136
The Invisible Leash: Why RLVR May Not Escape Its Origin

Paper • 2507.14843 • Published Jul 20, 2025 • 85

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 265
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23, 2025 • 62
Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 68

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10, 2025 • 44
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14, 2025 • 30
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14, 2025 • 85

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model

Paper • 2410.13639 • Published Oct 17, 2024 • 19
MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Paper • 2410.13757 • Published Oct 17, 2024 • 32
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 61
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS

Paper • 2411.18478 • Published Nov 27, 2024 • 37

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 211
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Paper • 2508.14444 • Published Aug 20, 2025 • 47
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7, 2025 • 67
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16, 2025 • 274

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published Jun 5, 2025 • 135
Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 68
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs

Paper • 2506.07240 • Published Jun 8, 2025 • 7
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Paper • 2506.09991 • Published Jun 11, 2025 • 55

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 196 • 99
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 39
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

Papers + RL/Reasoning

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 146
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published Apr 7, 2025 • 26
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Paper • 2504.08600 • Published Apr 11, 2025 • 33
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Paper • 2504.11343 • Published Apr 15, 2025 • 20

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 7
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 24
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 15
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69

Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 68

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8, 2025 • 211
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Paper • 2508.14444 • Published Aug 20, 2025 • 47
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

Paper • 2507.06261 • Published Jul 7, 2025 • 67
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16, 2025 • 274

Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 68
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute

Paper • 2506.15882 • Published Jun 18, 2025 • 2
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19, 2025 • 136
The Invisible Leash: Why RLVR May Not Escape Its Origin

Paper • 2507.14843 • Published Jul 20, 2025 • 85

Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models

Paper • 2506.06395 • Published Jun 5, 2025 • 135
Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 68
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs

Paper • 2506.07240 • Published Jun 8, 2025 • 7
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Paper • 2506.09991 • Published Jun 11, 2025 • 55

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 265
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23, 2025 • 62
Magistral

Paper • 2506.10910 • Published Jun 12, 2025 • 68

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 196 • 99
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 39
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10, 2025 • 44
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14, 2025 • 30
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14, 2025 • 85

Papers + RL/Reasoning

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 146
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published Apr 7, 2025 • 26
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning

Paper • 2504.08600 • Published Apr 11, 2025 • 33
A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Paper • 2504.11343 • Published Apr 15, 2025 • 20

A Comparative Study on Reasoning Patterns of OpenAI's o1 Model

Paper • 2410.13639 • Published Oct 17, 2024 • 19
MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Paper • 2410.13757 • Published Oct 17, 2024 • 32
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 61
Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS

Paper • 2411.18478 • Published Nov 27, 2024 • 37

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 7
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 24
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 15
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs