Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2505.17612

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 513
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

Paper • 2510.07499 • Published Oct 8, 2025 • 49
Improving Context Fidelity via Native Retrieval-Augmented Reasoning

Paper • 2509.13683 • Published Sep 17, 2025 • 8
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering

Paper • 2509.00798 • Published Aug 31, 2025 • 1

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 339
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 191

LLMs Distillation

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
Synthetic Data RL: Task Definition Is All You Need

Paper • 2505.17063 • Published May 18, 2025 • 11

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published May 15, 2025 • 24
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Paper • 2505.09343 • Published May 14, 2025 • 76
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15, 2025 • 120
Scaling Reasoning can Improve Factuality in Large Language Models

Paper • 2505.11140 • Published May 16, 2025 • 7

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

Paper • 2506.21506 • Published Jun 26, 2025 • 52
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
Efficient Agent Training for Computer Use

Paper • 2505.13909 • Published May 20, 2025 • 44
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16, 2025 • 117

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81

Smol Agents papers

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 56
MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 98
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Paper • 2505.21600 • Published May 27, 2025 • 71
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28, 2025 • 46

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published May 5, 2025 • 82
TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations

Paper • 2505.18125 • Published May 23, 2025 • 112
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23, 2025 • 62

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 513
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

Paper • 2510.07499 • Published Oct 8, 2025 • 49
Improving Context Fidelity via Native Retrieval-Augmented Reasoning

Paper • 2509.13683 • Published Sep 17, 2025 • 8
Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering

Paper • 2509.00798 • Published Aug 31, 2025 • 1

Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

Paper • 2506.21506 • Published Jun 26, 2025 • 52
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
Efficient Agent Training for Computer Use

Paper • 2505.13909 • Published May 20, 2025 • 44
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16, 2025 • 117

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 339
Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 191

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81

LLMs Distillation

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81

Smol Agents papers

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
Synthetic Data RL: Task Definition Is All You Need

Paper • 2505.17063 • Published May 18, 2025 • 11

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Paper • 2505.15045 • Published May 21, 2025 • 56
MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 98
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Paper • 2505.21600 • Published May 27, 2025 • 71
Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Paper • 2505.22453 • Published May 28, 2025 • 46

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published May 15, 2025 • 24
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Paper • 2505.09343 • Published May 14, 2025 • 76
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15, 2025 • 120
Scaling Reasoning can Improve Factuality in Large Language Models

Paper • 2505.11140 • Published May 16, 2025 • 7

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published May 5, 2025 • 82
TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations

Paper • 2505.18125 • Published May 23, 2025 • 112
Distilling LLM Agent into Small Models with Retrieval and Code Tools

Paper • 2505.17612 • Published May 23, 2025 • 81
One RL to See Them All: Visual Triple Unified Reinforcement Learning

Paper • 2505.18129 • Published May 23, 2025 • 62

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs