- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 513
- When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
  Paper • 2510.07499 • Published • 49
- Improving Context Fidelity via Native Retrieval-Augmented Reasoning
  Paper • 2509.13683 • Published • 8
- Multimodal Iterative RAG for Knowledge-Intensive Visual Question Answering
  Paper • 2509.00798 • Published • 1

Collections including paper arxiv:2505.17612
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 88
- Distilling LLM Agent into Small Models with Retrieval and Code Tools
  Paper • 2505.17612 • Published • 81
- Qwen3 Technical Report
  Paper • 2505.09388 • Published • 339
- Absolute Zero: Reinforced Self-play Reasoning with Zero Data
  Paper • 2505.03335 • Published • 191

- J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
  Paper • 2505.10320 • Published • 24
- Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
  Paper • 2505.09343 • Published • 76
- Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models
  Paper • 2505.10554 • Published • 120
- Scaling Reasoning can Improve Factuality in Large Language Models
  Paper • 2505.11140 • Published • 7

- Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge
  Paper • 2506.21506 • Published • 52
- Distilling LLM Agent into Small Models with Retrieval and Code Tools
  Paper • 2505.17612 • Published • 81
- Efficient Agent Training for Computer Use
  Paper • 2505.13909 • Published • 44
- Scaling Agents via Continual Pre-training
  Paper • 2509.13310 • Published • 117

- Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective
  Paper • 2505.15045 • Published • 56
- MMaDA: Multimodal Large Diffusion Language Models
  Paper • 2505.15809 • Published • 98
- R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing
  Paper • 2505.21600 • Published • 71
- Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO
  Paper • 2505.22453 • Published • 46

- Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
  Paper • 2505.02567 • Published • 82
- TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations
  Paper • 2505.18125 • Published • 112
- Distilling LLM Agent into Small Models with Retrieval and Code Tools
  Paper • 2505.17612 • Published • 81
- One RL to See Them All: Visual Triple Unified Reinforcement Learning
  Paper • 2505.18129 • Published • 62