-
Beyond Language Modeling: An Exploration of Multimodal Pretraining
Paper • 2603.03276 • Published • 103 -
Qwen3-Coder-Next Technical Report
Paper • 2603.00729 • Published • 64 -
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
Paper • 2603.03205 • Published • 13 -
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
Paper • 2602.23166 • Published • 45
Collections
Discover the best community collections!
Collections including paper arxiv:2603.03205
-
MMGR: Multi-Modal Generative Reasoning
Paper • 2512.14691 • Published • 121 -
KlingAvatar 2.0 Technical Report
Paper • 2512.13313 • Published • 44 -
SemanticGen: Video Generation in Semantic Space
Paper • 2512.20619 • Published • 95 -
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
Paper • 2512.16676 • Published • 222
-
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
Paper • 2510.11696 • Published • 182 -
Does Your Reasoning Model Implicitly Know When to Stop Thinking?
Paper • 2602.08354 • Published • 263 -
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
Paper • 2603.03205 • Published • 13 -
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs
Paper • 2603.02083 • Published • 9
-
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 18 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
Scaling Embeddings Outperforms Scaling Experts in Language Models
Paper • 2601.21204 • Published • 102 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 42
-
Universal Deep Research: Bring Your Own Model and Strategy
Paper • 2509.00244 • Published • 14 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 238 -
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
Paper • 2510.00515 • Published • 42 -
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
Paper • 2509.25454 • Published • 148
-
Beyond Language Modeling: An Exploration of Multimodal Pretraining
Paper • 2603.03276 • Published • 103 -
Qwen3-Coder-Next Technical Report
Paper • 2603.00729 • Published • 64 -
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
Paper • 2603.03205 • Published • 13 -
AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
Paper • 2602.23166 • Published • 45
-
Endless Terminals: Scaling RL Environments for Terminal Agents
Paper • 2601.16443 • Published • 18 -
Linear representations in language models can change dramatically over a conversation
Paper • 2601.20834 • Published • 21 -
Scaling Embeddings Outperforms Scaling Experts in Language Models
Paper • 2601.21204 • Published • 102 -
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability
Paper • 2601.18778 • Published • 42
-
MMGR: Multi-Modal Generative Reasoning
Paper • 2512.14691 • Published • 121 -
KlingAvatar 2.0 Technical Report
Paper • 2512.13313 • Published • 44 -
SemanticGen: Video Generation in Semantic Space
Paper • 2512.20619 • Published • 95 -
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI
Paper • 2512.16676 • Published • 222
-
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs
Paper • 2510.11696 • Published • 182 -
Does Your Reasoning Model Implicitly Know When to Stop Thinking?
Paper • 2602.08354 • Published • 263 -
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
Paper • 2603.03205 • Published • 13 -
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs
Paper • 2603.02083 • Published • 9
-
Universal Deep Research: Bring Your Own Model and Strategy
Paper • 2509.00244 • Published • 14 -
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey
Paper • 2509.02547 • Published • 238 -
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
Paper • 2510.00515 • Published • 42 -
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
Paper • 2509.25454 • Published • 148