- Close the Loop: Synthesizing Infinite Tool-Use Data via Multi-Agent Role-Playing
  Paper • 2512.23611 • Published • 6
- Context as a Tool: Context Management for Long-Horizon SWE-Agents
  Paper • 2512.22087 • Published • 3
- AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications
  Paper • 2508.16279 • Published • 61
- Very Large-Scale Multi-Agent Simulation in AgentScope
  Paper • 2407.17789 • Published • 41
Collections including paper arxiv:2603.00729
- OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
  Paper • 2601.15369 • Published • 21
- Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
  Paper • 2601.15892 • Published • 53
- Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
  Paper • 2601.16208 • Published • 55
- NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems
  Paper • 2601.11004 • Published • 30

- The Debugging Decay Index: Rethinking Debugging Strategies for Code LLMs
  Paper • 2506.18403 • Published • 3
- ReCode: Updating Code API Knowledge with Reinforcement Learning
  Paper • 2506.20495 • Published • 10
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
  Paper • 2507.23348 • Published • 12
- LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering
  Paper • 2509.09614 • Published • 7

- Beyond Language Modeling: An Exploration of Multimodal Pretraining
  Paper • 2603.03276 • Published • 103
- Qwen3-Coder-Next Technical Report
  Paper • 2603.00729 • Published • 64
- Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use
  Paper • 2603.03205 • Published • 13
- AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios
  Paper • 2602.23166 • Published • 45

- GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
  Paper • 2508.06471 • Published • 211
- Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
  Paper • 2508.00414 • Published • 94
- Continuous Autoregressive Language Models
  Paper • 2510.27688 • Published • 74
- MiMo-Embodied: X-Embodied Foundation Model Technical Report
  Paper • 2511.16518 • Published • 26

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 24
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 153
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25