AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents Paper • 2603.14465 • Published Mar 15 • 23
MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning Paper • 2603.03379 • Published Mar 3 • 32
Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning Paper • 2601.04726 • Published Jan 8 • 9
MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning Paper • 2603.03379 • Published Mar 3 • 32
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published Feb 11 • 59
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published Feb 11 • 59
LawThinker: A Deep Research Legal Agent in Dynamic Environments Paper • 2602.12056 • Published Feb 12 • 34
LawThinker: A Deep Research Legal Agent in Dynamic Environments Paper • 2602.12056 • Published Feb 12 • 34
Budget-Constrained Agentic Large Language Models: Intention-Based Planning for Costly Tool Use Paper • 2602.11541 • Published Feb 12 • 2
ASA: Training-Free Representation Engineering for Tool-Calling Agents Paper • 2602.04935 • Published Feb 4 • 42
ASA: Training-Free Representation Engineering for Tool-Calling Agents Paper • 2602.04935 • Published Feb 4 • 42
DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents Paper • 2602.07035 • Published Feb 3 • 30
MemMamba: Rethinking Memory Patterns in State Space Model Paper • 2510.03279 • Published Sep 28, 2025 • 74
GISA: A Benchmark for General Information-Seeking Assistant Paper • 2602.08543 • Published Feb 9 • 26
Deep Search with Hierarchical Meta-Cognitive Monitoring Inspired by Cognitive Neuroscience Paper • 2601.23188 • Published Jan 30 • 9
DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution Paper • 2601.13761 • Published Jan 20 • 16
When Personalization Misleads: Understanding and Mitigating Hallucinations in Personalized LLMs Paper • 2601.11000 • Published Jan 16 • 27
MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching Paper • 2601.10712 • Published Jan 15 • 24
ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration Paper • 2601.06860 • Published Jan 11 • 16