Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 6 days ago • 306
AgentHallu: Benchmarking Automated Hallucination Attribution of LLM-based Agents Paper • 2601.06818 • Published Jan 11 • 1
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security Paper • 2601.18491 • Published Jan 26 • 125
Controlled Self-Evolution for Algorithmic Code Optimization Paper • 2601.07348 • Published Jan 12 • 116