Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 8 days ago • 310
ATBench: A Diverse and Realistic Trajectory Benchmark for Long-Horizon Agent Safety Paper • 2604.02022 • Published 14 days ago • 15
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security Paper • 2601.18491 • Published Jan 26 • 125
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5 Paper • 2602.14457 • Published Feb 16 • 29
Geometrically-Constrained Agent for Spatial Reasoning Paper • 2511.22659 • Published Nov 27, 2025 • 41