Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes Paper • 2603.25562 • Published 20 days ago • 13
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 2 days ago • 60
Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing Paper • 2604.02288 • Published 14 days ago • 30
Seeing Isn’t Understanding: The Spatial Reasoning Gap in Vision-Language Models Article • Jul 13, 2025 • 11