Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective Paper • 2509.22613 • Published Sep 26, 2025 • 10
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always! Paper • 2509.26495 • Published Sep 30, 2025 • 13
Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention Paper • 2509.23610 • Published Sep 28, 2025 • 15
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications Paper • 2509.26490 • Published Sep 30, 2025 • 20
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners Paper • 2509.26226 • Published Sep 30, 2025 • 34
Thinking Sparks!: Emergent Attention Heads in Reasoning Models During Post Training Paper • 2509.25758 • Published Sep 30, 2025 • 23
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning Paper • 2509.23873 • Published Sep 28, 2025 • 68