Self-Distilled Reasoner: On-Policy Self-Distillation for Large Language Models • Paper • 2601.18734 • Published Jan 26, 2026
Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning • Paper • 2508.19828 • Published Aug 27, 2025