SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization • Paper 2604.02268 • Published 14 days ago • 93
ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection • Paper 2601.09195 • Published Jan 14 • 15
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests • Paper 2601.06953 • Published Jan 11 • 46