Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora Paper • 2604.24819 • Published 10 days ago • 86
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 627
ClawBench: Can AI Agents Complete Everyday Online Tasks? Paper • 2604.08523 • Published 28 days ago • 261