Agent Lightning: Train ANY AI Agents with Reinforcement Learning • Paper • 2508.03680 • Published Aug 5, 2025 • 140 • 8
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents • Paper • 2407.16741 • Published Jul 23, 2024 • 77 • 7
Query-focused and Memory-aware Reranker for Long Context Processing • Paper • 2602.12192 • Published Feb 12 • 57 • 5
Is Artificial Intelligence Generated Image Detection a Solved Problem? • Paper • 2505.12335 • Published May 18, 2025 • 2
Everything is Context: Agentic File System Abstraction for Context Engineering • Paper • 2512.05470 • Published Dec 5, 2025 • 1 • 2
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability • Paper • 2604.06628 • Published 5 days ago • 263 • 7
When Numbers Speak: Aligning Textual Numerals and Visual Instances in Text-to-Video Diffusion Models • Paper • 2604.08546 • Published 4 days ago • 108 • 4
ClawBench: Can AI Agents Complete Everyday Online Tasks? • Paper • 2604.08523 • Published 4 days ago • 240 • 5
Embarrassingly Simple Self-Distillation Improves Code Generation • Paper • 2604.01193 • Published 11 days ago • 34 • 6
GBQA: A Game Benchmark for Evaluating LLMs as Quality Assurance Engineers • Paper • 2604.02648 • Published 10 days ago • 43 • 3
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver • Paper • 2604.08377 • Published 4 days ago • 256 • 6
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents • Paper • 2604.07430 • Published 5 days ago • 151 • 4
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding • Paper • 2604.05015 • Published 7 days ago • 227 • 8
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published 11 days ago • 459 • 6
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents • Paper • 2604.06132 • Published 6 days ago • 110 • 5