Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published 10 days ago • 114
AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents Paper • 2604.02947 • Published 14 days ago • 19
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization Paper • 2604.02268 • Published 15 days ago • 93
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models Paper • 2603.26164 • Published 21 days ago • 353
UniDriveVLA: Unifying Understanding, Perception, and Action Planning for Autonomous Driving Paper • 2604.02190 • Published 15 days ago • 28
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward Paper • 2603.26599 • Published 21 days ago • 63
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published 18 days ago • 57
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published 22 days ago • 155
Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale Paper • 2603.25040 • Published 22 days ago • 131
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation Paper • 2603.19039 • Published 29 days ago • 50
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Paper • 2603.24533 • Published 23 days ago • 47
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 29 days ago • 66
3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model Paper • 2603.18524 • Published 29 days ago • 58
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published about 1 month ago • 138
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published Mar 16 • 185
ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning Paper • 2603.10160 • Published Mar 10 • 26