DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper • 2604.14683 • Published 1 day ago • 20
HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System Paper • 2604.14125 • Published 3 days ago • 16
RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework Paper • 2604.15308 • Published 1 day ago • 20
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 3 days ago • 57
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published 5 days ago • 98
SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments Paper • 2604.14144 • Published 3 days ago • 60
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 3 days ago • 129
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models Paper • 2604.10866 • Published 5 days ago • 56
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published 10 days ago • 107
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published 5 days ago • 133
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Paper • 2604.12627 • Published 4 days ago • 95
SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks Paper • 2604.08865 • Published 8 days ago • 28
Toward Autonomous Long-Horizon Engineering for ML Research Paper • 2604.13018 • Published 4 days ago • 30
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 4 days ago • 75
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published 5 days ago • 68
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation Paper • 2604.10098 • Published 7 days ago • 74
The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping Paper • 2604.11297 • Published 5 days ago • 134
QuanBench+: A Unified Multi-Framework Benchmark for LLM-Based Quantum Code Generation Paper • 2604.08570 • Published 24 days ago • 121
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 10 days ago • 91