MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding Paper • 2603.22458 • Published 25 days ago • 135
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG Paper • 2603.23497 • Published 24 days ago • 91
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents Paper • 2603.24329 • Published 23 days ago • 28
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published about 1 month ago • 138
InCoder-32B: Code Foundation Model for Industrial Scenarios Paper • 2603.16790 • Published about 1 month ago • 308
WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation Paper • 2603.16871 • Published about 1 month ago • 60
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published Mar 16 • 153
Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion Paper • 2603.06577 • Published Mar 6 • 49
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published Feb 27 • 98
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens Paper • 2603.02138 • Published Mar 2 • 151
EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents Paper • 2602.23205 • Published Feb 26 • 11
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference Paper • 2602.21548 • Published Feb 25 • 50