WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG Paper • 2603.23497 • Published 23 days ago • 91
Grounding World Simulation Models in a Real-World Metropolis Paper • 2603.15583 • Published Mar 16 • 153
Mode Seeking meets Mean Seeking for Fast Long Video Generation Paper • 2602.24289 • Published Feb 27 • 41
Reliable and Responsible Foundation Models: A Comprehensive Survey Paper • 2602.08145 • Published Feb 4 • 8
Inference-time Physics Alignment of Video Generative Models with Latent World Models Paper • 2601.10553 • Published Jan 15 • 13
EasyV2V: A High-quality Instruction-based Video Editing Framework Paper • 2512.16920 • Published Dec 18, 2025 • 18
OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory Paper • 2512.07802 • Published Dec 8, 2025 • 46
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing Paper • 2512.06065 • Published Dec 5, 2025 • 29
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13, 2025 • 170
Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training Paper • 2509.26625 • Published Sep 30, 2025 • 43
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning Paper • 2509.07980 • Published Sep 9, 2025 • 105
Qwen Image Edit • Paused • Agents • Featured • 823 • Edit and enhance images based on descriptive instructions
facebook/dinov3-vitb16-pretrain-lvd1689m Image Feature Extraction • 85.7M • Updated Aug 19, 2025 • 1.57M • 116