OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published 5 days ago • 68
view article Article `LeRobotDataset:v3.0`: Bringing large-scale datasets to `lerobot` +9 Sep 16, 2025 • 51
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning Paper • 2603.23483 • Published 24 days ago • 62
Manifold-Aware Exploration for Reinforcement Learning in Video Generation Paper • 2603.21872 • Published 25 days ago • 33
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published about 1 month ago • 109
view article Article Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines +2 Mar 5 • 51
view article Article Make your ZeroGPU Spaces go brrr with ahead-of-time compilation +2 Sep 2, 2025 • 77
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images Paper • 2603.02210 • Published Mar 2 • 29
Helios Collection Helios: 14B Real-Time Long Video Generation Model can be Cheaper, Faster but Keep Stronger than 1.3B ones • 7 items • Updated Mar 15 • 24
FSVideo: Fast Speed Video Diffusion Model in a Highly-Compressed Latent Space Paper • 2602.02092 • Published Feb 2 • 18
iFSQ: Improving FSQ for Image Generation with 1 Line of Code Paper • 2601.17124 • Published Jan 23 • 33
Focal Guidance: Unlocking Controllability from Semantic-Weak Layers in Video Diffusion Models Paper • 2601.07287 • Published Jan 12 • 6
Plan-X: Instruct Video Generation via Semantic Planning Paper • 2511.17986 • Published Nov 22, 2025 • 18
FlashI2V: Fourier-Guided Latent Shifting Prevents Conditional Image Leakage in Image-to-Video Generation Paper • 2509.25187 • Published Sep 29, 2025 • 2
Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback Paper • 2510.16888 • Published Oct 19, 2025 • 22
BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration Paper • 2510.00438 • Published Oct 1, 2025 • 10
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling Paper • 2509.12201 • Published Sep 15, 2025 • 107