ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models Paper • 2405.13729 • Published 8 days ago • 10
Map2World: Segment Map Conditioned Text to 3D World Generation Paper • 2605.00781 • Published 6 days ago • 24
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published 28 days ago • 245
VLS: Steering Pretrained Robot Policies via Vision-Language Models Paper • 2602.03973 • Published Feb 3 • 22
HeartMuLa: A Family of Open Sourced Music Foundation Models Paper • 2601.10547 • Published Jan 15 • 48
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published Nov 27, 2025 • 245
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14, 2025 • 157
Introducing Waypoint-1: Real-time interactive video diffusion from Overworld Article • Jan 20 • 43