Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published 27 days ago • 36
Video Generation Leaderboard 📊 Text to Video and Image to Video Arena & Leaderboard • Running • 202
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion Paper • 2512.17504 • Published Dec 19, 2025 • 99
Vector Prism: Animating Vector Graphics by Stratifying Semantic Structure Paper • 2512.14336 • Published Dec 16, 2025 • 32
EgoX: Egocentric Video Generation from a Single Exocentric Video Paper • 2512.08269 • Published Dec 9, 2025 • 123