Submitted by taesiri · 233 · Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation · 25 authors · 741 · 7
Submitted by taesiri · 78 · Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks · 11 authors · 1 · 4
Submitted by ambud26 · 59 · What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity · AI at Meta · 3
Submitted by taesiri · 44 · VisPlay: Self-Evolving Vision-Language Models from Images · University of Illinois at Urbana-Champaign · 53 · 3
Submitted by hangyulmd · 26 · Instruction-Guided Lesion Segmentation for Chest X-rays with Automatically Generated Large-Scale Dataset · KAIST AI · 2 · 1
Submitted by Jevin754 · 18 · ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries · ARC Lab, Tencent PCG · 37 · 2
Submitted by Hao-Zhe · 10 · Mixture of States: Routing Token-Level Dynamics for Multimodal Generation · AI at Meta · 2
Submitted by doraemonILoveYou · 7 · FreeAskWorld: An Interactive and Closed-Loop Simulator for Human-Centric Embodied AI · 9 authors · 219 · 2
Submitted by dorienh · 3 · Aligning Generative Music AI with Human Preferences: Methods and Challenges · 2 authors · 2