WildDet3D: Scaling Promptable 3D Detection in the Wild • Paper • 2604.08626 • Published 7 days ago • 230
Repurposing Geometric Foundation Models for Multi-view Diffusion • Paper • 2603.22275 • Published 23 days ago • 47
Beyond Language Modeling: An Exploration of Multimodal Pretraining • Paper • 2603.03276 • Published Mar 3 • 103
DREAM: Where Visual Understanding Meets Text-to-Image Generation • Paper • 2603.02667 • Published Mar 3 • 6
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? • Paper • 2603.03241 • Published Mar 3 • 87