FileGram: Grounding Agent Personalization in File-System Behavioral Traces Paper • 2604.04901 • Published 10 days ago • 40
Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer Paper • 2603.19227 • Published 28 days ago • 42
HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions Paper • 2603.15612 • Published about 1 month ago • 152
ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors Paper • 2603.04338 • Published Mar 4 • 24
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding? Paper • 2603.03241 • Published Mar 3 • 87
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published Dec 22, 2025 • 67
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published Dec 15, 2025 • 76
SenseNova-SI Collection • Scaling Spatial Intelligence with Multimodal Foundation Models • 14 items • Updated about 3 hours ago • 16
Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published Nov 17, 2025 • 48
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image Paper • 2511.13648 • Published Nov 17, 2025 • 53
Phased DMD: Few-step Distribution Matching Distillation via Score Matching within Subintervals Paper • 2510.27684 • Published Oct 31, 2025 • 23
The Quest for Generalizable Motion Generation: Data, Model, and Evaluation Paper • 2510.26794 • Published Oct 30, 2025 • 27
Has GPT-5 Achieved Spatial Intelligence? An Empirical Study Paper • 2508.13142 • Published Aug 18, 2025 • 34
4DNeX: Feed-Forward 4D Generative Modeling Made Easy Paper • 2508.13154 • Published Aug 18, 2025 • 62