Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence • Paper • 2603.07660 • Published Mar 8 • 86
CARE-Edit: Condition-Aware Routing of Experts for Contextual Image Editing • Paper • 2603.08589 • Published Mar 9 • 38
WildActor: Unconstrained Identity-Preserving Video Generation • Paper • 2603.00586 • Published Feb 28 • 38
VGGT-Det: Mining VGGT Internal Priors for Sensor-Geometry-Free Multi-View Indoor 3D Object Detection • Paper • 2603.00912 • Published Mar 1 • 40
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models • Paper • 2512.16561 • Published Dec 18, 2025 • 20
Are We Ready for RL in Text-to-3D Generation? A Progressive Investigation • Paper • 2512.10949 • Published Dec 11, 2025 • 47
FlashVGGT: Efficient and Scalable Visual Geometry Transformers with Compressed Descriptor Attention • Paper • 2512.01540 • Published Dec 1, 2025 • 5
One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control • Paper • 2511.18922 • Published Nov 24, 2025 • 13
Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos • Paper • 2510.18489 • Published Oct 21, 2025 • 6
CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving • Paper • 2510.07944 • Published Oct 9, 2025 • 25
Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction • Paper • 2510.04759 • Published Oct 6, 2025 • 10
DiGA3D: Coarse-to-Fine Diffusional Propagation of Geometry and Appearance for Versatile 3D Inpainting • Paper • 2507.00429 • Published Jul 1, 2025 • 1
HyRF: Hybrid Radiance Fields for Memory-efficient and High-quality Novel View Synthesis • Paper • 2509.17083 • Published Sep 21, 2025 • 8
From One to More: Contextual Part Latents for 3D Generation • Paper • 2507.08772 • Published Jul 11, 2025 • 26
Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning • Paper • 2507.21049 • Published Jul 28, 2025 • 41
Taming LLMs by Scaling Learning Rates with Gradient Grouping • Paper • 2506.01049 • Published Jun 1, 2025 • 39
Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields • Paper • 2505.02005 • Published May 4, 2025 • 3
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation • Paper • 2504.02542 • Published Apr 3, 2025 • 52