Collections
Collections including paper arxiv:2509.19296
- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 30
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 15
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23

- FlashWorld: High-quality 3D Scene Generation within Seconds
  Paper • 2510.13678 • Published • 74
- NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks
  Paper • 2510.15019 • Published • 65
- GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction
  Paper • 2509.18090 • Published • 5
- Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
  Paper • 2509.19296 • Published • 29

- PAD3R: Pose-Aware Dynamic 3D Reconstruction from Casual Videos
  Paper • 2509.25183 • Published • 3
- StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams
  Paper • 2506.08862 • Published • 6
- VoluMe -- Authentic 3D Video Calls from Live Gaussian Splat Prediction
  Paper • 2507.21311 • Published • 1
- SWiT-4D: Sliding-Window Transformer for Lossless and Parameter-Free Temporal 4D Generation
  Paper • 2512.10860 • Published • 1

- Test-Time Scaling with Reflective Generative Model
  Paper • 2507.01951 • Published • 108
- Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
  Paper • 2502.05171 • Published • 154
- Autoregressive Diffusion Models
  Paper • 2110.02037 • Published
- EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
  Paper • 2502.09509 • Published • 9

- TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
  Paper • 2401.09416 • Published • 11
- SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
  Paper • 2401.10171 • Published • 14
- DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
  Paper • 2311.09217 • Published • 22
- GALA: Generating Animatable Layered Assets from a Single Scan
  Paper • 2401.12979 • Published • 9