---
license: openrail
language:
- en
pipeline_tag: image-to-image
datasets:
- mvp18/gscenes_pretrain
tags:
- diffusion
- image-to-image
- image-to-3d
- 3d-reconstruction
- gaussian-splatting
- pose-free
- sparse-view
- rgbd
base_model:
- stabilityai/stable-diffusion-2
---

## Summary

This repository provides checkpoints used in the **Gaussian Scenes** pipeline for pose-free, sparse-view scene reconstruction. The weights are stored in Diffusers format and organized as two components:

- **UNet**: denoising backbone (Diffusers UNet) adapted for our pipeline.
- **VAE**: variational autoencoder used for latent encoding/decoding.

These checkpoints are intended for research use and model reproducibility.

## Usage

For a guide on how to use this model, check out the [official repository](https://github.com/gaussian-scenes).

## Citation

If you use these checkpoints in your work, please cite the associated paper:

```
@article{
paul2025gaussian,
title={Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors},
author={Soumava Paul and Prakhar Kaushik and Alan Yuille},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=yp1CYo6R0r},
note={}
}
```

The Hugging Face paper page can be found [here](https://huggingface.co/papers/2411.15966).
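As a supplement to the Usage section, the two components listed in the Summary could be loaded individually with Diffusers along the following lines. This is a minimal sketch, not the official loading code: the helper name `load_gaussian_scenes_components` is hypothetical, the subfolder names `unet` and `vae` are assumed from the usual Diffusers layout, and the use of `UNet2DConditionModel` and `AutoencoderKL` is assumed from the Stable Diffusion 2 base model. The official repository may rely on custom classes or a different layout, so treat its instructions as authoritative.

```python
def load_gaussian_scenes_components(repo_id, unet_subfolder="unet", vae_subfolder="vae"):
    """Load the UNet and VAE checkpoints from a Diffusers-format repository.

    repo_id is the Hugging Face model ID of this repository (supply your own).
    The subfolder defaults are assumptions based on the common Diffusers layout.
    """
    # Imported lazily so that defining this helper does not require diffusers.
    from diffusers import AutoencoderKL, UNet2DConditionModel

    # Assumption: the adapted UNet is loadable as a standard UNet2DConditionModel;
    # from_pretrained reads the stored config, including any changed channel counts.
    unet = UNet2DConditionModel.from_pretrained(repo_id, subfolder=unet_subfolder)

    # Assumption: the VAE follows the Stable Diffusion 2 AutoencoderKL format.
    vae = AutoencoderKL.from_pretrained(repo_id, subfolder=vae_subfolder)

    return unet, vae
```

Calling `load_gaussian_scenes_components("<this-repo-id>")` with the actual model ID downloads both components; the full reconstruction pipeline itself is provided by the official repository.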