	---
	license: cc-by-nc-4.0
	tags:
	- diffusion
	- inpainting
	- multimodal
	- autonomous-driving
	- nuscenes
	---

	# 🐳 MObI: Multimodal Object Inpainting Using Diffusion Models

	Pretrained weights for MObI, a diffusion-based model for joint multimodal object inpainting across camera and lidar, conditioned on a single reference image and a 3D bounding box.

	- 📄 Paper: [arXiv:2501.03173](https://arxiv.org/abs/2501.03173)
	- 💻 Code: [github.com/alexbuburuzan/MObI](https://github.com/alexbuburuzan/MObI)
	- Venue: CVPR Workshop on Data-Driven Autonomous Driving Simulation (DDADS), 2025

	## Overview

	MObI extends [Paint-by-Example](https://github.com/Fantasy-Studio/Paint-by-Example) to:
	- Jointly inpaint RGB camera, lidar depth, and lidar intensity
	- Insert objects from a single reference image
	- Use 3D bounding box conditioning for accurate spatial placement

	Together, these capabilities combine the realism of reference-based inpainting with the controllability of 3D-aware methods.
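
To make the lidar depth and intensity channels concrete, here is a minimal sketch of a spherical range-view projection, the standard way to rasterise a lidar sweep into an image-like grid. It is an illustration under stated assumptions, not MObI's preprocessing: the function name `to_range_view`, the 32 x 1024 grid, and the +10° / -30° vertical field of view are all hypothetical choices, and the repository's own encoding may differ.

```python
import math


def to_range_view(points, height=32, width=1024,
                  fov_up_deg=10.0, fov_down_deg=-30.0):
    """Project (x, y, z, intensity) lidar points onto a range-view grid.

    Returns two height x width nested lists: depth (same unit as the input
    coordinates) and intensity. Empty pixels hold 0.0. Grid size and field
    of view are illustrative defaults, not values taken from MObI.
    """
    depth = [[0.0] * width for _ in range(height)]
    intensity = [[0.0] * width for _ in range(height)]

    for x, y, z, value in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # skip degenerate returns at the sensor origin
        azimuth = math.atan2(y, x)                    # radians, in [-pi, pi]
        elevation_deg = math.degrees(math.asin(z / r))
        if not fov_down_deg <= elevation_deg <= fov_up_deg:
            continue  # outside the vertical field of view

        # Azimuth maps to columns, elevation maps to rows (top row = fov_up).
        col = int(0.5 * (1.0 - azimuth / math.pi) * width) % width
        row_frac = (fov_up_deg - elevation_deg) / (fov_up_deg - fov_down_deg)
        row = min(int(row_frac * height), height - 1)

        # Keep the closest return when several points share a pixel.
        if depth[row][col] == 0.0 or r < depth[row][col]:
            depth[row][col] = r
            intensity[row][col] = value
    return depth, intensity
```

Both output grids share the same pixel layout, which is what allows depth and intensity to be treated as aligned image channels.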

	## Contents

	| File | Description |
	|------|-------------|
	| `mobi_nuscenes_epoch28.ckpt` | MObI trained on nuScenes |
	| `autoencoders/range_autoencoder.ckpt` | Range-view VAE for lidar |
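
As an alternative to the repository's download script, the two files listed above could be fetched directly with `huggingface_hub`. This is a hedged sketch: the repository id `alexbuburuzan/MObI` is read from the model page, while the helper `fetch_checkpoints` and the `checkpoints` target directory are hypothetical, and the MObI code may expect the files at a different location.

```python
from pathlib import Path

# Repository id as shown on the model page; file names from the Contents table.
REPO_ID = "alexbuburuzan/MObI"
CHECKPOINT_FILES = [
    "mobi_nuscenes_epoch28.ckpt",
    "autoencoders/range_autoencoder.ckpt",
]


def fetch_checkpoints(local_dir="checkpoints", download=None):
    """Download each checkpoint and return a {filename: local path} mapping.

    `download` is any callable accepting the `repo_id`, `filename`, and
    `local_dir` keywords of `huggingface_hub.hf_hub_download`; it defaults
    to that function. `local_dir` is a hypothetical target directory.
    """
    if download is None:
        # Imported lazily so this module loads without huggingface_hub installed.
        from huggingface_hub import hf_hub_download as download

    paths = {}
    for filename in CHECKPOINT_FILES:
        paths[filename] = Path(
            download(repo_id=REPO_ID, filename=filename, local_dir=local_dir)
        )
    return paths
```

Calling `fetch_checkpoints()` with no arguments requires network access and `pip install huggingface_hub`. The injectable `download` argument is there only so the helper can be exercised offline.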

	## Results (nuScenes)

	| Reference Type | FID ↓ | LPIPS ↓ | CLIP ↑ | D-LPIPS ↓ | I-LPIPS ↓ |
	|----------------|-------|---------|--------|-----------|-----------|
	| id-ref | 6.503 | 0.114 | 84.9 | 0.130 | 0.147 |
	| track-ref | 6.703 | 0.115 | 83.5 | 0.129 | 0.149 |
	| in-domain-ref | 8.947 | 0.127 | 77.5 | 0.132 | 0.154 |
	| cross-domain-ref | 9.046 | 0.130 | 76.0 | 0.132 | 0.153 |
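
One way to read the table is to express each row as a percentage change relative to `id-ref`. The sketch below does only that arithmetic on the values copied from the table; the helper name `relative_change` is introduced here for illustration and is not part of the MObI code.

```python
# Metric values copied from the nuScenes results table.
RESULTS = {
    "id-ref": {"FID": 6.503, "LPIPS": 0.114, "CLIP": 84.9,
               "D-LPIPS": 0.130, "I-LPIPS": 0.147},
    "track-ref": {"FID": 6.703, "LPIPS": 0.115, "CLIP": 83.5,
                  "D-LPIPS": 0.129, "I-LPIPS": 0.149},
    "in-domain-ref": {"FID": 8.947, "LPIPS": 0.127, "CLIP": 77.5,
                      "D-LPIPS": 0.132, "I-LPIPS": 0.154},
    "cross-domain-ref": {"FID": 9.046, "LPIPS": 0.130, "CLIP": 76.0,
                         "D-LPIPS": 0.132, "I-LPIPS": 0.153},
}


def relative_change(baseline, other):
    """Percentage change of each metric when moving from `baseline` to `other`."""
    base, new = RESULTS[baseline], RESULTS[other]
    return {metric: 100.0 * (new[metric] - base[metric]) / base[metric]
            for metric in base}


for metric, pct in relative_change("id-ref", "cross-domain-ref").items():
    print(f"{metric:8s} {pct:+6.1f}%")
```

On these numbers, moving from `id-ref` to `cross-domain-ref` raises FID by roughly 39% and lowers the CLIP score by roughly 10%, while D-LPIPS changes by under 2%.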

	## Usage

	See the [GitHub repository](https://github.com/alexbuburuzan/MObI) for installation, data preprocessing, inference, and training instructions.

	```bash
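	# Clone the repository and enter it
	git clone https://github.com/alexbuburuzan/MObI.git
	cd MObI
	# Download the pretrained checkpoints
	bash scripts/download_models.sh
	# Run the realism test bench
	bash scripts/realism_test_bench.sh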
	```

	## Citation

	```bibtex
	@InProceedings{Buburuzan_2025_CVPR,
	  author = {Buburuzan, Alexandru and Sharma, Anuj and Redford, John and Dokania, Puneet K. and Mueller, Romain},
	  title = {MObI: Multimodal Object Inpainting Using Diffusion Models},
	  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
	  month = {June},
	  year = {2025},
	  pages = {1999-2009}
	}
	```

	## License

	Released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/). Note that this work builds on Paint-by-Example and BEVFusion, which have their own licenses.