---
license: cc-by-nc-4.0
tags:
- diffusion
- inpainting
- multimodal
- autonomous-driving
- nuscenes
---

# 🐳 MObI: Multimodal Object Inpainting Using Diffusion Models

Pretrained weights for **MObI**, a diffusion-based model for joint multimodal object inpainting across camera and lidar, conditioned on a single reference image and a 3D bounding box.

📄 **Paper:** [arXiv:2501.03173](https://arxiv.org/abs/2501.03173)
💻 **Code:** [github.com/alexbuburuzan/MObI](https://github.com/alexbuburuzan/MObI)
**Venue:** CVPR Workshop on Data-Driven Autonomous Driving Simulation (DDADS), 2025

## Overview

MObI extends [Paint-by-Example](https://github.com/Fantasy-Studio/Paint-by-Example) to:
- Jointly inpaint **RGB camera, lidar depth, and lidar intensity**
- Insert objects from a **single reference image**
- Use **3D bounding box conditioning** for accurate spatial placement

This combines the realism of reference-based inpainting with the controllability of 3D-aware methods.

## Contents

| File | Description |
|------|-------------|
| `mobi_nuscenes_epoch28.ckpt` | MObI trained on nuScenes |
| `autoencoders/range_autoencoder.ckpt` | Range-view VAE for lidar |
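
The `.ckpt` files above can be inspected with plain PyTorch before wiring them into the training code. A minimal sketch, assuming the usual PyTorch Lightning layout where weights sit under a `"state_dict"` key (this helper is illustrative, not part of the official MObI API):

```python
import torch

def extract_state_dict(ckpt_path: str) -> dict:
    """Load a checkpoint on CPU and return its bare weight dict."""
    ckpt = torch.load(ckpt_path, map_location="cpu")
    # Lightning checkpoints wrap weights under "state_dict";
    # fall back to the object itself for plain torch.save files.
    return ckpt.get("state_dict", ckpt) if isinstance(ckpt, dict) else ckpt

# Example, assuming the file has been downloaded from this repo:
# weights = extract_state_dict("mobi_nuscenes_epoch28.ckpt")
# print(len(weights), "tensors")
```

Loading on CPU first avoids GPU out-of-memory surprises when you only want to list parameter names or check shapes.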

## Results (nuScenes)

| Reference Type | FID ↓ | LPIPS ↓ | CLIP ↑ | D-LPIPS ↓ | I-LPIPS ↓ |
|---------------|-------|---------|--------|-----------|-----------|
| id-ref | 6.503 | 0.114 | 84.9 | 0.130 | 0.147 |
| track-ref | 6.703 | 0.115 | 83.5 | 0.129 | 0.149 |
| in-domain-ref | 8.947 | 0.127 | 77.5 | 0.132 | 0.154 |
| cross-domain-ref | 9.046 | 0.130 | 76.0 | 0.132 | 0.153 |

## Usage

See the [GitHub repository](https://github.com/alexbuburuzan/MObI) for installation, data preprocessing, inference, and training instructions.

```bash
git clone https://github.com/alexbuburuzan/MObI.git
cd MObI
bash scripts/download_models.sh
bash scripts/realism_test_bench.sh
```

## Citation

```bibtex
@InProceedings{Buburuzan_2025_CVPR,
    author    = {Buburuzan, Alexandru and Sharma, Anuj and Redford, John and Dokania, Puneet K. and Mueller, Romain},
    title     = {MObI: Multimodal Object Inpainting Using Diffusion Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2025},
    pages     = {1999-2009}
}
```

## License

Released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/). Note that this work builds on Paint-by-Example and BEVFusion, which have their own licenses.