image2image
layer-decomposition
boomcheng committed on
Commit cd6b74b · verified · 1 Parent(s): b434ba0

Upload RevealLayer model weights
.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/demo1.png filter=lfs diff=lfs merge=lfs -text
+ assets/framework.png filter=lfs diff=lfs merge=lfs -text
+ assets/pipeline.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,248 @@
<div align="center">

<div style="text-align: center;">
<img src="./assets/logo.png" alt="RevealLayer Logo" style="height: 96px;">
<h2>Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition</h2>
</div>

<div>
<strong>
Binhao Wang<sup>1,2,*</sup>,&nbsp;
Shihao Zhao<sup>1,2,*</sup>,&nbsp;
Bo Cheng<sup>2,*,†</sup>,&nbsp;
Qiuyu Ji<sup>1,2</sup>,&nbsp;
Yuhang Ma<sup>2</sup>,<br>
Liebucha Wu<sup>2</sup>,&nbsp;
Shanyuan Liu<sup>2</sup>,&nbsp;
Dawei Leng<sup>2,‡</sup>,&nbsp;
Yuhui Yin<sup>2</sup>
</strong>
</div>

<div>
<sup>1</sup>Wenzhou University&nbsp;&nbsp;&nbsp;
<sup>2</sup>360 AI Research
</div>

<div>
<sup>*</sup> Equal Contribution. &nbsp;
<sup>†</sup> Project Lead. &nbsp;
<sup>‡</sup> Corresponding Author.
</div>

<br>

<div>
<a href="https://zhao0100.github.io/RevealLayer/" target="_blank">
<img src="https://img.shields.io/static/v1?label=Project%20Page&message=Github&color=blue&logo=github-pages">
</a>
&ensp;
<a href="TODO_ARXIV_LINK" target="_blank">
<img src="https://img.shields.io/static/v1?label=Paper&message=arXiv&color=red&logo=arxiv">
</a>
&ensp;
<a href="TODO_DATASET_LINK" target="_blank">
<img src="https://img.shields.io/static/v1?label=Dataset&message=RevealLayer&color=green">
</a>
&ensp;
<a href="TODO_MODEL_LINK" target="_blank">
<img src="https://img.shields.io/static/v1?label=Model&message=HuggingFace&color=yellow">
</a>
</div>

<br>

<strong>
RevealLayer decomposes an RGB image into multiple RGBA layers, enabling precise layer separation and reliable recovery of occluded content in natural scenes.
</strong>

<br><br>

<div style="width: 100%; text-align: center; margin: auto;">
<img style="width:100%" src="assets/demo1.png" alt="RevealLayer teaser">
</div>

For more visual results, check out our <a href="https://zhao0100.github.io/RevealLayer/" target="_blank">project page</a>.

---

</div>

## ⭐ Update

- **[Coming Soon]** We will release the RevealLayer checkpoint and datasets.
- **[Coming Soon]** We will release the paper and inference code.

### ✅ TODO

- [ ] Release models and datasets.
- [ ] Release inference code and demo examples.

---

## 🎃 Overview

RevealLayer focuses on occlusion-aware image layer decomposition, recovering visible and hidden RGBA layers from a single RGB image with region guidance.

<div style="width: 100%; text-align: center; margin: auto;">
<img style="width:100%" src="assets/framework.png" alt="RevealLayer framework">
</div>

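A useful way to think about the decomposition is that it should be invertible: compositing the predicted RGBA layers back-to-front over the background should reproduce the input RGB image. The sketch below illustrates that recomposition with the standard "over" alpha-compositing operator in NumPy; it is an illustration of the layer model, not code from this repository, and the function names are our own.

```python
import numpy as np

def composite_over(base_rgb: np.ndarray, layer_rgba: np.ndarray) -> np.ndarray:
    """Composite one RGBA layer over an RGB base with the 'over' operator."""
    alpha = layer_rgba[..., 3:4].astype(np.float64) / 255.0
    rgb = layer_rgba[..., :3].astype(np.float64)
    out = rgb * alpha + base_rgb.astype(np.float64) * (1.0 - alpha)
    return out.round().astype(np.uint8)

def recompose(background: np.ndarray, layers: list[np.ndarray]) -> np.ndarray:
    """Stack RGBA layers back-to-front over the RGB background layer."""
    image = background
    for layer in layers:
        image = composite_over(image, layer)
    return image
```

Fully opaque pixels (alpha = 255) replace the background; partially transparent pixels blend, which is why transparent materials are among the hard cases the benchmark targets.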
---

## 📷 Datasets

<div style="width: 100%; text-align: center; margin: auto;">
<img style="width:100%" src="assets/pipeline.png" alt="RevealLayer dataset pipeline">
</div>

We construct a large-scale multi-layer image decomposition dataset, comprising **RevealLayer-100K** for training and **RevealLayerBench** for evaluation. RevealLayer-100K contains 100K multi-layer natural-image tuples with RGB images, background layers, RGBA foreground layers, and bounding boxes. RevealLayerBench contains 200 high-quality, manually curated images covering challenging cases such as complex occlusions, large-area objects, transparent materials, small foreground objects, and multi-layer scenes.

🔥 We will release **RevealLayer-100K** and **RevealLayerBench** on [Hugging Face](TODO_DATASET_LINK). We hope they serve as useful training and evaluation resources for future research on occlusion-aware image layer decomposition.

> 🚩 The datasets are intended for research use. Please follow the license and terms provided with the released dataset.

---

## 🔧 Quick Start

### 0. Experimental environment

We tested the inference code with Python 3.10 and CUDA GPUs.

### 1. Set up the repository and environment

```bash
git clone https://github.com/Zhao0100/RevealLayer.git
cd RevealLayer

conda create -n reveallayer python=3.10
conda activate reveallayer

pip install -r requirements.txt

pip install flash-attn --no-build-isolation

cd diffusers
pip install .
cd ..
```

---

## 📦 Prepare the models

Model files are hosted with Git LFS, so please enable Git LFS before cloning the model repositories:

```bash
git lfs install
```

Download the RevealLayer checkpoint:

```bash
git clone https://huggingface.co/qihoo360/RevealLayer models/RevealLayer
```

Download FLUX.1-dev:

```bash
git clone https://huggingface.co/black-forest-labs/FLUX.1-dev models/FLUX.1-dev
```

The expected model directory structure is:

```text
models
├── RevealLayer
│   ├── pytorch_lora_weights.safetensors
│   ├── layer_pe.pt
│   ├── Refiner.pt
│   ├── xvae
│   │   └── transparent_decoder_ckpt.pth
│   └── ...
├── FLUX.1-dev
│   ├── transformer
│   ├── vae
│   ├── text_encoder
│   ├── text_encoder_2
│   ├── tokenizer
│   ├── tokenizer_2
│   └── ...
```

If your local model directory differs, adjust the corresponding paths in the inference script.

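As a quick sanity check before running inference, a short script can confirm that the files from the layout above are in place. This helper is illustrative only (it is not part of the released code); the paths mirror the directory tree shown above.

```python
from pathlib import Path

# Expected layout, mirroring the directory tree above; adjust if yours differs.
EXPECTED_FILES = [
    "RevealLayer/pytorch_lora_weights.safetensors",
    "RevealLayer/layer_pe.pt",
    "RevealLayer/Refiner.pt",
    "RevealLayer/xvae/transparent_decoder_ckpt.pth",
]
EXPECTED_DIRS = [
    "FLUX.1-dev/transformer",
    "FLUX.1-dev/vae",
    "FLUX.1-dev/text_encoder",
    "FLUX.1-dev/text_encoder_2",
    "FLUX.1-dev/tokenizer",
    "FLUX.1-dev/tokenizer_2",
]

def check_models(root: str = "models") -> list[str]:
    """Return the expected files/directories missing under `root`."""
    base = Path(root)
    missing = [p for p in EXPECTED_FILES if not (base / p).is_file()]
    missing += [p for p in EXPECTED_DIRS if not (base / p).is_dir()]
    return missing
```

An empty return value means the layout matches; otherwise the returned paths tell you which download step to repeat.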
---

## 🗂️ Prepare input JSON

The input JSON should contain a list of samples. Each sample should include the input image path and the detected bounding boxes.

Example:

```json
[
  {
    "imgid": "examples",
    "full_image": "RevealLayer-Bench/examples/full_image.png",
    "background": "RevealLayer-Bench/examples/background.png",
    "LayerInfoRaw": [
      "RevealLayer-Bench/examples/layer_0.png",
      "RevealLayer-Bench/examples/layer_1.png"
    ],
    "detections": [
      {
        "bbox": [x1, y1, x2, y2]
      },
      {
        "bbox": [x1, y1, x2, y2]
      }
    ]
  }
]
```

The expected fields are:

```text
imgid        : sample id
full_image   : path to the input RGB image
background   : path to the background image (optional for inference)
LayerInfoRaw : paths to the ground-truth RGBA layers (optional for inference)
detections   : detected foreground objects
  bbox       : bounding box in [x1, y1, x2, y2] format
```

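Input files in this format can also be generated programmatically. A small helper sketch (hypothetical, not part of the released code; field names are taken from the example above, and the paths and boxes are placeholders):

```python
import json

def make_sample(imgid, full_image, bboxes, background=None, layers=None):
    """Build one input-JSON sample in the format described above.

    `background` and `LayerInfoRaw` are ground-truth fields and
    optional for inference; `detections` carries one bbox per object.
    """
    sample = {
        "imgid": imgid,
        "full_image": full_image,
        "detections": [{"bbox": list(b)} for b in bboxes],
    }
    if background is not None:
        sample["background"] = background
    if layers is not None:
        sample["LayerInfoRaw"] = list(layers)
    return sample

# Illustrative paths and [x1, y1, x2, y2] boxes:
samples = [make_sample("examples",
                       "RevealLayer-Bench/examples/full_image.png",
                       bboxes=[(10, 20, 200, 240), (50, 60, 180, 210)])]
with open("input.json", "w") as f:
    json.dump(samples, f, indent=2)
```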
---

## ⚡ Inference

Run inference with:

```bash
bash infer.sh 0
```

Before running, please make sure the paths in `infer.sh` and `infer_new.py` match your local model and data directories.

---

## 📑 Citation

If you find our work useful for your research, please consider citing:

```bibtex
@inproceedings{wang2026reveallayer,
  title={RevealLayer: Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition},
  author={Wang, Binhao and Zhao, Shihao and Cheng, Bo and Ji, Qiuyu and Ma, Yuhang and Wu, Liebucha and Liu, Shanyuan and Leng, Dawei and Yin, Yuhui},
  booktitle={International Conference on Machine Learning},
  year={2026}
}
```

---

## 📝 License

This project is licensed under the [Apache License 2.0](LICENSE).
Refiner.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:752aff6cb62a44e35098b7e21eabf93744eb204e4ebb6d9615e3d1daffe40d3c
+ size 56997104
assets/demo1.png ADDED

Git LFS Details

  • SHA256: 048e4ca6a2855870254286709ea832bb1add51280e1ead74827194cd948e444f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.01 MB
assets/framework.png ADDED

Git LFS Details

  • SHA256: 84598f31f2d1f2fcba280dd0b721fa5078d634bd69f96a4e813b4a66c2d6d438
  • Pointer size: 131 Bytes
  • Size of remote file: 390 kB
assets/logo.png ADDED
assets/pipeline.png ADDED

Git LFS Details

  • SHA256: 3cba52c7d2e1722b991a52fea98179149b329d7ba4e3273fd49a30689c7ac08f
  • Pointer size: 131 Bytes
  • Size of remote file: 210 kB
layer_pe.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c3144b547bf8dad8246ee0070c4ea37b7d12cf5aa54f0aacae45ba3cc12cf6f0
+ size 75312
pytorch_lora_weights.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f5069b954306f882d384c059fbc07b931d2d4d5c4d6d4d16e8d888f8c512bc7c
+ size 298933416
xvae/transparent_decoder_ckpt.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:44653f514096dedf906354310d21ae9a62c812d79ac69f8dc4e6d7b8575ee8c3
+ size 341128512