Upload RevealLayer model weights

Files changed (10)

.gitattributes +3 -0
README.md +248 -0
Refiner.pt +3 -0
assets/demo1.png +3 -0
assets/framework.png +3 -0
assets/logo.png +0 -0
assets/pipeline.png +3 -0
layer_pe.pt +3 -0
pytorch_lora_weights.safetensors +3 -0
xvae/transparent_decoder_ckpt.pth +3 -0

.gitattributes CHANGED

@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+assets/demo1.png filter=lfs diff=lfs merge=lfs -text
+assets/framework.png filter=lfs diff=lfs merge=lfs -text
+assets/pipeline.png filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,248 @@

+<div align="center">
+<div style="text-align: center;">
+    <img src="./assets/logo.png" alt="RevealLayer Logo" style="height: 96px;">
+    <h2>Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition</h2>
+</div>
+<div>
+  <strong>
+  Binhao Wang<sup>1,2,*</sup>,&nbsp;
+  Shihao Zhao<sup>1,2,*</sup>,&nbsp;
+  Bo Cheng<sup>2,*,†</sup>,&nbsp;
+  Qiuyu Ji<sup>1,2</sup>,&nbsp;
+  Yuhang Ma<sup>2</sup>,<br>
+  Liebucha Wu<sup>2</sup>,&nbsp;
+  Shanyuan Liu<sup>2</sup>,&nbsp;
+  Dawei Leng<sup>2,‡</sup>,&nbsp;
+  Yuhui Yin<sup>2</sup>
+  </strong>
+</div>
+<div>
+  <sup>1</sup>Wenzhou University&nbsp;&nbsp;&nbsp;
+  <sup>2</sup>360 AI Research
+</div>
+<div>
+  <sup>*</sup> Equal Contribution. &nbsp;
+  <sup>†</sup> Project Lead. &nbsp;
+  <sup>‡</sup> Corresponding Author.
+</div>
+<br>
+<div>
+  <a href="https://zhao0100.github.io/RevealLayer/" target="_blank">
+    <img src="https://img.shields.io/static/v1?label=Project%20Page&message=Github&color=blue&logo=github-pages">
+  </a>
+  &ensp;
+  <a href="TODO_ARXIV_LINK" target="_blank">
+    <img src="https://img.shields.io/static/v1?label=Paper&message=arXiv&color=red&logo=arxiv">
+  </a>
+  &ensp;
+  <a href="TODO_DATASET_LINK" target="_blank">
+    <img src="https://img.shields.io/static/v1?label=Dataset&message=RevealLayer&color=green">
+  </a>
+  &ensp;
+  <a href="TODO_MODEL_LINK" target="_blank">
+    <img src="https://img.shields.io/static/v1?label=Model&message=HuggingFace&color=yellow">
+  </a>
+</div>
+<br>
+<strong>
+RevealLayer decomposes an RGB image into multiple RGBA layers, enabling precise layer separation and reliable recovery of occluded content in natural scenes.
+</strong>
+<br><br>
+<div style="width: 100%; text-align: center; margin: auto;">
+    <img style="width:100%" src="assets/demo1.png" alt="RevealLayer teaser">
+</div>
+For more visual results, check out our <a href="https://zhao0100.github.io/RevealLayer/" target="_blank">project page</a>.
+---
+</div>
+## ⭐ Update
+- **[Coming Soon]** We will release the RevealLayer checkpoint and datasets.
+- **[Coming Soon]** We will release the paper and inference code.
+### ✅ TODO
+- [ ] Release models and datasets.
+- [ ] Release inference code and demo examples.
+---
+## 🎃 Overview
+RevealLayer focuses on occlusion-aware image layer decomposition, recovering visible and hidden RGBA layers from a single RGB image with region guidance.
+<div style="width: 100%; text-align: center; margin: auto;">
+    <img style="width:100%" src="assets/framework.png" alt="RevealLayer framework">
+</div>
+---
+## 📷 Datasets
+<div style="width: 100%; text-align: center; margin: auto;">
+    <img style="width:100%" src="assets/pipeline.png" alt="RevealLayer dataset pipeline">
+</div>
+We construct a large-scale multi-layer image decomposition dataset, including **RevealLayer-100K** for training and **RevealLayerBench** for evaluation. RevealLayer-100K contains 100K multi-layer natural image tuples with RGB images, background layers, RGBA foreground layers, and bounding boxes. RevealLayerBench contains 200 high-quality manually curated images, covering challenging cases such as complex occlusions, large-area objects, transparent materials, small foreground objects, and multi-layer scenes.
+🔥 We will release **RevealLayer-100K** and **RevealLayerBench** on [Hugging Face](TODO_DATASET_LINK). We hope they can serve as useful training and evaluation resources for future research on occlusion-aware image layer decomposition.
+> 🚩 The datasets are intended for research use. Please follow the license and terms provided with the released dataset.
+---
+## 🔧 Quick Start
+### 0. Experimental environment
+We tested our inference code with Python 3.10 and CUDA GPUs.
+### 1. Setup repository and environment
+```bash
+git clone https://github.com/Zhao0100/RevealLayer.git
+cd RevealLayer
+conda create -n reveallayer python=3.10
+conda activate reveallayer
+pip install -r requirements.txt
+pip install flash-attn --no-build-isolation
+cd diffusers
+pip install .
+cd ..
+```
+---
+## 📦 Prepare the models
+Model files are hosted with Git LFS, so please enable Git LFS before cloning model repositories.
+```bash
+git lfs install
+```
+Download the RevealLayer checkpoint:
+```bash
+git clone https://huggingface.co/qihoo360/RevealLayer models/RevealLayer
+```
+Download FLUX.1-dev:
+```bash
+git clone https://huggingface.co/black-forest-labs/FLUX.1-dev models/FLUX.1-dev
+```
+The expected model directory structure is:
+```text
+models
+├── RevealLayer
+│   ├── pytorch_lora_weights.safetensors
+│   ├── layer_pe.pt
+│   ├── Refiner.pt
+│   ├── xvae
+│   │   └── transparent_decoder_ckpt.pth
+│   └── ...
+├── FLUX.1-dev
+│   ├── transformer
+│   ├── vae
+│   ├── text_encoder
+│   ├── text_encoder_2
+│   ├── tokenizer
+│   ├── tokenizer_2
+│   └── ...
+```
+If your local model directory is different, please modify the corresponding paths in the inference script.
+---
+## 🗂️ Prepare input JSON
+The input JSON should contain a list of samples. Each sample should include the input image path and detected bounding boxes.
+Example:
+```json
+[
+  {
+    "imgid": "examples",
+    "full_image": "RevealLayer-Bench/examples/full_image.png",
+    "background": "RevealLayer-Bench/examples/background.png",
+    "LayerInfoRaw": [
+      "RevealLayer-Bench/examples/layer_0.png",
+      "RevealLayer-Bench/examples/layer_1.png"
+    ],
+    "detections": [
+      {
+        "bbox": [x1, y1, x2, y2]
+      },
+      {
+        "bbox": [x1, y1, x2, y2]
+      }
+    ]
+  }
+]
+```
+The expected fields are:
+```text
+imgid        : sample id
+full_image   : path to the input RGB image
+background   : path to the background image, optional for inference
+LayerInfoRaw : paths to the ground-truth RGBA layers, optional for inference
+detections   : detected foreground objects
+bbox         : bounding box in [x1, y1, x2, y2] format
+```
+---
+## ⚡ Inference
+Run inference with:
+```bash
+bash infer.sh 0
+```
+Before running, please make sure the paths in `infer.sh` and `infer_new.py` match your local model and data directories.
+---
+## 📑 Citation
+If you find our work useful for your research, please consider citing:
+```bibtex
+@inproceedings{wang2026reveallayer,
+  title={RevealLayer: Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition},
+  author={Wang, Binhao and Zhao, Shihao and Cheng, Bo and Ji, Qiuyu and Ma, Yuhang and Wu, Liebucha and Liu, Shanyuan and Leng, Dawei and Yin, Yuhui},
+  booktitle={International Conference on Machine Learning},
+  year={2026}
+}
+```
+---
+## 📝 License
+This project is licensed under the [Apache License 2.0](LICENSE).
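
Note on the input JSON described in the README above: a list like this is easy to get subtly wrong, so a small pre-flight check can help before running inference. The following is a minimal sketch, not part of the RevealLayer code. The function name `validate_samples` is hypothetical, the bounding-box values are placeholders, and the check that `x1 < x2` and `y1 < y2` is an assumption (the README only states the `[x1, y1, x2, y2]` order, not the coordinate convention or units).

```python
import json


def validate_samples(samples):
    """Return a list of problems found in a list of README-style samples."""
    problems = []
    for i, sample in enumerate(samples):
        # background and LayerInfoRaw are optional for inference, so they are not required here.
        for key in ("imgid", "full_image", "detections"):
            if key not in sample:
                problems.append(f"sample {i}: missing '{key}'")
        for j, det in enumerate(sample.get("detections", [])):
            bbox = det.get("bbox")
            if not (isinstance(bbox, list) and len(bbox) == 4):
                problems.append(f"sample {i}, detection {j}: bbox must be [x1, y1, x2, y2]")
                continue
            x1, y1, x2, y2 = bbox
            # Assumption: corners are ordered top-left, then bottom-right.
            if not (x1 < x2 and y1 < y2):
                problems.append(f"sample {i}, detection {j}: expected x1 < x2 and y1 < y2")
    return problems


# Hypothetical sample: the coordinates are placeholders, not values from the dataset.
samples = [
    {
        "imgid": "examples",
        "full_image": "RevealLayer-Bench/examples/full_image.png",
        "detections": [{"bbox": [10, 20, 200, 240]}, {"bbox": [150, 60, 320, 300]}],
    }
]

problems = validate_samples(samples)
input_json = json.dumps(samples, indent=2)  # text that could be written to the input file
```

An empty `problems` list means the samples passed these basic checks. It does not confirm that the image paths exist or that the inference script accepts the file.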

Refiner.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:752aff6cb62a44e35098b7e21eabf93744eb204e4ebb6d9615e3d1daffe40d3c
+size 56997104
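
The three lines above are a Git LFS pointer: the `oid` is the SHA-256 of the real file and `size` is its length in bytes. As a rough sketch of how a downloaded file could be checked against such a pointer, the snippet below parses the pointer text and compares both values. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and only `sha256` oids are handled.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into (sha256_hex, size)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid type: {algo}")
    return digest, int(fields["size"])


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 from the pointer."""
    expected_digest, expected_size = parse_lfs_pointer(pointer_text)
    sha, size = hashlib.sha256(), 0
    with open(path, "rb") as f:
        # Read in chunks so large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
            size += len(chunk)
    return size == expected_size and sha.hexdigest() == expected_digest


# Pointer text for Refiner.pt, copied from this commit.
REFINER_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:752aff6cb62a44e35098b7e21eabf93744eb204e4ebb6d9615e3d1daffe40d3c
size 56997104
"""
```

With a local copy of the weights, `matches_pointer("models/RevealLayer/Refiner.pt", REFINER_POINTER)` would report whether the file matches; the path assumes the clone target used in the README.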

assets/demo1.png ADDED

Git LFS Details

SHA256: 048e4ca6a2855870254286709ea832bb1add51280e1ead74827194cd948e444f
Pointer size: 132 Bytes
Size of remote file: 1.01 MB

assets/framework.png ADDED

Git LFS Details

SHA256: 84598f31f2d1f2fcba280dd0b721fa5078d634bd69f96a4e813b4a66c2d6d438
Pointer size: 131 Bytes
Size of remote file: 390 kB

assets/logo.png ADDED

assets/pipeline.png ADDED

Git LFS Details

SHA256: 3cba52c7d2e1722b991a52fea98179149b329d7ba4e3273fd49a30689c7ac08f
Pointer size: 131 Bytes
Size of remote file: 210 kB

layer_pe.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c3144b547bf8dad8246ee0070c4ea37b7d12cf5aa54f0aacae45ba3cc12cf6f0
+size 75312

pytorch_lora_weights.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f5069b954306f882d384c059fbc07b931d2d4d5c4d6d4d16e8d888f8c512bc7c
+size 298933416

xvae/transparent_decoder_ckpt.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:44653f514096dedf906354310d21ae9a62c812d79ac69f8dc4e6d7b8575ee8c3
+size 341128512
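
If the repository is cloned without Git LFS enabled, the four weight files arrive as small pointer files rather than the real checkpoints. A cheap way to catch that is to compare on-disk sizes with the `size` values in the pointers of this commit. This is a minimal sketch under that assumption; `check_weights` is a hypothetical helper, and `models/RevealLayer` is the clone target used in the README.

```python
from pathlib import Path

# Sizes in bytes, copied from the Git LFS pointers in this commit.
EXPECTED_SIZES = {
    "Refiner.pt": 56997104,
    "layer_pe.pt": 75312,
    "pytorch_lora_weights.safetensors": 298933416,
    "xvae/transparent_decoder_ckpt.pth": 341128512,
}


def check_weights(model_dir, expected=EXPECTED_SIZES):
    """Return {relative_path: status} for each expected weight file."""
    report = {}
    for rel_path, size in expected.items():
        path = Path(model_dir) / rel_path
        if not path.is_file():
            report[rel_path] = "missing"
        elif path.stat().st_size != size:
            # A file of roughly a hundred bytes here is usually an un-fetched LFS pointer.
            report[rel_path] = f"size mismatch ({path.stat().st_size} bytes)"
        else:
            report[rel_path] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_weights("models/RevealLayer").items():
        print(f"{name}: {status}")
```

A size match is only a quick sanity check; comparing the SHA-256 against the pointer `oid` is the stronger test.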