garrying committed on
Commit e233680 · verified · 1 Parent(s): 2765045

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +72 -0
README.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ license: cc-by-nc-4.0
+ task_categories:
+ - image-segmentation
+ tags:
+ - glass-surface-detection
+ - rgb-d
+ - scene-understanding
+ - pytorch
+ pretty_name: RGBD-GSD-Net (RGB-D Glass Surface Detection Network)
+ ---
+
+ # RGBD-GSD-Net: RGB-D Glass Surface Detection Network
+
+ Pre-trained weights for the model introduced in:
+
+ > **Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection**
+ > Jiaying Lin\*, Yuen-Hei Yeung\*, Shuquan Ye, Rynson W. H. Lau
+ > AAAI 2025
+ > [arXiv](https://arxiv.org/abs/2206.11250) · [Project Page](https://jiaying.link/aaai2025-rgbdglass/) · [Dataset (RGBD-GSD)](https://huggingface.co/datasets/garrying/RGBD-GSD)
+
+ ## Model Summary
+
+ RGBD-GSD-Net detects glass surfaces by jointly processing RGB images and depth maps. It introduces two novel modules:
+
+ - **Cross-Modal Context Mining (CCM)**: adaptively learns individual and mutual context features from RGB and depth information.
+ - **Depth-Missing Aware Attention (DAA)**: explicitly exploits spatial locations where depth is missing (a strong indicator of glass surfaces) to guide detection.
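
The depth-missing cue that DAA relies on can be illustrated with a small sketch. This is not the paper's attention module, only a way to derive the kind of "where is depth missing" signal it describes. It assumes missing readings are stored as 0 or NaN, which is common for consumer depth sensors but is not stated in this README; `depth_missing_mask` and `invalid_value` are hypothetical names.

```python
import numpy as np


def depth_missing_mask(depth: np.ndarray, invalid_value: float = 0.0) -> np.ndarray:
    """Return a float mask that is 1 where the depth reading is missing.

    Assumption: missing depth is encoded as `invalid_value` (0 here) or NaN.
    """
    depth = np.asarray(depth, dtype=np.float32)
    missing = np.isnan(depth) | (depth == invalid_value)
    return missing.astype(np.float32)


# Toy 3x3 depth map; zeros and NaN mark pixels with no reading.
depth = np.array(
    [[1.2, 0.0, 0.0],
     [1.1, 0.0, np.nan],
     [1.0, 1.0, 0.9]],
)
mask = depth_missing_mask(depth)
missing_ratio = float(mask.mean())  # fraction of pixels without depth
```

A mask like this could be inspected alongside the RGB image to see how closely depth holes follow glass regions in a given scene.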
+
+ The backbone is a ResNeXt encoder shared across both modalities.
+
+ | File | Description |
+ |------|-------------|
+ | `best.pth` | Best checkpoint (204 MB), saved as `{'model': state_dict, ...}` |
+ | `results/our_best_results.zip` | Model predictions on the RGBD-GSD test set |
+
+ ## Loading the Weights
+
+ ```python
+ import torch
+ from networks.your_network import RGBDGlassNet # from the code release
+
+ model = RGBDGlassNet()
+ checkpoint = torch.load("best.pth", map_location="cpu")
+ model.load_state_dict(checkpoint["model"])
+ model.eval()
+ ```
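
If `load_state_dict` reports unexpected key names, a small helper along these lines may help. It is a sketch under assumptions: the `model` key comes from the table above, but whether `best.pth` carries a `module.` prefix (as `torch.nn.DataParallel` adds) is not stated here, and `extract_state_dict` plus the `epoch` entry in the stand-in checkpoint are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Mapping


def extract_state_dict(checkpoint: Mapping[str, Any], key: str = "model") -> dict:
    """Pull the weights out of a checkpoint and strip any `module.` prefix."""
    # Fall back to the checkpoint itself if it is already a bare state dict.
    state_dict = checkpoint[key] if key in checkpoint else checkpoint
    cleaned = OrderedDict()
    for name, tensor in state_dict.items():
        # `module.` is the prefix torch.nn.DataParallel adds to parameter names.
        if name.startswith("module."):
            name = name[len("module."):]
        cleaned[name] = tensor
    return cleaned


# Stand-in checkpoint with placeholder values instead of real tensors.
fake_checkpoint = {"model": {"module.encoder.weight": 1, "head.bias": 2}, "epoch": 10}
state_dict = extract_state_dict(fake_checkpoint)
```

With a real checkpoint, the result would be passed to `model.load_state_dict(...)` in place of `checkpoint["model"]`.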
+
+ Download the checkpoint:
+ ```bash
+ huggingface-cli download garrying/RGBD-GSD-Net best.pth --local-dir ./weights
+ ```
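
The same download can likely be done from Python with `huggingface_hub.hf_hub_download`. The sketch below assumes the `huggingface_hub` package is installed and that the repository is public; `download_checkpoint` is a hypothetical wrapper name, and the repo id and filename are taken from the command above.

```python
from pathlib import Path


def download_checkpoint(local_dir: str = "./weights") -> Path:
    """Fetch best.pth from the Hub and return its local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="garrying/RGBD-GSD-Net",
        filename="best.pth",
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_checkpoint()` needs network access; the returned path can then be passed to `torch.load`.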
+
+ ## Training Dataset
+
+ This model was trained and evaluated on **RGBD-GSD**, the first large-scale RGB-D glass surface detection dataset:
+ - 3,009 RGB-D images with binary glass surface masks and depth maps
+ - Available at [garrying/RGBD-GSD](https://huggingface.co/datasets/garrying/RGBD-GSD)
+
+ ## Citation
+
+ ```bibtex
+ @article{aaai2025_rgbdglass,
+ author = {Lin, Jiaying and Yeung, Yuen-Hei and Ye, Shuquan and Lau, Rynson W.H.},
+ title = {Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection},
+ journal = {AAAI},
+ year = {2025},
+ }
+ ```
+
+ ## License
+
+ Non-commercial use only: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).