| --- |
| license: cc-by-nc-4.0 |
| task_categories: |
| - image-segmentation |
| tags: |
| - glass-surface-detection |
| - rgb-d |
| - scene-understanding |
| - pytorch |
| pretty_name: RGBD-GSD-Net (RGB-D Glass Surface Detection Network) |
| --- |
| |
| # RGBD-GSD-Net — RGB-D Glass Surface Detection Network |
|
|
| Pre-trained weights for the model introduced in: |
|
|
| > **Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection** |
| > Jiaying Lin\*, Yuen-Hei Yeung\*, Shuquan Ye, Rynson W. H. Lau |
| > AAAI 2025 |
| > [arXiv](https://arxiv.org/abs/2206.11250) · [Project Page](https://jiaying.link/aaai2025-rgbdglass/) · [Dataset (RGBD-GSD)](https://huggingface.co/datasets/garrying/RGBD-GSD) |
|
|
| ## Model Summary |
|
|
| RGBD-GSD-Net detects glass surfaces by jointly processing RGB images and depth maps. It introduces two novel modules: |
|
|
| - **Cross-Modal Context Mining (CCM)**: adaptively learns individual and mutual context features from RGB and depth information. |
| - **Depth-Missing Aware Attention (DAA)**: explicitly exploits spatial locations where depth is missing (a strong indicator of glass surfaces) to guide detection. |
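To make the depth-missing cue concrete, the sketch below builds a binary mask of pixels with no valid depth reading. It is an illustration only, not the DAA module itself: the function name `depth_missing_mask` is hypothetical, and it assumes missing depth is stored as `0` (or as NaN/inf), which depends on the sensor and on how the depth maps were exported.

```python
import numpy as np


def depth_missing_mask(depth: np.ndarray, invalid_value: float = 0.0) -> np.ndarray:
    """Return a float32 mask: 1.0 where depth is missing, 0.0 elsewhere.

    Assumption: missing depth is encoded as `invalid_value` or as NaN/inf.
    """
    depth = np.asarray(depth, dtype=np.float32)
    # A pixel counts as missing if it is non-finite or equals the invalid marker.
    missing = ~np.isfinite(depth) | (depth == invalid_value)
    return missing.astype(np.float32)


# Toy 3x3 depth map (metres); zeros and NaN stand in for sensor dropouts.
depth = np.array(
    [[1.2, 0.0, 0.0],
     [1.1, 0.0, 2.4],
     [1.0, 0.9, np.nan]],
    dtype=np.float32,
)
mask = depth_missing_mask(depth)
print(mask)
```

A mask like this could be passed to a network as an extra input channel or used to weight features; how the released model actually consumes it is defined in the code release, not here.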
|
|
| The backbone is a ResNeXt encoder shared across both modalities. |
|
|
| | File | Description | |
| |------|-------------| |
| | `best.pth` | Best checkpoint (204 MB), saved as `{'model': state_dict, ...}` | |
| | `results/our_best_results.zip` | Model predictions on the RGBD-GSD test set | |
|
|
| ## Loading the Weights |
|
|
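| ```python |
| import torch |
| # Placeholder import: replace `networks.your_network` with the module path used in the code release. |
| from networks.your_network import RGBDGlassNet |
| |
| model = RGBDGlassNet() |
| # The checkpoint is a dict; the weights are stored under the "model" key. |
| # Adjust the path if you saved the file elsewhere (the download command in this card uses ./weights). |
| checkpoint = torch.load("./weights/best.pth", map_location="cpu") |
| model.load_state_dict(checkpoint["model"]) |
| model.eval()  # inference mode: disables dropout and batch-norm updates |
| ``` |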
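If `load_state_dict` reports unexpected or missing keys, a small helper can normalise the checkpoint before loading. The sketch below is a hedged example: `extract_state_dict` is a hypothetical helper, the `"epoch"` entry is made up for illustration, and the `module.` prefix handling assumes the weights may have been saved from a model wrapped in `torch.nn.DataParallel`, which this card does not confirm. Plain integers stand in for tensors so the snippet runs without PyTorch.

```python
from collections import OrderedDict
from typing import Any, Mapping


def extract_state_dict(checkpoint: Mapping[str, Any], key: str = "model") -> "OrderedDict[str, Any]":
    """Pull the weights out of a checkpoint dict and strip any 'module.' prefix."""
    # Use checkpoint[key] when present; otherwise treat the dict itself as the state dict.
    state = checkpoint[key] if key in checkpoint else checkpoint
    prefix = "module."
    cleaned = OrderedDict()
    for name, value in state.items():
        new_name = name[len(prefix):] if name.startswith(prefix) else name
        cleaned[new_name] = value
    return cleaned


# Stand-in checkpoint: integers replace tensors, and "epoch" is a hypothetical extra entry.
fake_checkpoint = {
    "model": {"module.encoder.weight": 1, "module.head.bias": 2},
    "epoch": 10,
}
state_dict = extract_state_dict(fake_checkpoint)
print(list(state_dict))  # ['encoder.weight', 'head.bias']
```

With a real checkpoint, the result would be passed to `model.load_state_dict(...)` in place of `checkpoint["model"]`.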
|
|
| Download the checkpoint from the Hugging Face Hub into `./weights`: |
| ```bash |
| huggingface-cli download garrying/RGBD-GSD-Net best.pth --local-dir ./weights |
| ``` |
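The same download can be done from Python with `hf_hub_download` from the `huggingface_hub` package. This is a minimal sketch: the wrapper name `download_checkpoint` is hypothetical, and the import is placed inside the function only so the definition can be loaded without the package installed.

```python
def download_checkpoint(local_dir: str = "./weights") -> str:
    """Download best.pth from the model repo and return the local file path."""
    # Lazy import: requires `pip install huggingface_hub` at call time.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="garrying/RGBD-GSD-Net",
        filename="best.pth",
        local_dir=local_dir,
    )


# Example (needs network access):
#   path = download_checkpoint()
#   checkpoint = torch.load(path, map_location="cpu")
```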
|
|
| ## Training Dataset |
|
|
| This model was trained and evaluated on **RGBD-GSD**, the first large-scale RGB-D glass surface detection dataset: |
| - 3,009 RGB-D images with binary glass surface masks and depth maps |
| - Available at [garrying/RGBD-GSD](https://huggingface.co/datasets/garrying/RGBD-GSD) |
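For comparing predicted masks against the ground-truth masks, intersection over union (IoU) and mean absolute error (MAE) are two commonly used measures for binary segmentation. The sketch below is illustrative only: the function names, the 0.5 threshold, and the convention for empty masks are assumptions, and it is not claimed to reproduce the paper's evaluation protocol.

```python
import numpy as np


def iou(pred: np.ndarray, target: np.ndarray, threshold: float = 0.5) -> float:
    """Intersection over union of two masks after thresholding to binary."""
    pred_bin = np.asarray(pred, dtype=np.float32) >= threshold
    target_bin = np.asarray(target, dtype=np.float32) >= threshold
    union = np.logical_or(pred_bin, target_bin).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(np.logical_and(pred_bin, target_bin).sum() / union)


def mae(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error between a soft prediction in [0, 1] and the mask."""
    diff = np.asarray(pred, dtype=np.float32) - np.asarray(target, dtype=np.float32)
    return float(np.abs(diff).mean())


# Toy 2x2 example: the prediction covers two pixels, the ground truth one.
pred = np.array([[0.9, 0.8], [0.2, 0.1]])
target = np.array([[1.0, 0.0], [0.0, 0.0]])
print(iou(pred, target))  # 0.5
print(mae(pred, target))
```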
|
|
| ## Citation |
|
|
| ```bibtex |
| @inproceedings{aaai2025_rgbdglass, |
| author = {Lin, Jiaying and Yeung, Yuen-Hei and Ye, Shuquan and Lau, Rynson W.H.}, |
| title = {Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection}, |
| booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence}, |
| year = {2025}, |
| } |
| ``` |
|
|
| ## License |
|
|
| Released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/), for non-commercial use only. |
|
|