---
license: cc-by-nc-4.0
task_categories:
- image-segmentation
tags:
- glass-surface-detection
- rgb-d
- scene-understanding
- pytorch
pretty_name: RGBD-GSD-Net (RGB-D Glass Surface Detection Network)
---

# RGBD-GSD-Net: RGB-D Glass Surface Detection Network

Pre-trained weights for the model introduced in:

> **Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection**
> Jiaying Lin\*, Yuen-Hei Yeung\*, Shuquan Ye, Rynson W. H. Lau
> AAAI 2025
> [arXiv](https://arxiv.org/abs/2206.11250) · [Project Page](https://jiaying.link/aaai2025-rgbdglass/) · [Dataset (RGBD-GSD)](https://huggingface.co/datasets/garrying/RGBD-GSD)

## Model Summary

RGBD-GSD-Net detects glass surfaces by jointly processing RGB images and depth maps. It introduces two novel modules:

- **Cross-Modal Context Mining (CCM)**: adaptively learns individual and mutual context features from RGB and depth information.
- **Depth-Missing Aware Attention (DAA)**: explicitly exploits spatial locations where depth is missing (a strong indicator of glass surfaces) to guide detection.

The backbone is a ResNeXt encoder shared across both modalities.
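
The DAA description above depends on locating pixels where the sensor returned no depth. The snippet below is a minimal sketch of deriving such a mask from a raw depth map. It is not the authors' DAA module. The function name `depth_missing_mask`, the `invalid_value` parameter, and the assumption that missing depth is stored as 0 or NaN are illustrative choices and are not taken from the code release.

```python
import numpy as np


def depth_missing_mask(depth: np.ndarray, invalid_value: float = 0.0) -> np.ndarray:
    """Return a float32 mask that is 1.0 where depth is missing, else 0.0.

    Assumption: missing readings are stored as `invalid_value` (0 here) or
    as NaN/inf; the real encoding depends on the sensor and the dataset.
    """
    depth = np.asarray(depth, dtype=np.float32)
    missing = (depth == invalid_value) | ~np.isfinite(depth)
    return missing.astype(np.float32)


# Toy 3x3 depth map; zeros and NaN mark pixels with no reading.
depth = np.array(
    [[1.2, 0.0, 0.0],
     [1.3, 0.0, 2.1],
     [1.1, 1.0, np.nan]],
    dtype=np.float32,
)
mask = depth_missing_mask(depth)
missing_ratio = float(mask.mean())  # fraction of pixels without depth
```

A mask like this could be resized to a feature map's resolution and used as a spatial prior. How the released network actually consumes missing-depth information is defined by the paper and the code release, not by this sketch.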

| File | Description |
|------|-------------|
| `best.pth` | Best checkpoint (204 MB), saved as `{'model': state_dict, ...}` |
| `results/our_best_results.zip` | Model predictions on the RGBD-GSD test set |

## Loading the Weights

```python
import torch
from networks.your_network import RGBDGlassNet  # from the code release

model = RGBDGlassNet()
# The checkpoint is a dict; the weights live under the "model" key.
# Use "weights/best.pth" if you downloaded with the command below.
checkpoint = torch.load("best.pth", map_location="cpu")
model.load_state_dict(checkpoint["model"])
model.eval()  # inference mode: disables dropout, uses running batch-norm statistics
```

Download the checkpoint:

```bash
huggingface-cli download garrying/RGBD-GSD-Net best.pth --local-dir ./weights
```

## Training Dataset

This model was trained and evaluated on **RGBD-GSD**, the first large-scale RGB-D glass surface detection dataset:

- 3,009 RGB-D images with binary glass surface masks and depth maps
- Available at [garrying/RGBD-GSD](https://huggingface.co/datasets/garrying/RGBD-GSD)

## Citation

```bibtex
@article{aaai2025_rgbdglass,
  author  = {Lin, Jiaying and Yeung, Yuen-Hei and Ye, Shuquan and Lau, Rynson W.H.},
  title   = {Leveraging RGB-D Data with Cross-Modal Context Mining for Glass Surface Detection},
  journal = {AAAI},
  year    = {2025},
}
```

## License

Non-commercial use only: [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).