| --- |
| license: creativeml-openrail-m |
| datasets: |
| - DarthReca/crisislandmark |
| language: |
| - en |
| library_name: torchgeo |
| tags: |
| - remote-sensing |
| - text-to-image-retrieval |
| - multimodal |
| - geospatial |
| - SAR |
| - multispectral |
| - crisis-management |
| - earth-observation |
| - contrastive-learning |
| --- |
| # CLOSP |
|
|
| CLOSP (Contrastive Language Optical SAR Pretraining) is a multimodal architecture designed for text-to-image retrieval. |
| It creates a unified embedding space for text, Sentinel-2 (MSI), and Sentinel-1 (SAR) data. |
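To illustrate what retrieval in a shared embedding space could look like, here is a minimal sketch that ranks image embeddings by cosine similarity to a text embedding. It is written in plain Python so it runs without dependencies. The `retrieve` helper, the tile names, and the 3-dimensional vectors are all hypothetical and are not produced by the CLOSP encoders; real embeddings would come from the text and visual encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def retrieve(query_embedding, image_embeddings, top_k=2):
    """Return (image_id, cosine similarity) pairs for the top_k best matches."""
    query = l2_normalize(query_embedding)
    scored = []
    for image_id, emb in image_embeddings.items():
        emb = l2_normalize(emb)
        score = sum(q * e for q, e in zip(query, emb))
        scored.append((image_id, score))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (hypothetical values, not model outputs). Because SAR and MSI
# images share one space with text, both can sit in the same gallery.
gallery = {
    "s2_flood_tile": [0.9, 0.1, 0.0],
    "s1_flood_tile": [0.8, 0.2, 0.1],
    "s2_forest_tile": [0.0, 0.2, 0.9],
}
text_query = [1.0, 0.0, 0.0]  # stands in for the embedding of a text query

results = retrieve(text_query, gallery, top_k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.3f}")
```

Because all modalities are assumed to share one space, a single text query can rank Sentinel-1 and Sentinel-2 tiles together without a separate index per sensor.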
|
|
| This repository contains all the separate visual encoders in PyTorch format. |
|
|
| ## Model Details |
| The model uses three separate encoders: one for text, one for Sentinel-1 (SAR) data, and one for Sentinel-2 (MSI) data. |
| During training, it uses a contrastive objective to align the textual embeddings with the corresponding visual embeddings (either SAR or MSI). |
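The contrastive alignment described above can be sketched as a CLIP-style symmetric loss. This is an assumption: the card does not state the exact loss, temperature, or batching used by CLOSP, so the function name, the `temperature` value, and the toy vectors below are hypothetical. The sketch uses plain Python for portability; in practice the same computation would be batched tensor operations in PyTorch.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy_row(logits, target_index):
    """Cross-entropy of one row of logits against the index of the true match."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[target_index]


def symmetric_contrastive_loss(text_embs, image_embs, temperature=0.07):
    """CLIP-style loss: the i-th text and the i-th image form the positive pair."""
    texts = [normalize(t) for t in text_embs]
    images = [normalize(v) for v in image_embs]
    n = len(texts)
    # logits[i][j] = cosine similarity of text i and image j, divided by temperature.
    logits = [
        [sum(a * b for a, b in zip(t, v)) / temperature for v in images]
        for t in texts
    ]
    # Text-to-image direction: each row should pick out its own image.
    text_to_image = sum(cross_entropy_row(logits[i], i) for i in range(n)) / n
    # Image-to-text direction: each column should pick out its own text.
    image_to_text = sum(
        cross_entropy_row([logits[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (text_to_image + image_to_text)


# Hypothetical toy batch: image embeddings could come from either the SAR or
# the MSI encoder, since both are aligned to the same text embeddings.
text_batch = [[1.0, 0.0], [0.0, 1.0]]
aligned_images = [[0.9, 0.1], [0.1, 0.9]]
shuffled_images = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = symmetric_contrastive_loss(text_batch, aligned_images)
loss_shuffled = symmetric_contrastive_loss(text_batch, shuffled_images)
print(f"aligned: {loss_aligned:.4f}, shuffled: {loss_shuffled:.4f}")
```

The loss is low when each caption is closest to its own image and high when the pairs are mismatched, which is the behavior a contrastive objective rewards.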
|
|
|
|
| - **Developed by:** Daniele Rege Cambrin |
| - **Model type:** CLOSP |
| - **Language(s) (NLP):** English |
| - **License:** CreativeML-OpenRAIL-M |
| - **Repository:** [GitHub](https://github.com/DarthReca/closp) |
| - **Paper:** [ArXiv](https://arxiv.org/abs/2507.10403) |
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{cambrin2025texttoremotesensingimageretrievalrgbsources, |
| title={Text-to-Remote-Sensing-Image Retrieval beyond RGB Sources}, |
| author={Daniele Rege Cambrin and Lorenzo Vaiani and Giuseppe Gallipoli and Luca Cagliero and Paolo Garza}, |
| year={2025}, |
| eprint={2507.10403}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.CV}, |
| url={https://arxiv.org/abs/2507.10403}, |
| } |
| ``` |