Commit dde037d by Subash-Khanal (verified)
Parent: b929ef1

Add model card with citation and file index

Files changed (1)
  1. README.md (added) +48 -0
@@ -0,0 +1,48 @@
+ ---
+ license: mit
+ tags:
+ - satellite-imagery
+ - audio
+ - multimodal
+ - contrastive-learning
+ - soundscape
+ - remote-sensing
+ ---
+
+ # Sat2Sound
+
+ Trained checkpoints and backbone weights for **Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping**, accepted at EarthVision 2026 (IEEE/ISPRS Workshop on Large Scale Computer Vision for Remote Sensing).
+
+ - Paper: [arxiv.org/pdf/2505.13777](https://arxiv.org/pdf/2505.13777)
+ - Code: [github.com/MVRL/sat2sound](https://github.com/MVRL/sat2sound)
+
+ ## Files
+
+ | Path | Description |
+ |---|---|
+ | `sat2sound/bingmap_nometa.ckpt` | GeoSound-Bing, no metadata |
+ | `sat2sound/bingmap_withmeta.ckpt` | GeoSound-Bing, with metadata |
+ | `sat2sound/sentinel_nometa.ckpt` | GeoSound-Sentinel, no metadata |
+ | `sat2sound/sentinel_withmeta.ckpt` | GeoSound-Sentinel, with metadata |
+ | `sat2sound/SoundingEarth_nometa.ckpt` | SoundingEarth, no metadata |
+ | `sat2sound/SoundingEarth_withmeta.ckpt` | SoundingEarth, with metadata |
+ | `sat2text/bingmap_i2t_baseline.ckpt` | Sat2Text image-text baseline |
+ | `backbones/pretrain-vit-base-e199.pth` | SatMAE ViT-Base backbone |
+ | `backbones/mga-clap.pt` | MGACLAP audio encoder backbone |
+ | `demo/GeoSound_gallery_w_bingmap.h5` | Retrieval demo gallery (9,931 samples) |
+ | `ckpt_cfg.json` | Experiment name → checkpoint path mapping |
+
+ Checkpoints and backbones are resolved automatically by the codebase via `src/hub.py:resolve_hf_ckpt`, so no manual download is needed.
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{khanal2026sat2sound,
+   title     = {{Sat2Sound}: A Unified Framework for Zero-Shot Soundscape Mapping},
+   author    = {Khanal, Subash and Sastry, Srikumar and Dhakal, Aayush and
+                Ahmad, Adeel and Stylianou, Abby and Jacobs, Nathan},
+   booktitle = {IEEE/ISPRS Workshop: Large Scale Computer Vision for
+                Remote Sensing (EarthVision)},
+   year      = {2026},
+ }
+ ```
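
The README added in this commit says checkpoints are resolved through `src/hub.py:resolve_hf_ckpt`, but that function's implementation is not part of this diff. As a rough illustration only, the sketch below shows how an experiment name might be looked up in a mapping shaped like `ckpt_cfg.json` and then fetched with `huggingface_hub.hf_hub_download`. The repository ID, the experiment key, the flat JSON layout, and all three helper names are assumptions made for the example, not details taken from the repository.

```python
import json
from pathlib import Path

# Hypothetical placeholder: the actual Hub repository ID is not stated in this commit.
REPO_ID = "your-namespace/your-repo"


def load_ckpt_cfg(cfg_path):
    """Read a JSON file assumed to map experiment names to checkpoint paths."""
    return json.loads(Path(cfg_path).read_text(encoding="utf-8"))


def checkpoint_path_for(cfg, experiment):
    """Return the repo-relative checkpoint path for an experiment name."""
    try:
        return cfg[experiment]
    except KeyError:
        known = ", ".join(sorted(cfg)) or "(none)"
        raise KeyError(f"unknown experiment {experiment!r}; known: {known}") from None


def fetch_checkpoint(cfg, experiment, repo_id=REPO_ID):
    """Download the checkpoint into the local Hub cache and return its local path."""
    # Imported lazily so the lookup helpers work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=checkpoint_path_for(cfg, experiment))


if __name__ == "__main__":
    # Assumed layout; the experiment key "bingmap_nometa" is made up for illustration.
    example_cfg = {"bingmap_nometa": "sat2sound/bingmap_nometa.ckpt"}
    print(checkpoint_path_for(example_cfg, "bingmap_nometa"))
```

Keeping the lookup separate from the download means the mapping can be inspected or validated offline, and only `fetch_checkpoint` needs network access and the `huggingface_hub` package.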