Add Quick-start: computing embeddings section
README.md

Trained checkpoints and backbone weights for **Sat2Sound: A Unified Framework for Zero-Shot Soundscape Mapping**.

Checkpoints and backbones are resolved automatically by the codebase via `src/hub.py:resolve_hf_ckpt` — no manual download needed.
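If you do want a local file path (say, to inspect a checkpoint), the resolution boils down to a standard Hub download. Below is a minimal sketch using `huggingface_hub.hf_hub_download`; the repo id and filename are placeholders, and the actual signature of `resolve_hf_ckpt` may differ:

```python
# Illustrative stand-in for src/hub.py:resolve_hf_ckpt, not its actual implementation.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="mvrl/sat2sound",          # placeholder: substitute this repo's id
    filename="bingmap_withmeta.ckpt",  # placeholder: substitute a real checkpoint filename
)
print(ckpt_path)  # path in the local Hugging Face cache
```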
## Quick-start: computing embeddings
Clone the [code repo](https://github.com/MVRL/sat2sound), install the environment, then:
```python
import torch
import torchaudio
from src.engine import l2normalize
from utilities.utils import load_sat2sound, encode_text, encode_gps_time, load_audio_mel, prepare_batch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
B = 4

model, tokenizer = load_sat2sound("bingmap_withmeta", device)

# audio — swap the next two lines to use a real recording instead of white noise
torchaudio.save("/tmp/demo.wav", torch.randn(1, 320_000), sample_rate=32_000)
mel = load_audio_mel("/tmp/demo.wav", device)  # (1, 1001, 64)

latlong, time_enc, month_enc = encode_gps_time(37.77, -122.42, hour=13, month=5, B=B, device=device)

batch = prepare_batch(
    sat=torch.randn(B, 3, 224, 224, device=device),  # ImageNet-normalised satellite tile
    audio_mel=mel,
    audio_caption=encode_text(["Traffic noise and distant birds."] * B, tokenizer, device),
    image_caption=encode_text(["An urban intersection with dense buildings."] * B, tokenizer, device),
    latlong=latlong, time_enc=time_enc, month_enc=month_enc,
)

with torch.no_grad():
    embeds = model.get_embeds(batch)

sat_emb = l2normalize(embeds["sat_embeds_dict"]["ctotal"])  # (B, 1024)
audio_emb = l2normalize(embeds["audio_embeds"])             # (B, 1024)
text_emb = l2normalize(embeds["fdt_txt_embeds"])            # (B, 1024)

print(sat_emb @ audio_emb.T)  # (B, B) satellite ↔ audio cosine similarity
```
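The similarity matrix is all you need for zero-shot cross-modal retrieval. As a follow-on sketch reusing `sat_emb` and `audio_emb` from the snippet above (the top-k ranking is an illustration, not part of the codebase):

```python
# Rank all audio clips against the first satellite tile by cosine similarity.
sims = sat_emb[0] @ audio_emb.T  # (B,) similarities for one tile
topk = torch.topk(sims, k=min(3, sims.numel()))
for rank, (idx, score) in enumerate(zip(topk.indices.tolist(), topk.values.tolist()), start=1):
    print(f"#{rank}: audio clip {idx} (cosine similarity {score:.3f})")
```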
> For `*_nometa` checkpoints omit `latlong`, `time_enc`, and `month_enc` (they default to `None`).
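
For instance, a metadata-free call might look like the sketch below; the checkpoint name `bingmap_nometa` is inferred by analogy with `bingmap_withmeta` and should be checked against the actual checkpoint list:

```python
# Metadata-free variant: same pipeline, no GPS/time inputs.
model, tokenizer = load_sat2sound("bingmap_nometa", device)  # name assumed, verify it

batch = prepare_batch(
    sat=torch.randn(B, 3, 224, 224, device=device),
    audio_mel=mel,
    audio_caption=encode_text(["Traffic noise and distant birds."] * B, tokenizer, device),
    image_caption=encode_text(["An urban intersection with dense buildings."] * B, tokenizer, device),
    # latlong, time_enc, and month_enc are simply omitted; they default to None
)

with torch.no_grad():
    embeds = model.get_embeds(batch)
```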
## Citation
```bibtex