Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,53 @@
---
license: cc-by-nc-4.0
library_name: pytorch
tags:
- audio
- deepfake-detection
- icml-2026
---

# SONAR weights

Pretrained checkpoints for *SONAR: Spectral-Contrastive Audio Residuals
for Generalizable Deepfake Detection* (ICML 2026).

| File | ITW EER | Architecture | License |
|---|---:|---|---|
| `xlsr2_300m.pt` | — | XLSR-300M backbone (fairseq, derivative of [facebookresearch/fairseq](https://github.com/facebookresearch/fairseq/tree/main/examples/wav2vec/xlsr)). | CC-BY-NC-4.0 (upstream) |
| `baseline_xlsr_aasist.pth` | ~10.5% | Single XLSR + AASIST baseline (paper Table 1 row "XLSR+AASIST"). | CC-BY-NC-4.0 |
| `sonar_full_xlsr_aasist_eer6.pth` | **6.0%** | SONAR-Full: dual XLSR + RFE + cross-attention + AASIST + JS-alignment loss. Matches `guided_model.GuidedModel`. | CC-BY-NC-4.0 |
| `sonar_finetune_xlsr_mamba_eer5p5.pth` | **5.5%** | SONAR-Finetune: frozen XLSR-Mamba content branch + RFE/NFE + cross-attention + Conformer head + JS-alignment loss. | CC-BY-NC-4.0 |

Code: <https://github.com/idonithid/SONAR-Audio-DF-Detection>
Project page: <https://idonithid.github.io/SONAR-Audio-DF-Detection/>
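
The ITW EER column in the table reports an equal error rate (presumably on the In-the-Wild benchmark), where lower is better. As a rough illustration of the metric, and not the paper's evaluation script, the sketch below sweeps thresholds and picks the one where the false-acceptance and false-rejection rates are closest. The function name `equal_error_rate`, the score convention (higher means more likely bona fide), and the toy scores are assumptions made for this example.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER, assuming higher scores mean 'more likely bona fide'."""
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: bona fide clips scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed clips scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores, made up for illustration only.
bonafide = [0.9, 0.8, 0.7, 0.3]
spoof = [0.1, 0.2, 0.4, 0.6]
print(f"EER = {equal_error_rate(bonafide, spoof):.1%}")
```

On the toy scores this prints `EER = 25.0%`. Evaluation toolkits usually interpolate along the ROC curve, so their values may differ slightly from this plain threshold sweep.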

## Loading

```python
import os
from argparse import Namespace

import torch
from huggingface_hub import hf_hub_download
from sonar.guided_model import GuidedModel

# Download the SONAR-Full checkpoint and the XLSR-300M backbone from the Hub.
ckpt = hf_hub_download(repo_id="idonithid/SONAR-weights",
                       filename="sonar_full_xlsr_aasist_eer6.pth")
xlsr = hf_hub_download(repo_id="idonithid/SONAR-weights",
                       filename="xlsr2_300m.pt")

# Expose the backbone path through the SONAR_XLSR_CKPT environment variable.
os.environ["SONAR_XLSR_CKPT"] = xlsr

# Build the model on the GPU and load the weights; strict=False tolerates
# keys that do not match between the checkpoint and the model.
model = GuidedModel(Namespace(algo=4, batch_size=1, device="cuda"), "cuda").cuda()
model.load_state_dict(torch.load(ckpt, map_location="cuda"), strict=False)
model.eval()  # evaluation mode: disables dropout and batch-norm statistic updates
```
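
Because the checkpoint is loaded with `strict=False`, PyTorch skips parameter names that do not line up instead of raising an error, so a mismatch can go unnoticed. `load_state_dict` returns the lists of missing and unexpected keys, which can be inspected after loading. The sketch below shows this on a toy `nn.Sequential` stand-in (not `GuidedModel`); the key `head.weight` is a made-up leftover entry used only for illustration.

```python
import torch
from torch import nn

# Toy stand-in for GuidedModel so the snippet runs without the SONAR code.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Simulate a checkpoint that lacks one tensor and carries one extra tensor.
state = {k: v.clone() for k, v in model.state_dict().items()}
del state["2.bias"]
state["head.weight"] = torch.zeros(2, 8)  # hypothetical leftover key

# With strict=False the load succeeds and reports what did not match.
result = model.load_state_dict(state, strict=False)
print("missing:", result.missing_keys)
print("unexpected:", result.unexpected_keys)
```

With the real model, the same check amounts to keeping the return value of `model.load_state_dict(...)` and reviewing both lists. Which keys, if any, are expected to be missing for the SONAR checkpoints is not stated here.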

## Citation

```bibtex
@inproceedings{hidekel2026sonar,
  title     = {{SONAR}: Spectral-Contrastive Audio Residuals for Generalizable Deepfake Detection},
  author    = {Hidekel, Ido Nitzan and Lifshitz, Gal and Cohen, Khen and Raviv, Dan},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  year      = {2026}
}
```