Upload folder using huggingface_hub

Files changed (4)

README.md +67 -0
config.json +69 -0
final.pt +3 -0
full_checkpoint.pt +3 -0

README.md ADDED

	@@ -0,0 +1,67 @@

+---
+license: cc-by-nc-sa-4.0
+language:
+- bg
+- cs
+- da
+- el
+- es
+- et
+- fi
+- hr
+- hu
+- it
+- lt
+- lv
+- mt
+- nl
+- pl
+- pt
+- ro
+- sk
+- sl
+- sv
+---
+# SpidR VP-20
+SpidR VP-20 is a SpidR model pretrained on a 6k-hour, 20-language subset of VoxPopuli
+(all EU languages except English, French, and German)
+for the [DiscoPhon benchmark](https://benchmarks.cognitive-ml.fr/discophon).
+It was pretrained using the [`spidr`](https://github.com/facebookresearch/spidr) library.
+You can load it with:
+```python
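+from spidr.models import SpidR
+from torch.hub import load_state_dict_from_url
+
+# Download (and cache) the weights; map tensors to CPU so loading also works without a GPU.
+url = "https://huggingface.co/coml/spidr-vp20/resolve/main/final.pt"
+state_dict = load_state_dict_from_url(url, map_location="cpu")
+model = SpidR().eval()
+model.load_state_dict(state_dict)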
+```
+## Files:
+- `config.json`: Model configuration.
+- `final.pt`: Model checkpoint.
+- `full_checkpoint.pt`: Full checkpoint, with model, optimizer, etc.
+## Citing
+Please cite the DiscoPhon paper
+```bibtex
+@misc{poli2026discophon,
+  title={{DiscoPhon}: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units},
+  author={Maxime Poli and Manel Khentout and Angelo Ortiz Tandazo and Ewan Dunbar and Emmanuel Chemla and Emmanuel Dupoux},
+  year={2026},
+  eprint={2603.18612},
+  archivePrefix={arXiv},
+  primaryClass={cs.CL},
+  url={https://arxiv.org/abs/2603.18612},
+}
+```
+along with [SpidR](https://openreview.net/forum?id=E7XAFBpfZs).

config.json ADDED

	@@ -0,0 +1,69 @@

+{
+    "model": {
+        "extractor_mode": "layer_norm",
+        "extractor_conv_bias": false,
+        "extractor_conv_layer_config": [
+            [
+                512,
+                10,
+                5
+            ],
+            [
+                512,
+                3,
+                2
+            ],
+            [
+                512,
+                3,
+                2
+            ],
+            [
+                512,
+                3,
+                2
+            ],
+            [
+                512,
+                3,
+                2
+            ],
+            [
+                512,
+                2,
+                2
+            ],
+            [
+                512,
+                2,
+                2
+            ]
+        ],
+        "encoder_embed_dim": 768,
+        "encoder_projection_dropout": 0,
+        "encoder_pos_conv_kernel": 95,
+        "encoder_pos_conv_groups": 16,
+        "encoder_pos_conv_depth": 5,
+        "encoder_num_layers": 12,
+        "encoder_num_heads": 12,
+        "encoder_attention_dropout": 0.1,
+        "encoder_ff_interm_features": 3072,
+        "encoder_ff_interm_dropout": 0.0,
+        "encoder_dropout": 0.1,
+        "encoder_layer_norm_first": false,
+        "encoder_layer_drop": 0.0,
+        "encoder_qkv_bias": false,
+        "codebook_size": 256,
+        "codebook_decay": 0.9,
+        "num_codebooks": 8,
+        "ema_start_decay": 0.999,
+        "ema_final_decay": 0.9999,
+        "ema_final_step": 30000,
+        "ema_exclude_layers": [
+            "pos_conv_embed"
+        ],
+        "freeze_step": 200000,
+        "ema_timescale": 20000,
+        "ema_threshold": 1e-07
+    }
+}
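
The `extractor_conv_layer_config` above determines how many feature frames the convolutional front end produces per second of audio. The sketch below works that out, under two assumptions not stated in this commit: that each triple is `(out_channels, kernel_size, stride)` as in wav2vec 2.0-style extractors, and that the input is 16 kHz audio with no padding in the convolutions. The helper names are illustrative and are not part of the `spidr` library.

```python
# Triples copied from config.json; assumed to mean (out_channels, kernel_size, stride).
conv_layers = [
    (512, 10, 5),
    (512, 3, 2),
    (512, 3, 2),
    (512, 3, 2),
    (512, 3, 2),
    (512, 2, 2),
    (512, 2, 2),
]

SAMPLE_RATE = 16_000  # assumption: 16 kHz input audio


def total_stride(layers):
    """Number of input samples between two consecutive output frames."""
    stride = 1
    for _, _, s in layers:
        stride *= s
    return stride


def receptive_field(layers):
    """Number of input samples seen by a single output frame."""
    field, jump = 1, 1
    for _, k, s in layers:
        field += (k - 1) * jump
        jump *= s
    return field


def num_frames(num_samples, layers):
    """Output length of the stack for unpadded 1-D convolutions."""
    n = num_samples
    for _, k, s in layers:
        n = (n - k) // s + 1
    return n


stride = total_stride(conv_layers)
field = receptive_field(conv_layers)
print(f"hop: {stride} samples ({1000 * stride / SAMPLE_RATE:.0f} ms)")
print(f"window: {field} samples ({1000 * field / SAMPLE_RATE:.0f} ms)")
print(f"frames for 1 s of audio: {num_frames(SAMPLE_RATE, conv_layers)}")
```

Under these assumptions the extractor has a hop of 320 samples (20 ms) and a window of 400 samples (25 ms), so one second of audio yields 49 frames.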

final.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6d228d96def90c94a3e12e04bb6245354be960003948c905f0bc1b5c8d40ac59
+size 739393927

full_checkpoint.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:661e0e193efd008ff26bc3eb55a7866a3269724e91fc29fb69fe989813a01e87
+size 1460369619
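
The two `.pt` entries above are Git LFS pointers: each records the SHA-256 digest (`oid`) and byte size of the real file. A downloaded checkpoint can be checked against them with the standard library, as in the minimal sketch below. The helper names and the local file paths are hypothetical; only the digests and sizes come from this commit.

```python
import hashlib
from pathlib import Path

# (sha256 oid, size in bytes) copied from the LFS pointers in this commit.
EXPECTED = {
    "final.pt": (
        "6d228d96def90c94a3e12e04bb6245354be960003948c905f0bc1b5c8d40ac59",
        739393927,
    ),
    "full_checkpoint.pt": (
        "661e0e193efd008ff26bc3eb55a7866a3269724e91fc29fb69fe989813a01e87",
        1460369619,
    ),
}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, oid, size):
    """Return True if the file has the size and SHA-256 digest given by the pointer."""
    path = Path(path)
    # Compare the cheap size first; the hash is only computed if the size matches.
    return path.stat().st_size == size and sha256_of_file(path) == oid
```

For example, assuming `final.pt` was downloaded to the current directory, `matches_pointer("final.pt", *EXPECTED["final.pt"])` should return `True` for an intact download.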