Fongbe ASR model with diacritics

How to use for inference

from speechbrain.inference.ASR import EncoderASR

asr_model = EncoderASR.from_hparams(
    source="whettenr/asr-fon-with-diacritics-bpe-256",
    savedir="pretrained_models/asr-fongbe-with-diacritics-bpe-256"
)

asr_model.transcribe_file("whettenr/asr-fon-with-diacritics-bpe-256/example.wav")

# expected output:
# huzuhuzu gɔngɔn ɖé ɖò dandan
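
Because the transcripts contain diacritics, a plain string comparison against the expected output can fail when one side uses precomposed characters and the other uses combining marks. The sketch below is one way to compare transcripts safely. The helper name `normalize_transcript` is hypothetical and not part of SpeechBrain, and the model's actual output form is not stated here, so the choice of NFC is an assumption.

```python
import unicodedata


def normalize_transcript(text: str) -> str:
    """Return the text in NFC form with runs of whitespace collapsed.

    Hypothetical helper for comparing transcripts; not a SpeechBrain API.
    """
    return " ".join(unicodedata.normalize("NFC", text).split())


reference = "huzuhuzu gɔngɔn ɖé ɖò dandan"

# Build both Unicode forms of the same sentence explicitly.
composed = unicodedata.normalize("NFC", reference)    # accented letters as single code points
decomposed = unicodedata.normalize("NFD", reference)  # base letter + combining accent

# The raw strings differ even though they render identically.
print(composed == decomposed)
# After normalization they compare equal.
print(normalize_transcript(composed) == normalize_transcript(decomposed))
```

Applying the same normalization to both the reference and the hypothesis before scoring keeps accent encoding differences from being counted as errors.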

Details of model

~100M parameters, 12-layer Conformer encoder, feed-forward neural network (FFNN) decoder

Details of training

Pretrained using BEST-RQ on 700 hours of speech for 400k steps:
- 140 hours of Fongbé from:
  - FFSTC 2 + beethogedeon/fongbe-speech (~40 hours)
  - cappfm (~100 hours)
- 140 hours of English and French (from LibriSpeech)
- 140 hours of Hausa and Yoruba from VoxLingua107, Common Voice 23.0, and BibleTTS

Fine-tuned with CTC loss on the training set of:
- FFSTC 2
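
To illustrate what a CTC-trained model produces, here is a minimal sketch of greedy CTC decoding: take the most likely token per frame, merge consecutive repeats, then drop the blank symbol. This is a generic illustration, not the decoder SpeechBrain uses for this model. The function name, the token ids, and `blank_id=0` are assumptions made for the example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, blank_id=0):
    """Collapse a per-frame token sequence into an output token sequence.

    frame_ids: most likely token id for each frame (hypothetical values).
    blank_id:  id of the CTC blank symbol (assumed to be 0 here).
    """
    # Step 1: merge runs of identical consecutive ids.
    collapsed = [token for token, _ in groupby(frame_ids)]
    # Step 2: remove blanks, which only separate repeated tokens.
    return [token for token in collapsed if token != blank_id]


# A blank between the two 3s keeps them as separate output tokens.
frames = [0, 3, 3, 0, 3, 5, 5, 0, 0, 2]
print(ctc_greedy_decode(frames))  # [3, 3, 5, 2]
```

In a real pipeline the resulting ids would then be mapped back to text by the model's BPE tokenizer.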

@inproceedings{kponou25_interspeech,
  title     = {{Extending the Fongbe to French Speech Translation Corpus: resources, models and benchmark}},
  author    = {D. Fortuné Kponou and Salima Mdhaffar and Fréjus A. A. Laleye and Eugène C. Ezin and Yannick Estève},
  year      = {2025},
  booktitle = {{Interspeech 2025}},
  pages     = {4533--4537},
  doi       = {10.21437/Interspeech.2025-1801},
  issn      = {2958-1796},
}
