Fongbe ASR model with diacritics

How to use for inference

from speechbrain.inference.ASR import EncoderASR

asr_model = EncoderASR.from_hparams(
    source="whettenr/asr-fon-with-diacritics-bpe-256",
    savedir="pretrained_models/asr-fongbe-with-diacritics-bpe-256"
)

# transcribe the example file hosted in the model repository and print the result
print(asr_model.transcribe_file("whettenr/asr-fon-with-diacritics-bpe-256/example.wav"))

# expected output:
# huzuhuzu gɔngɔn ɖé ɖò dandan
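
Because the output carries diacritics (é, ò), a plain string comparison against a reference can fail when the two strings use different Unicode forms (precomposed vs. combining marks). The sketch below normalizes both sides to NFC before comparing. The helper names `normalize_text` and `same_transcript` are illustrative, not part of SpeechBrain, and which Unicode form the model emits is not stated here, so treat this as a precaution rather than a required step.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Return the text in NFC form with runs of whitespace collapsed."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def same_transcript(hypothesis: str, reference: str) -> bool:
    """Compare two transcripts after Unicode and whitespace normalization."""
    return normalize_text(hypothesis) == normalize_text(reference)


reference = "huzuhuzu gɔngɔn ɖé ɖò dandan"
# Simulate a transcript that uses combining marks instead of precomposed letters.
decomposed = unicodedata.normalize("NFD", reference)
print(same_transcript(decomposed, reference))  # True
```

The same normalization is worth applying to both hypotheses and references before computing WER or CER.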

Details of model

~100M parameters, 12-layer Conformer encoder, feed-forward (FFNN) decoder
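
To check the parameter count on a loaded model, a minimal sketch is shown below. It relies only on the standard PyTorch `parameters()` / `numel()` interface, which the SpeechBrain inference classes inherit from `torch.nn.Module`. The helper names are illustrative, and the number you get may differ somewhat from the rounded ~100M figure depending on which modules the checkpoint loads.

```python
def count_parameters(module) -> int:
    """Sum the element counts of every parameter exposed by `module`."""
    return sum(p.numel() for p in module.parameters())


def format_millions(n: int) -> str:
    """Format a raw parameter count as millions, e.g. 100000000 -> '100.0M'."""
    return f"{n / 1e6:.1f}M"


# Usage with the model loaded in the inference example (not run here):
# print(format_millions(count_parameters(asr_model)))
```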

Details of training

  • pretrained using BEST-RQ on 700 hours for 400k steps
    • 140 hours of Fongbé from:
      • FFSTC 2 + beethogedeon/fongbe-speech (~40 hours)
      • cappfm (~100 hours)
    • 140 hours of English and French (from LibriSpeech)
    • 140 hours of Hausa and Yoruba from VoxLingua107, Common Voice 23.0, and BibleTTS
  • finetuned with CTC loss on training sets of
    • FFSTC 2
@inproceedings{kponou25_interspeech,
  title     = {{Extending the Fongbe to French Speech Translation Corpus:  resources, models and benchmark}},
  author    = {D. Fortuné Kponou and Salima Mdhaffar and Fréjus A. A. Laleye and Eugène C. Ezin and Yannick Estève},
  year      = {2025},
  booktitle = {{Interspeech 2025}},
  pages     = {4533--4537},
  doi       = {10.21437/Interspeech.2025-1801},
  issn      = {2958-1796},
}
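
For readers unfamiliar with CTC, the sketch below shows how a CTC-trained model's per-frame predictions are commonly turned into a token sequence by greedy decoding: merge consecutive repeats, then drop the blank symbol. This illustrates the general technique only. The decoding method this model actually uses, and its blank index (assumed to be 0 here), are not stated in this card.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Collapse per-frame argmax ids: merge consecutive repeats, then drop blanks."""
    return [tok for tok, _ in groupby(frame_ids) if tok != blank_id]


# Hypothetical frame-level ids; the blank between the two 5s keeps them distinct.
frames = [0, 5, 5, 0, 5, 7, 7, 0]
print(ctc_greedy_collapse(frames))  # [5, 5, 7]
```

The resulting ids would then be mapped back to text by the BPE tokenizer.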