Wolof-HuBERT-CTC

This model is a fine-tuned version of soynade-research/Wolof-HuBERT-Base. It achieves the following results on a challenging evaluation set:

  • Loss: 0.4031
  • WER (word error rate): 0.3565

On this evaluation set, it outperforms the HuBERT models released by Meta and Orange.
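For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the evaluation script used for this model; the function name `word_error_rate` and the placeholder sentences are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real transcripts: one substitution and one
# deletion against a four-word reference.
example_wer = word_error_rate("a b c d", "a x c")
print(f"WER: {example_wer:.2f}")
```

A WER of 0.3565 therefore means roughly 36 word-level errors per 100 reference words.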

Usage

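import torch
from transformers import pipeline

# Build the speech recognition pipeline. A distinct variable name avoids
# shadowing the imported `pipeline` function.
asr = pipeline(
    task="automatic-speech-recognition",
    model="soynade-research/Wolof-HuBERT-CTC",
    dtype=torch.float16,  # half precision; older Transformers releases name this argument torch_dtype
    device=0,  # first CUDA GPU
)

# Transcribe the sample recording hosted alongside the model.
result = asr("https://huggingface.co/soynade-research/Wolof-HuBERT-CTC/resolve/main/story.wav")
print(result["text"])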
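The pipeline call hides the CTC decoding step that turns per-frame predictions into text. As a rough illustration of greedy CTC decoding (collapse repeated frame labels, then drop blanks), here is a small pure-Python sketch. It is not the library's implementation: the helper `ctc_greedy_decode`, the toy vocabulary, and the frame ids are all hypothetical, and the use of id 0 as the blank and `|` as the word delimiter is an assumption borrowed from the convention common in wav2vec 2.0-style CTC tokenizers.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse repeated frame labels, drop blanks, and join into text."""
    # Step 1: merge consecutive duplicates (one label per run of frames).
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: remove the blank symbol, which only separates repeats.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: map the word delimiter to a space.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Toy vocabulary and frame-level argmax ids (hypothetical values).
toy_vocab = {0: "<pad>", 1: "|", 2: "a", 3: "b"}
toy_frames = [2, 2, 0, 2, 3, 3, 1, 1, 0, 3, 2]

decoded = ctc_greedy_decode(toy_frames, toy_vocab)
print(decoded)
```

Note that the blank between the first two runs of `a` is what allows a doubled letter to survive the collapse step.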

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
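To make the linear scheduler with a 0.1 warmup ratio concrete, the sketch below computes the learning rate at a given step: a linear ramp from 0 to the peak rate during warmup, then a linear decay back to 0. The function `linear_schedule_lr` is a hypothetical helper written for this example, and the total step count is only an estimate derived from the first logged results row (step 10000 at epoch 1.1804); the exact number used in training is not stated here.

```python
def linear_schedule_lr(step, total_steps, base_lr=1e-4, warmup_ratio=0.1):
    """Linear warmup to base_lr, then linear decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp up proportionally to the step index.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: shrink proportionally to the steps that remain.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Estimate, not a reported value: step 10000 was logged at epoch 1.1804.
steps_per_epoch_estimate = round(10000 / 1.1804)
total_steps_estimate = steps_per_epoch_estimate * 10  # num_epochs = 10

for step in (0, 5000, 10000, 40000, 80000):
    lr = linear_schedule_lr(step, total_steps_estimate)
    print(f"step {step:>6}: lr = {lr:.2e}")
```

Under these assumptions the peak rate of 0.0001 is reached near the end of the first epoch, and the last logged checkpoint (step 80000) is trained at a small fraction of that rate.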

Training results

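| Training Loss | Epoch  | Step  | Validation Loss | WER    |
|---------------|--------|-------|-----------------|--------|
| 0.6414        | 1.1804 | 10000 | 0.5430          | 0.5107 |
| 0.3998        | 2.3607 | 20000 | 0.4524          | 0.4453 |
| 0.3896        | 3.5411 | 30000 | 0.4002          | 0.4217 |
| 0.3129        | 4.7214 | 40000 | 0.3863          | 0.3971 |
| 0.2628        | 5.9018 | 50000 | 0.3912          | 0.3798 |
| 0.2275        | 7.0822 | 60000 | 0.3817          | 0.3717 |
| 0.2031        | 8.2625 | 70000 | 0.3872          | 0.3639 |
| 0.1619        | 9.4429 | 80000 | 0.4062          | 0.3592 |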
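The logged results can be inspected programmatically, for instance to see which checkpoint has the lowest validation loss and which has the lowest WER. The snippet below is only a reading aid: it copies the step, validation loss, and WER values from the results above into a list, and the variable names are made up for this example.

```python
# (step, validation_loss, wer) copied from the training results.
LOGGED_RESULTS = [
    (10000, 0.5430, 0.5107),
    (20000, 0.4524, 0.4453),
    (30000, 0.4002, 0.4217),
    (40000, 0.3863, 0.3971),
    (50000, 0.3912, 0.3798),
    (60000, 0.3817, 0.3717),
    (70000, 0.3872, 0.3639),
    (80000, 0.4062, 0.3592),
]

# Checkpoint with the lowest validation loss and with the lowest WER.
best_loss_step, best_loss, _ = min(LOGGED_RESULTS, key=lambda row: row[1])
best_wer_step, _, best_wer = min(LOGGED_RESULTS, key=lambda row: row[2])

# Relative WER reduction between the first and last logged checkpoints.
first_wer = LOGGED_RESULTS[0][2]
last_wer = LOGGED_RESULTS[-1][2]
relative_wer_reduction = (first_wer - last_wer) / first_wer

print(f"lowest validation loss: {best_loss} at step {best_loss_step}")
print(f"lowest WER: {best_wer} at step {best_wer_step}")
print(f"relative WER reduction: {relative_wer_reduction:.1%}")
```

In these logs the validation loss bottoms out earlier than the WER does, so the two criteria would select different checkpoints.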

Framework versions

  • Transformers 4.56.0.dev0
  • Pytorch 2.8.0+cu128
  • Datasets 2.20.0
  • Tokenizers 0.21.4

How to Cite

If you use this model, please cite:

@misc{sy2025speechlanguagemodelsunderrepresented,
      title={Speech Language Models for Under-Represented Languages: Insights from Wolof}, 
      author={Yaya Sy and Dioula Doucouré and Christophe Cerisara and Irina Illina},
      year={2025},
      eprint={2509.15362},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.15362}, 
}

Model size: 94.4M params (Safetensors, tensor type F32)