Fine-tuned Whisper Small Nonstandard Kenyan Swahili 🇰🇪

Fine-tuned version of openai/whisper-small optimized for non-standard Kenyan Swahili speech, including speakers with speech impairments across varying severity levels and etiologies.

Key Features

  • 🎯 Specialized for non-standard Kenyan Swahili accents and speech patterns
  • πŸ“Š Trained on non-standard Kenyan Swahili speech data from cdli/kenyan_swahili_nonstandard_speech_v0.9
  • ⚑ ~5.7% relative improvement over baseline on test set WER
  • πŸŽ™οΈ Best performance among experimental configurations
  • 🧠 Strong generalization from development to test set

Performance vs. Baseline

| Metric | Baseline | This Model (Run 3) |
| --- | --- | --- |
| Test WER | 31.4% | 29.6% |
| Test CER | 12.2% | 11.8% |
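
The table reports absolute error rates, while the ~5.7% figure under Key Features is a relative reduction. The arithmetic connecting the two can be sketched as follows; the function name is illustrative and not part of the model's tooling.

```python
def relative_reduction(baseline: float, model: float) -> float:
    """Error reduction expressed as a fraction of the baseline error."""
    return (baseline - model) / baseline


# Values copied from the Performance vs. Baseline table (percent).
wer_abs = 31.4 - 29.6                     # absolute WER reduction, in points
wer_rel = relative_reduction(31.4, 29.6)  # relative WER reduction
cer_rel = relative_reduction(12.2, 11.8)  # relative CER reduction

print(f"WER: {wer_abs:.1f} points absolute, {wer_rel:.1%} relative")
print(f"CER: {cer_rel:.1%} relative")
```

With these inputs the WER reduction is 1.8 points absolute, or about 5.7% relative, and the CER reduction is about 3.3% relative.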

Development vs. Test Set Evaluation

The model was evaluated on both the development (272 examples) and test (554 examples) splits of cdli/kenyan_swahili_nonstandard_speech_v0.9 to assess generalization to unseen data.

Overall Results

| Metric | Dev Set | Test Set | Change (Test - Dev) |
| --- | --- | --- | --- |
| Overall WER | 35.6% | 30.0% | -5.6 points |
| Overall CER | 15.1% | 11.9% | -3.2 points |

The model performs meaningfully better on the test set than the development set, suggesting it generalizes well to unseen speakers rather than overfitting to development examples. The gap may also partly reflect slightly less challenging audio conditions or impairment distributions in the test split.
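
For readers less familiar with the metrics, WER and CER are edit distances (substitutions, insertions, deletions) divided by the reference length, counted over words and characters respectively. The card does not state which scoring tool or text normalization produced the numbers above, so the following is only a minimal pure-Python sketch that pools errors over a whole split. The example sentence pair is hypothetical and not taken from the dataset.

```python
from typing import Callable, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing in hypothesis)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def error_rate(refs: Sequence[str], hyps: Sequence[str],
               tokenize: Callable[[str], Sequence]) -> float:
    """Total edits across all pairs divided by total reference tokens."""
    errors = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return errors / total


def wer(refs, hyps):
    return error_rate(refs, hyps, str.split)  # tokens are whitespace-separated words


def cer(refs, hyps):
    return error_rate(refs, hyps, list)       # tokens are characters, spaces included


# Hypothetical reference/hypothesis pair with one substituted word.
refs = ["habari ya asubuhi"]
hyps = ["habari za asubuhi"]
print(f"WER: {wer(refs, hyps):.1%}  CER: {cer(refs, hyps):.1%}")
```

Published scores usually also depend on normalization choices such as casing and punctuation handling, which this sketch leaves out.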

Results by Severity

| Severity | Dev WER | Test WER | Trend |
| --- | --- | --- | --- |
| Mild | 33% | 25% | ✅ Improved |
| Moderate | 45% | 32% | ✅ Improved |
| Severe | 28% | 35% | ❌ Degraded |

An interesting pattern emerges across the two splits. In the development set, "Severe" cases surprisingly outperformed "Mild" ones (28% vs. 33% WER), an anomaly likely driven by specific speaker outliers (e.g., speaker KES006). The test set follows the expected ordering, with error rate rising from Mild to Moderate to Severe (25% < 32% < 35%). The one concern is that performance on Severe cases worsened (28% → 35% WER), which suggests the test set contains more challenging severe-impairment examples that the model has not fully learned to handle.
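
The ordering argument can be checked mechanically. The sketch below uses only the WER values from the severity table and tests whether each split's error rate rises strictly with severity; the variable and function names are illustrative.

```python
# WER values (percent) copied from the Results by Severity table.
severity_wer = {
    "Mild": {"dev": 33, "test": 25},
    "Moderate": {"dev": 45, "test": 32},
    "Severe": {"dev": 28, "test": 35},
}
order = ["Mild", "Moderate", "Severe"]


def rises_with_severity(split: str) -> bool:
    """True if WER increases strictly from Mild to Moderate to Severe."""
    values = [severity_wer[level][split] for level in order]
    return all(a < b for a, b in zip(values, values[1:]))


for split in ("dev", "test"):
    print(f"{split}: WER rises with severity = {rises_with_severity(split)}")
```

The check fails on the development split, where Severe (28%) sits below Mild (33%), and passes on the test split.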

Results by Etiology

| Etiology | Dev WER | Test WER | Notes |
| --- | --- | --- | --- |
| Cerebral Palsy | 40% | 34% | ✅ Improved |
| Multiple Sclerosis | 9% | 36% | ❌ Major regression |
| Neurodevelopmental Disorder | 42% | 30% | ✅ Improved |
| Parkinson's Disease | 39% | 18% | ✅ Major improvement |

Parkinson's Disease shows the most striking gain (39% → 18% WER), which likely reflects differences between the dev and test speakers rather than a systematic model strength; the test speaker may present with milder or more consistent symptoms. Conversely, Multiple Sclerosis shows the sharpest regression (9% → 36% WER). Cerebral Palsy and Neurodevelopmental Disorder follow the overall positive trend, with WER reductions of 6 and 12 points respectively.
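
The per-etiology changes quoted above follow directly from the table. A small sketch that tabulates them, sorted from largest improvement to largest regression (the names are illustrative):

```python
# (dev WER, test WER) in percent, copied from the Results by Etiology table.
etiology_wer = {
    "Cerebral Palsy": (40, 34),
    "Multiple Sclerosis": (9, 36),
    "Neurodevelopmental Disorder": (42, 30),
    "Parkinson's Disease": (39, 18),
}

# Negative values mean the test WER is lower (better) than the dev WER.
changes = {name: test - dev for name, (dev, test) in etiology_wer.items()}

for name, change in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name:<28} {change:+d} points")
```

The spread runs from a 21-point drop for Parkinson's Disease to a 27-point rise for Multiple Sclerosis, which is far wider than the 5.6-point overall gap between the splits.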

Summary

On the test set, the model reaches a 30% overall WER. The severity anomaly seen in the development data does not recur on the test set, which gives a more reliable picture of how the model handles varying impairment levels. The key takeaway from the etiology analysis is that individual speaker characteristics remain the primary driver of performance variance, a known challenge in low-resource dysarthric speech recognition, where per-etiology speaker counts are small. Broader speaker diversity in future training data would likely be the most impactful path to reducing this variance.
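
One way to see how a few speakers can dominate a group score is to compare a pooled WER (all errors over all reference words) with the unweighted mean of per-speaker WERs. The sketch below uses entirely hypothetical speaker IDs and error counts, chosen only to illustrate the effect; it does not reproduce any figure from this card.

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL per-utterance results: (speaker_id, word_errors, reference_words).
utterances = [
    ("SPK_A", 1, 10), ("SPK_A", 0, 10), ("SPK_A", 2, 20),  # many clean utterances
    ("SPK_B", 6, 10),                                      # one hard utterance
    ("SPK_C", 3, 10),
]

# Accumulate [errors, reference words] per speaker.
totals = defaultdict(lambda: [0, 0])
for speaker, errors, words in utterances:
    totals[speaker][0] += errors
    totals[speaker][1] += words

per_speaker = {spk: errs / words for spk, (errs, words) in totals.items()}

# Pooled WER weights each speaker by how much speech they contributed.
pooled = sum(e for e, _ in totals.values()) / sum(w for _, w in totals.values())
# Macro WER gives every speaker equal weight.
macro = mean(per_speaker.values())

for spk, rate in per_speaker.items():
    print(f"{spk}: {rate:.1%}")
print(f"pooled: {pooled:.1%}  macro: {macro:.1%}")
```

In this toy data the pooled WER is 20.0% while the macro average is 32.5%, because the speaker with the most audio also has the fewest errors. Reporting both, alongside speaker counts per group, makes outlier-driven results easier to spot.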

Usage

from transformers import pipeline

# Load the fine-tuned checkpoint into a speech-recognition pipeline
transcriber = pipeline(
    "automatic-speech-recognition",
    model="smainye/whisper-small-kenyan-swahili-nonstandard",
)

# Transcribe Kenyan Swahili audio
result = transcriber("path/to/your/audio.wav")

# Get the transcription text
print("Transcription:", result["text"])
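
For recordings longer than Whisper's 30-second window, or when running on a GPU, the pipeline can be configured a little further. The following is a hedged sketch rather than a recommended setup: the helper names are hypothetical, the `chunk_length_s` value and the `generate_kwargs` language hint are assumptions that were not tested against this checkpoint, and behaviour may vary across transformers versions.

```python
MODEL_ID = "smainye/whisper-small-kenyan-swahili-nonstandard"


def pick_device() -> int:
    """Return 0 for the first CUDA GPU if PyTorch can see one, otherwise -1 (CPU)."""
    try:
        import torch
        return 0 if torch.cuda.is_available() else -1
    except ImportError:
        return -1


def build_transcriber(chunk_length_s: float = 30.0):
    """Create the ASR pipeline; audio is processed in chunks of chunk_length_s seconds."""
    from transformers import pipeline  # imported lazily, only when a pipeline is built
    return pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=chunk_length_s,
        device=pick_device(),
    )


def transcribe(transcriber, path: str) -> str:
    """Transcribe one audio file, asking Whisper to decode as Swahili (assumed setting)."""
    result = transcriber(
        path,
        generate_kwargs={"language": "swahili", "task": "transcribe"},
    )
    return result["text"]
```

A call would then look like `transcribe(build_transcriber(), "path/to/your/audio.wav")`. If the checkpoint's generation config already fixes the language, the `generate_kwargs` argument can be dropped.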

Contact Information

Linktree
