# Whisper Small — Kikuyu (Gikuyu) V4
Fine-tuned openai/whisper-small for Kikuyu (Gikuyu) automatic speech recognition.
## Model Details
- Base model: openai/whisper-small (244M params)
- Fine-tuning: LoRA r=32, alpha=64, targets: q/k/v/out_proj/fc1/fc2
- Dataset: google/WaxalNLP kik_tts (~1,226 train samples, ~6.5 hours)
- Diacritics: Preserved (canonical ĩ, ũ)
- Training: 1,500 steps, cosine LR schedule, SpecAugment, label smoothing 0.1
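The adapter configuration implied by the hyperparameters above can be sketched with the PEFT library. This is a reconstruction, not the published training script; the dropout value is an assumption and is not stated on this card.

```python
from peft import LoraConfig

# Hypothetical reconstruction of the adapter config from the listed
# hyperparameters: rank 32, alpha 64, attention + feed-forward targets.
lora_cfg = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj", "fc1", "fc2"],
    lora_dropout=0.05,  # assumed; not stated on the card
)
```

Applying this config to `openai/whisper-small` with `peft.get_peft_model` trains only the low-rank adapters, a small fraction of the 244M base parameters.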
## Results
| Split | WER | CER |
|---|---|---|
| Eval | 78.2% | 40.4% |
| Test | 83.6% | 47.8% |
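For reference, WER and CER are word- and character-level Levenshtein edit distance divided by reference length. The numbers above were presumably produced with a metrics library such as `jiwer`; this pure-Python sketch just shows the definition.

```python
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance over two sequences.
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[-1]

def wer(reference: str, hypothesis: str) -> float:
    # Word error rate: edits over word tokens / reference word count.
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)

def cer(reference: str, hypothesis: str) -> float:
    # Character error rate: edits over characters / reference length.
    return edit_distance(reference, hypothesis) / len(reference)
```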
## Usage
```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("Kiragu/whisper-small-kikuyu-v4")
model = WhisperForConditionalGeneration.from_pretrained("Kiragu/whisper-small-kikuyu-v4")

# audio_array: mono float waveform sampled at 16 kHz
input_features = processor(audio_array, sampling_rate=16000, return_tensors="pt").input_features
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```
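Whisper expects 16 kHz mono float audio, so recordings at other rates or with multiple channels need conversion first. A minimal pure-NumPy sketch (linear interpolation stands in for a proper resampler such as the ones in `librosa` or `torchaudio`):

```python
import numpy as np

def to_whisper_audio(pcm: np.ndarray, sr: int, target_sr: int = 16_000) -> np.ndarray:
    audio = pcm.astype(np.float32)
    if audio.ndim == 2:                       # (samples, channels) -> mono
        audio = audio.mean(axis=1)
    if np.issubdtype(pcm.dtype, np.integer):  # integer PCM -> approx [-1, 1]
        audio /= np.iinfo(pcm.dtype).max
    if sr != target_sr:                       # crude resample via interpolation
        n_out = int(round(len(audio) * target_sr / sr))
        x_old = np.linspace(0.0, 1.0, num=len(audio), endpoint=False)
        x_new = np.linspace(0.0, 1.0, num=n_out, endpoint=False)
        audio = np.interp(x_new, x_old, audio).astype(np.float32)
    return audio
```

The result can be passed directly as `audio_array` in the snippet above.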
## Limitations
- WER is still high (78-84%) — experimental model
- Only ~6.5 hours of training data
- Short phrases work better than long sentences
- May hallucinate/repeat on longer audio
## Part of Sauti AI
This model is part of Sauti AI, a decentralized vocal data collection platform for underrepresented languages.