Whisper Small — Kikuyu (Gikuyu) V4

Fine-tuned openai/whisper-small for Kikuyu (Gikuyu) automatic speech recognition.

Model Details

  • Base model: openai/whisper-small (244M params)
  • Fine-tuning: LoRA r=32, alpha=64, targets: q/k/v/out_proj/fc1/fc2
  • Dataset: google/WaxalNLP kik_tts (~1,226 train samples, ~6.5 hours)
  • Diacritics: Preserved (canonical ĩ, ũ)
  • Training: 1,500 steps, cosine LR schedule, SpecAugment, label smoothing 0.1
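The cosine LR schedule mentioned above can be sketched as follows. Note that the peak learning rate and warmup length are illustrative assumptions; the card only states 1,500 total steps and a cosine schedule:

```python
import math

def cosine_lr(step, total_steps=1500, peak_lr=1e-3, warmup_steps=50):
    """Linear warmup followed by cosine decay to zero.

    peak_lr and warmup_steps are hypothetical values, not taken from this card.
    """
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * peak_lr * (1 + math.cos(math.pi * progress))
```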

Results

Split   WER     CER
Eval    78.2%   40.4%
Test    83.6%   47.8%
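For reference, WER and CER are both edit-distance rates: Levenshtein distance over words (WER) or characters (CER), divided by reference length. A minimal sketch (standard dynamic-programming Levenshtein, not necessarily the exact scoring script used for these numbers):

```python
def edit_distance(ref, hyp):
    # classic DP Levenshtein: substitutions, insertions, deletions all cost 1
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,      # deletion
                           dp[i][j - 1] + 1,      # insertion
                           dp[i - 1][j - 1] + cost)  # substitution/match
    return dp[-1][-1]

def wer(ref, hyp):
    # word error rate: word-level edits / reference word count
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

def cer(ref, hyp):
    # character error rate: character-level edits / reference length
    return edit_distance(list(ref), list(hyp)) / len(ref)
```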

Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("Kiragu/whisper-small-kikuyu-v4")
model = WhisperForConditionalGeneration.from_pretrained("Kiragu/whisper-small-kikuyu-v4")

# audio_array: 1-D mono waveform resampled to 16 kHz
input_features = processor(audio_array, sampling_rate=16000, return_tensors="pt").input_features
predicted_ids = model.generate(input_features)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]

Limitations

  • WER remains high (78-84%); treat this as an experimental model
  • Only ~6.5 hours of training data
  • Short phrases work better than long sentences
  • May hallucinate/repeat on longer audio
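A common mitigation for hallucination and repetition on long inputs is chunked inference: transcribe overlapping windows and merge the results. A minimal sketch of the windowing step, where the chunk and overlap lengths are illustrative defaults rather than values from this card:

```python
def chunk_audio(samples, sr=16000, chunk_s=30.0, overlap_s=2.0):
    """Split a waveform into overlapping windows for piecewise transcription.

    chunk_s/overlap_s are illustrative, not values stated on this card.
    """
    size = int(chunk_s * sr)           # samples per window
    step = size - int(overlap_s * sr)  # hop between window starts
    chunks = [samples[i:i + size] for i in range(0, len(samples), step)]
    # drop a trailing fragment already fully covered by the previous window
    kept = [c for c in chunks if len(c) > int(overlap_s * sr)]
    return kept or chunks[:1]
```

The transformers `automatic-speech-recognition` pipeline offers similar built-in behavior via its `chunk_length_s` parameter.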

Part of Sauti AI

This model is part of Sauti AI, a decentralized vocal data collection platform for underrepresented languages.
