Whisper Twi v2

Whisper Twi v2 is a fine-tuned version of openai/whisper-small for automatic speech recognition (ASR) in Akan (Twi). Compared with the first release, it improves transcription accuracy on Twi speech by refining the training process and optimizing preprocessing and decoding strategies for a low-resource language setting.

The goal of this model is to help expand speech technology support for African languages, particularly Akan, which is widely spoken in Ghana but currently underrepresented in modern ASR systems.


Model Details

Model Description

Whisper Twi v2 adapts the Whisper transformer-based speech recognition architecture to the Twi language. By fine-tuning the multilingual Whisper model on Twi speech data, the model learns to better represent Twi phonetics and vocabulary patterns.

  • Developed by: Tiffany Degbotse
  • Funded by: Academic research / independent project
  • Shared by: Tiffany Degbotse
  • Model type: Automatic Speech Recognition (Speech-to-Text)
  • Language(s): Twi (Akan)
  • License: Apache-2.0
  • Finetuned from model: openai/whisper-small

Uses

Direct Use

This model can be used for:

  • Transcribing spoken Akan audio into text
  • Building voice interfaces for Twi speakers
  • Speech accessibility tools
  • Linguistic analysis of Akan speech data

Downstream Use

Possible downstream applications include:

  • Fine-tuning on additional Twi datasets
  • Multilingual speech recognition systems
  • Voice assistants supporting Ghanaian languages
  • Speech-enabled applications for education and accessibility

Out-of-Scope Use

The model may perform poorly when:

  • Audio recordings are extremely noisy
  • Speakers frequently code-switch between multiple languages
  • Speech contains dialects not represented in the training data

Bias, Risks, and Limitations

Since the model is trained on a limited dataset, performance may vary across accents, speaking styles, and recording environments. Some dialects of Twi may be underrepresented, which can lead to reduced transcription accuracy for certain speakers.

Recommendations

Users should evaluate the model on their own datasets before deploying it in production environments and remain aware of potential performance differences across speakers and audio conditions.
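A common way to carry out such an evaluation is word error rate (WER) over a held-out set of reference transcripts. Below is a minimal pure-Python sketch of WER via word-level edit distance; libraries such as jiwer or Hugging Face's evaluate provide tested implementations for production use.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, `wer("me din de kofi", "me din kofi")` returns 0.25 (one deleted word out of four reference words). Averaging WER across a representative sample of speakers and recording conditions gives a practical picture of where the model holds up and where it degrades.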


How to Get Started with the Model

```python
from transformers import pipeline

# Load the fine-tuned Twi ASR model from the Hugging Face Hub
pipe = pipeline(
    "automatic-speech-recognition",
    model="tiffany101/whisper-twi-v2"
)

# Transcribe an audio file and print the recognized text
result = pipe("audio.wav")
print(result["text"])
```
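The pipeline decodes and resamples audio files internally, but audio sometimes arrives as a raw array at another sample rate (e.g. from a microphone stream). Whisper models expect 16 kHz mono float32 input. The sketch below shows a simple numpy-based conversion using linear-interpolation resampling; it is illustrative only, and dedicated resamplers such as librosa or torchaudio give higher quality.

```python
import numpy as np

def to_whisper_input(audio: np.ndarray, sr: int, target_sr: int = 16_000) -> np.ndarray:
    """Convert raw audio to the 16 kHz mono float32 format Whisper expects."""
    if audio.ndim == 2:              # (samples, channels): stereo or multi-channel
        audio = audio.mean(axis=1)   # down-mix to mono by averaging channels
    if sr != target_sr:
        # Linear-interpolation resample onto a 16 kHz time grid
        duration = audio.shape[0] / sr
        n_out = int(round(duration * target_sr))
        t_in = np.linspace(0.0, duration, num=audio.shape[0], endpoint=False)
        t_out = np.linspace(0.0, duration, num=n_out, endpoint=False)
        audio = np.interp(t_out, t_in, audio)
    return audio.astype(np.float32)
```

The converted array can then be passed to the pipeline as a dict: `pipe({"raw": samples, "sampling_rate": 16000})`.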
Model size: 37.8M parameters · Tensor type: F32 (Safetensors)