---
language:
- id
base_model:
- openai/whisper-base
pipeline_tag: automatic-speech-recognition
datasets:
- mozilla-foundation/common_voice_23_0
---
# Whisper Base Model – Indonesian ASR
## Model Description
This model is a fine-tuned version of **openai/whisper-base** for **Automatic Speech Recognition (ASR)** in **Indonesian (id)**.
It transcribes Indonesian speech to text across a range of audio conditions. This checkpoint is the base size; accuracy and resource usage differ for other Whisper sizes.
## Intended Use
- Indonesian speech-to-text transcription
- Research and experimentation
- Educational and academic purposes
- Application development and benchmarking
Model variants (tiny, base, small, medium, large) differ in accuracy, speed, and hardware requirements. Users should select the size that best matches their constraints and objectives.
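
As an illustration of the speech-to-text use case, the following is a minimal sketch built on the Hugging Face `transformers` ASR pipeline. `MODEL_ID` is a placeholder, not the confirmed repository id, and the helper names `build_generate_kwargs` and `transcribe` are hypothetical; the sketch also assumes `transformers`, a PyTorch backend, and `ffmpeg` are installed.

```python
# Minimal sketch: transcribe an Indonesian audio file with this checkpoint.
# MODEL_ID is a placeholder (assumption); replace it with the actual Hub repo id.
MODEL_ID = "your-username/whisper-base-id"


def build_generate_kwargs():
    """Decoding options that pin Whisper to Indonesian transcription."""
    return {"language": "indonesian", "task": "transcribe"}


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file as a string."""
    # Imported lazily so build_generate_kwargs() works without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        chunk_length_s=30,  # Whisper consumes audio in 30-second windows
    )
    result = asr(audio_path, generate_kwargs=build_generate_kwargs())
    return result["text"].strip()
```

Calling `transcribe("sample.wav")` (a hypothetical file name) would return the recognized text. Setting the language explicitly avoids relying on Whisper's automatic language detection.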
## Limitations
- Transcription quality depends on audio clarity, speaker accent, and background noise
- Smaller variants may produce higher error rates on long or complex audio
- Larger variants require significantly more compute and memory
- Outputs should be reviewed before use in critical or high-risk applications
## Training Data
This model was fine-tuned using **Mozilla Common Voice v23.0 (Indonesian)**.
Common Voice is a publicly available, community-driven speech dataset released by Mozilla under a permissive license.
Dataset characteristics such as speaker diversity, recording quality, and utterance length may influence model behavior.
## Evaluation
The model is typically evaluated using **Word Error Rate (WER)**.
Evaluation results may vary depending on dataset, domain, audio conditions, and model size.
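
WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words: `WER = (S + D + I) / N`, where `S`, `D`, and `I` count substitutions, deletions, and insertions. The sketch below computes it in plain Python. It splits on whitespace and applies no text normalization (casing, punctuation), which is an assumption; the evaluation setup used for this model is not stated in this card.

```python
def word_error_rate(reference, hypothesis):
    """Compute WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words.
print(word_error_rate("saya pergi ke pasar", "saya pergi pasar"))  # 0.25
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.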
## Training Results
| Step | Training Loss |
|------|---------------|
| 100 | 0.880500 |
| 200 | 0.472300 |
| 300 | 0.408100 |
| 400 | 0.328500 |
| 500 | 0.226000 |
| 600 | 0.237500 |
| 700 | 0.148600 |
| 800 | 0.111600 |
| 900 | 0.104900 |
| 1000 | 0.073900 |
| 1100 | 0.063100 |
| 1200 | 0.050300 |
| 1400 | 0.039800 |
| 1500 | 0.031000 |
| 1550 | 0.031400 |
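
To read the trend in the table above, the following sketch summarizes the logged values: the overall relative drop in training loss and the steps at which the loss rose compared with the previous logged step. The values are copied from the table; the function names are hypothetical, and the summary describes training loss only.

```python
# (step, training loss) pairs copied from the table above.
LOSSES = [
    (100, 0.8805), (200, 0.4723), (300, 0.4081), (400, 0.3285),
    (500, 0.2260), (600, 0.2375), (700, 0.1486), (800, 0.1116),
    (900, 0.1049), (1000, 0.0739), (1100, 0.0631), (1200, 0.0503),
    (1400, 0.0398), (1500, 0.0310), (1550, 0.0314),
]


def relative_reduction(losses):
    """Fraction by which the loss fell between the first and last logged step."""
    first, last = losses[0][1], losses[-1][1]
    return (first - last) / first


def steps_with_loss_increase(losses):
    """Steps whose loss is higher than at the previous logged step."""
    return [
        step
        for (_, prev_loss), (step, loss) in zip(losses, losses[1:])
        if loss > prev_loss
    ]


print(f"{relative_reduction(LOSSES):.1%}")  # 96.4%
print(steps_with_loss_increase(LOSSES))     # [600, 1550]
```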