language:
  - id
base_model:
  - openai/whisper-tiny
pipeline_tag: automatic-speech-recognition
datasets:
  - mozilla-foundation/common_voice_23_0

Whisper Tiny Model – Indonesian ASR

Model Description

This model is openai/whisper-tiny fine-tuned for automatic speech recognition (ASR) in Indonesian (id).
It transcribes Indonesian speech to text. As the smallest Whisper variant, it favors speed and low resource usage over accuracy.

Intended Use

Indonesian speech-to-text transcription
Research and experimentation
Educational and academic purposes
Application development and benchmarking

Whisper comes in several sizes (tiny, base, small, medium, large) that trade accuracy against speed and hardware requirements; this checkpoint is the tiny variant. Choose the size that best matches your constraints and objectives.
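For the transcription use case listed above, a minimal sketch with the Hugging Face transformers pipeline might look like the following. The repository id in MODEL_ID is a hypothetical placeholder, since this card does not state the checkpoint's Hub id, and the transcribe helper is an illustrative name. The sketch assumes transformers, PyTorch, and an audio decoding backend such as ffmpeg are installed.

```python
# Hypothetical repository id: replace with the actual Hub id of this checkpoint.
MODEL_ID = "your-username/whisper-tiny-id"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file to Indonesian text (illustrative helper)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Whisper is multilingual: pin the language and task so it neither
    # auto-detects another language nor translates into English.
    result = asr(
        audio_path,
        generate_kwargs={"language": "indonesian", "task": "transcribe"},
    )
    return result["text"]
```

Calling transcribe("sample.wav") on a local recording returns the transcript as a plain string. Building the pipeline once and reusing it would be preferable when transcribing many files.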

Limitations

Transcription quality depends on audio clarity, speaker accent, and background noise
Smaller variants may produce higher error rates on long or complex audio
Larger variants require significantly more compute and memory
Outputs should be reviewed before use in critical or high-risk applications

Training Data

This model was fine-tuned using Mozilla Common Voice v23.0 (Indonesian).
Common Voice is a publicly available, community-driven speech dataset that Mozilla releases under the CC0 public-domain dedication.
Dataset characteristics such as speaker diversity, recording quality, and utterance length may influence model behavior.

Evaluation

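The model is typically evaluated using Word Error Rate (WER): the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of words in the reference. Lower is better.
Evaluation results may vary depending on dataset, domain, audio conditions, and model size.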
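To make the WER metric concrete, here is a small dependency-free sketch that computes it with a word-level edit distance. The function name word_error_rate and the normalization (lowercasing and whitespace splitting only) are assumptions for illustration. Published Whisper evaluations usually apply a more thorough text normalizer, so numbers from this sketch are not directly comparable with them.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words.
print(word_error_rate("saya pergi ke pasar", "saya pergi pasar"))  # 0.25
```

Because insertions count as errors, WER can exceed 1.0 when the output is much longer than the reference.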

Training Results

Step	Training Loss
100	1.282900
200	0.682300
300	0.568900
400	0.487500
500	0.372700
600	0.375500
700	0.276200
800	0.226000
900	0.223800
1000	0.188600
1100	0.164300
1200	0.151400
1300	0.130000
1400	0.133900
1500	0.119700
1550	0.117300
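As a quick way to read the log above, the following sketch parses the same step/loss pairs, lists the logging steps where the training loss rose relative to the previous entry, and computes the overall relative reduction. The variable names are illustrative. Training loss alone says nothing about transcription accuracy on held-out audio, so this is a sanity check and not an evaluation.

```python
# Step / training-loss pairs copied from the table above.
LOSS_LOG = """\
100 1.282900
200 0.682300
300 0.568900
400 0.487500
500 0.372700
600 0.375500
700 0.276200
800 0.226000
900 0.223800
1000 0.188600
1100 0.164300
1200 0.151400
1300 0.130000
1400 0.133900
1500 0.119700
1550 0.117300
"""

points = []
for line in LOSS_LOG.splitlines():
    step, loss = line.split()
    points.append((int(step), float(loss)))

# Steps whose logged loss is higher than the previous logged loss.
increases = [
    step
    for (_, prev_loss), (step, loss) in zip(points, points[1:])
    if loss > prev_loss
]

# Fraction by which the loss fell between the first and last entries.
reduction = 1 - points[-1][1] / points[0][1]

print(increases)           # [600, 1400]
print(f"{reduction:.1%}")  # 90.9%
```

The two small upticks are much smaller than the surrounding downward trend.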