Model Description
This model was fine-tuned on over 98 hours of transcribed Upper Sorbian speech, including colloquial speech. Part of the corpus was augmented to create an additional 89 hours of training data.
Training Data
- Sources:
- Załožba za serbski lud (https://zalozba.de/)
- Korla Baier (https://huggingface.co/Korla/tts-modele)
- Volume: 5,900 minutes of original data, combined with an additional 5,388 minutes of augmented data; 5% was held out as a validation set (see the augmentation sketch after this list)
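The card does not state which augmentation method was used. Speed perturbation is a common choice for ASR corpora and is sketched below with torchaudio; this is an illustrative example, not the authors' pipeline, and the file path is a placeholder.

```python
import torch
import torchaudio

def speed_perturb(waveform: torch.Tensor, sample_rate: int, factor: float) -> torch.Tensor:
    """Speed perturbation via sox effects, a common ASR augmentation."""
    # "speed" changes tempo and pitch together; "rate" resamples the
    # result back to the original sample rate.
    perturbed, _ = torchaudio.sox_effects.apply_effects_tensor(
        waveform, sample_rate, [["speed", str(factor)], ["rate", str(sample_rate)]]
    )
    return perturbed

waveform, sr = torchaudio.load("recording.wav")  # placeholder path
augmented = speed_perturb(waveform, sr, factor=0.9)
```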
Training Details
- Hyperparameters:
- Batch size: 8
- Gradient accumulation steps: 4
- Learning rate: 5e-6, linear decay
- Warmup: 2000 steps
- Additional techniques: BF16 training; initial 15 layers frozen (see the configuration sketch after this list)
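A minimal sketch of how these hyperparameters could map onto the standard transformers fine-tuning setup. The authors' training script is not published with the card, so the argument names and the choice of encoder layers to freeze are assumptions.

```python
from transformers import Seq2SeqTrainingArguments, WhisperForConditionalGeneration

args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-turbo-hsb-aug",
    per_device_train_batch_size=8,   # batch size 8
    gradient_accumulation_steps=4,
    learning_rate=5e-6,
    lr_scheduler_type="linear",      # linear decay
    warmup_steps=2000,
    bf16=True,                       # BF16 training
)

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3-turbo")

# Freeze the initial 15 layers. The card does not say whether these are
# encoder or decoder layers; the encoder is assumed here.
for layer in model.model.encoder.layers[:15]:
    for param in layer.parameters():
        param.requires_grad = False
```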
Performance
Metrics
- Model checkpoint: 8000
- Word Error Rate (WER): 4.5%
For a later checkpoint with a better WER but worse robustness to noise, see this branch: https://huggingface.co/zalozbadev/whisper-large-v3-turbo-hsb-aug/tree/longest_trained
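For reference, WER on your own held-out data can be computed with the Hugging Face evaluate library. The snippet below is generic; the evaluation split used for the figure above is not part of the card.

```python
import evaluate

wer_metric = evaluate.load("wer")

# predictions: model transcriptions; references: ground-truth transcripts
predictions = ["transcribed text"]
references = ["reference text"]
wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {100 * wer:.1f}%")
```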
Usage
- Specify the transcription language as "czech"; the model was fine-tuned with Whisper's Czech language token, since Whisper has no Upper Sorbian token (see the sketch below)
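A minimal transcription sketch using the transformers pipeline, assuming a local file audio.wav (the path is a placeholder); note the explicit language setting:

```python
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="zalozbadev/whisper-large-v3-turbo-hsb-aug",
)

# The model reuses Whisper's Czech language token, so "czech" must be
# requested even though the output is Upper Sorbian.
result = asr("audio.wav", generate_kwargs={"language": "czech"})
print(result["text"])
```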
Model Details
- Model Name: zalozbadev/whisper-large-v3-turbo-hsb-aug
- Publisher: Załožba za serbski lud
- Model Version: 1.0.0
- Model Date: 2026-01-16
- License: CC-BY-4.0
- Architecture: Whisper Large v3 Turbo
- Task: Automatic Speech Recognition
Model tree for zalozbadev/whisper-large-v3-turbo-hsb-aug
- Base model: openai/whisper-large-v3-turbo (itself fine-tuned from openai/whisper-large-v3)