Model Description

This model is a fine-tuned version of Whisper Large v3 Turbo for automatic speech recognition of Upper Sorbian.

Training Data

The model was fine-tuned on over 98 hours of transcribed Upper Sorbian speech, including colloquial speech. Part of the corpus was augmented, yielding an additional 89 hours of training data.
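
The card does not state which augmentation techniques were used. Purely as an illustration of audio augmentation, the sketch below applies speed perturbation with torchaudio; the file path and rate factors are placeholders.

```python
import torchaudio
from torchaudio.transforms import SpeedPerturbation

# Load one utterance; "recording.wav" is a placeholder path.
waveform, sample_rate = torchaudio.load("recording.wav")

# Resample at a randomly chosen rate factor, yielding an additional
# training utterance from the same recording.
perturb = SpeedPerturbation(sample_rate, factors=[0.9, 1.1])
augmented_waveform, _ = perturb(waveform)
```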

Training Details

  • Hyperparameters:
    • Batch size: 8
    • Gradient accumulation steps: 4
    • Learning rate: 5e-6, linear decay
    • Warmup: 2000 steps
  • Additional techniques: BF16 training, first 15 layers frozen (see the sketch below)
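
The training script itself is not published. Purely as a sketch, this is how the listed hyperparameters map onto the transformers library; freezing is applied to the first 15 encoder layers here, which is an assumption, since the card does not say which layers were frozen.

```python
from transformers import Seq2SeqTrainingArguments, WhisperForConditionalGeneration

# Start from the base checkpoint the fine-tune is derived from.
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3-turbo")

# Freeze the first 15 layers. Interpreted as encoder layers (an assumption;
# the card does not specify encoder vs. decoder).
for layer in model.model.encoder.layers[:15]:
    for param in layer.parameters():
        param.requires_grad = False

# Training arguments matching the hyperparameters listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-turbo-hsb-aug",  # illustrative path
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,  # effective batch size of 32
    learning_rate=5e-6,
    lr_scheduler_type="linear",     # linear decay after warmup
    warmup_steps=2000,
    bf16=True,                      # BF16 mixed-precision training
)
```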

Performance

Metrics

  • Model checkpoint: 8000
  • Word Error Rate (WER): 4.5%

For a later checkpoint with a better WER but worse robustness to noise, see this branch: https://huggingface.co/zalozbadev/whisper-large-v3-turbo-hsb-aug/tree/longest_trained
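
The reported WER can be recomputed with, e.g., the jiwer package; a minimal sketch with placeholder transcripts, assuming the metric above is a percentage:

```python
import jiwer

# Placeholder transcripts; in practice, references come from the test set
# and hypotheses from the model's output.
references = ["this is a reference transcript"]
hypotheses = ["this is the reference transcript"]

# WER = (substitutions + deletions + insertions) / words in the reference,
# reported as a percentage here.
print(f"WER: {100 * jiwer.wer(references, hypotheses):.1f}%")
```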

Usage

  • Set the transcription language to "czech"; the model was fine-tuned with this language setting (see the sketch below).
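
A minimal transcription sketch using the transformers pipeline; the audio path is a placeholder, and device/dtype should be adapted to your hardware.

```python
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="zalozbadev/whisper-large-v3-turbo-hsb-aug",
    torch_dtype=torch.float16,
    device="cuda:0",  # use device="cpu" if no GPU is available
)

# "audio.wav" is a placeholder path to an Upper Sorbian recording.
# The language must be "czech", matching how the model was fine-tuned.
result = asr("audio.wav", generate_kwargs={"language": "czech"})
print(result["text"])
```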

Model Details

  • Model Name: zalozbadev/whisper-large-v3-turbo-hsb-aug
  • Publisher: Załožba za serbski lud
  • Model Version: 1.0.0
  • Model Date: 2026-01-16
  • License: CC-BY-4.0
  • Architecture: Whisper Large v3 Turbo (0.8B parameters, F32 safetensors)
  • Task: Automatic Speech Recognition