Parakeet Medical DE
German medical ASR model fine-tuned from nvidia/parakeet-tdt-0.6b-v3 on Mediform/medical_asr_de.
This is the best validation checkpoint (epoch 14, val_wer=11.55%). For the last epoch, see Mediform/parakeet-medical-de-e20.
Performance
| Model | Test WER | Val WER |
|---|---|---|
| Baseline (parakeet-tdt-0.6b-v3) | 26.17% | - |
| This model (epoch 14, best val) | 11.31% | 11.55% |
| Last epoch (epoch 20) | 11.24% | 11.59% |
Fine-tuning reduces test WER by 14.86 percentage points absolute (26.17% to 11.31%, roughly 57% relative) over the non-fine-tuned multilingual baseline on German medical speech.
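As a rough illustration of how the numbers above relate, the sketch below computes WER as word-level edit distance divided by reference length, then derives the absolute and relative gains from the table. It assumes plain whitespace tokenization with no text normalization, which may differ from the scoring actually used for this model; the function name and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one misrecognized drug name in a six-word reference.
example_wer = word_error_rate(
    "der patient erhält ibuprofen 400 mg",
    "der patient erhält iboprofen 400 mg",
)

# Test WER values taken from the performance table (in percent).
baseline_wer = 26.17
finetuned_wer = 11.31
absolute_gain = baseline_wer - finetuned_wer          # percentage points
relative_gain = absolute_gain / baseline_wer * 100.0  # percent of baseline

print(f"example WER: {example_wer:.4f}")
print(f"absolute gain: {absolute_gain:.2f} points, relative gain: {relative_gain:.1f}%")
```

The absolute figure is a difference in percentage points, while the relative figure expresses the same gain as a share of the baseline error.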
Training Details
- Base model: nvidia/parakeet-tdt-0.6b-v3 (627M params, EncDecRNNTBPEModel)
- Dataset: Mediform/medical_asr_de (14,388 train / 799 val / 799 test samples, 117h)
- Hardware: 4x NVIDIA A40 (48GB)
- Strategy: DDP, bf16-mixed precision
- Batch size: 2/GPU x 4 GPUs x 8 accumulation = 64 effective
- Optimizer: AdamW (lr=5e-5, cosine annealing, 200 warmup steps)
- Epochs: 20 (best at epoch 14)
- Spec augment: freq_masks=2, time_masks=10
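The batch-size and schedule settings above can be sanity-checked with a short sketch. It assumes one optimizer step per 64 samples, a partial final batch that still counts as a step, linear warmup, and cosine decay to a minimum learning rate of zero; none of these details are stated in the list, so treat the step counts and the curve shape as approximations rather than the exact training configuration.

```python
import math

# Values from the training details above.
PER_GPU_BATCH = 2
NUM_GPUS = 4
GRAD_ACCUM = 8
TRAIN_SAMPLES = 14_388
EPOCHS = 20
PEAK_LR = 5e-5
WARMUP_STEPS = 200

effective_batch = PER_GPU_BATCH * NUM_GPUS * GRAD_ACCUM

# Assumption: the last partial batch of each epoch is kept, hence ceil.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int, min_lr: float = 0.0) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay.

    The warmup shape and min_lr are assumptions for illustration.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (PEAK_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


print(f"effective batch: {effective_batch}")
print(f"steps per epoch: {steps_per_epoch}, total steps: {total_steps}")
for step in (0, WARMUP_STEPS - 1, total_steps // 2, total_steps):
    print(f"step {step}: lr = {lr_at(step):.2e}")
```

Under these assumptions the 200 warmup steps finish before the end of the first epoch, so nearly all of training runs on the cosine decay portion.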
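Training Curve
Validation WER at selected epochs. Epoch numbers in this table are zero-indexed, so epoch 19 is the last of the 20 epochs.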
| Epoch | Val WER |
|---|---|
| 0 | 16.56% |
| 1 | 13.21% |
| 2 | 13.01% |
| 3 | 12.80% |
| 4 | 11.95% |
| 8 | 11.82% |
| 14 | 11.55% |
| 19 | 11.59% |
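Choosing the "best validation checkpoint" amounts to taking the epoch with the lowest validation WER. The minimal sketch below does this over the epochs listed in the table; epochs not shown in the table are unknown here, so the result only holds for the listed rows.

```python
# Validation WER (percent) per epoch, copied from the table above.
val_wer_by_epoch = {
    0: 16.56,
    1: 13.21,
    2: 13.01,
    3: 12.80,
    4: 11.95,
    8: 11.82,
    14: 11.55,
    19: 11.59,
}

# The best checkpoint is the epoch with the lowest validation WER.
best_epoch = min(val_wer_by_epoch, key=val_wer_by_epoch.get)
best_wer = val_wer_by_epoch[best_epoch]

# Gap between the final listed epoch and the best one, in percentage points.
last_epoch = max(val_wer_by_epoch)
gap_to_last = val_wer_by_epoch[last_epoch] - best_wer

print(f"best epoch: {best_epoch} ({best_wer:.2f}%)")
print(f"last epoch {last_epoch} is {gap_to_last:.2f} points behind on validation")
```

The gap between the best and last listed epochs is small, which is consistent with the performance table, where the last-epoch checkpoint scores slightly better on the test set despite a slightly worse validation WER.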
Usage with MLX (Apple Silicon)
Weights are in SafeTensors format, compatible with mlx-audio-swift for on-device inference.
Medical Domain Coverage
The training dataset covers 529 unique German medical terms including drug names, conditions, procedures, symptoms, and anatomy. Sources: VoxPopuli (EU Parliament medical debates), MultiMed (patient-doctor dialogues), CommonVoice, Spoken Wikipedia, M-AILABS, TuDa, Voxforge.