Parakeet Medical DE

German medical ASR model fine-tuned from nvidia/parakeet-tdt-0.6b-v3 on Mediform/medical_asr_de.

This is the best validation checkpoint (epoch 14, val_wer=11.55%). For the last epoch, see Mediform/parakeet-medical-de-e20.

Performance

Model                              Test WER   Val WER
Baseline (parakeet-tdt-0.6b-v3)    26.17%     -
This model (epoch 14, best val)    11.31%     11.55%
Last epoch (epoch 20)              11.24%     11.59%
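
For readers unfamiliar with the metric, WER is the word-level edit distance between reference and hypothesis divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the evaluation script used for this model; the function name is hypothetical, and any text normalization (casing, punctuation, number formatting) applied before scoring is not stated in this card, so none is applied here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentence pair (made up for this example, not from the dataset):
# one of six reference words is dropped by the hypothesis.
wer = word_error_rate(
    "der patient erhält ibuprofen 400 mg",
    "der patient erhält ibuprofen 400",
)
print(f"{wer:.3f}")
```

Corpus-level WER is usually computed by summing edit operations and reference words over all utterances before dividing, rather than by averaging per-utterance scores; which convention was used for the numbers above is an assumption either way.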

This is an absolute test WER reduction of 14.86 percentage points (26.17% to 11.31%) over the base model on German medical speech.
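
The same gap can be expressed in absolute points or as a relative reduction. The short calculation below uses only the test WER values from the table; the variable names are illustrative.

```python
baseline_wer = 26.17  # test WER of the base model, in percent
model_wer = 11.31     # test WER of this checkpoint, in percent

# Absolute improvement: difference in percentage points.
absolute_gain = baseline_wer - model_wer

# Relative improvement: share of the baseline's errors that were removed.
relative_gain = absolute_gain / baseline_wer * 100

print(f"absolute: {absolute_gain:.2f} points")
print(f"relative: {relative_gain:.1f}%")
```

In relative terms, this corresponds to removing a little over half of the baseline's word errors on this test set.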

Training Details

  • Base model: nvidia/parakeet-tdt-0.6b-v3 (627M params, EncDecRNNTBPEModel)
  • Dataset: Mediform/medical_asr_de (14,388 train / 799 val / 799 test samples, 117h)
  • Hardware: 4x NVIDIA A40 (48GB)
  • Strategy: DDP, bf16-mixed precision
  • Batch size: 2/GPU x 4 GPUs x 8 accumulation = 64 effective
  • Optimizer: AdamW (lr=5e-5, cosine annealing, 200 warmup steps)
  • Epochs: 20 (best at epoch 14)
  • Spec augment: freq_masks=2, time_masks=10
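
The batch-size and schedule figures above can be checked with a few lines of arithmetic. The sketch below derives the effective batch size and an approximate optimizer step count, then evaluates a warmup-plus-cosine learning rate curve. The linear warmup shape, the decay floor of 0, and the step count (which ignores how DDP pads or drops the last partial batch) are assumptions, not values taken from the training configuration.

```python
import math

per_gpu_batch = 2
num_gpus = 4
grad_accum = 8
effective_batch = per_gpu_batch * num_gpus * grad_accum  # samples per optimizer step

train_samples = 14_388
epochs = 20
# Approximate: the exact count depends on sampler padding and drop_last.
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * epochs

peak_lr = 5e-5
warmup_steps = 200


def lr_at(step: int, min_lr: float = 0.0) -> float:
    """Linear warmup to peak_lr, then cosine decay to min_lr (both assumed)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps)
for step in (0, 199, 2350, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the 200 warmup steps cover a bit less than the first epoch, so most of training runs on the decaying part of the schedule.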

Training Curve

Epoch   Val WER
0       16.56%
1       13.21%
2       13.01%
3       12.80%
4       11.95%
8       11.82%
14      11.55%
19      11.59%
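
Selecting the published checkpoint amounts to taking the epoch with the lowest validation WER. A minimal sketch using only the epochs listed above (epochs not shown in the table are absent here too):

```python
# Validation WER (percent) for the epochs reported in the table above.
val_wer_by_epoch = {
    0: 16.56,
    1: 13.21,
    2: 13.01,
    3: 12.80,
    4: 11.95,
    8: 11.82,
    14: 11.55,
    19: 11.59,
}

# Pick the epoch whose validation WER is smallest.
best_epoch = min(val_wer_by_epoch, key=val_wer_by_epoch.get)
print(best_epoch, val_wer_by_epoch[best_epoch])
```

Choosing by validation WER rather than test WER keeps the test set out of model selection, even though the last-epoch checkpoint scores slightly lower on test (11.24% vs. 11.31%).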

Usage with MLX (Apple Silicon)

Weights are in SafeTensors format, compatible with mlx-audio-swift for on-device inference.
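
Before wiring the weights into an inference stack, it can be useful to check what a SafeTensors file contains. The sketch below reads only the file header using the Python standard library, relying on the SafeTensors layout (an 8-byte little-endian header length followed by a JSON header). The function names are hypothetical, and the file name in the usage comment is an assumption about how the weights are stored in this repository. It does not show the mlx-audio-swift API.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def list_tensors(path):
    """Return (name, dtype, shape) for every tensor, skipping file metadata."""
    header = read_safetensors_header(path)
    return [
        (name, info["dtype"], info["shape"])
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata entry
    ]


# Usage (file name is an assumption; adjust to the downloaded file):
# for name, dtype, shape in list_tensors("model.safetensors"):
#     print(name, dtype, shape)
```

Because only the header is parsed, this runs quickly even on large checkpoints and never loads tensor data into memory.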

Medical Domain Coverage

The training dataset covers 529 unique German medical terms including drug names, conditions, procedures, symptoms, and anatomy. Sources: VoxPopuli (EU Parliament medical debates), MultiMed (patient-doctor dialogues), CommonVoice, Spoken Wikipedia, M-AILABS, TuDa, Voxforge.
