CDLI SLAM-ASR Luganda Atypical Speech Zero-Shot ASR-Adapted Baseline
Zero-shot transfer baseline on the CDLI Luganda atypical speech dataset using the ASR-adapted SLAM-ASR checkpoint before any atypical-speech adaptation. This repository is provided to support controlled comparison against projector-only and encoder-LoRA adaptation.
What this repository contains
This Hub repository stores a partial SLAM-ASR checkpoint for use with the
SLAM-LLM codebase. It is not a standalone transformers checkpoint.
- Checkpoint type: baseline_checkpoint
- Architecture: Whisper encoder (Sunbird/asr-whisper-large-v3-salt) + linear projector + Sunflower-14B decoder; no atypical-speech adaptation; no PEFT adapters active at decode time.
- Base encoder: Sunbird/asr-whisper-large-v3-salt
- Base LLM: Sunbird/Sunflower-14B
- Exported files: model.pt
Training / evaluation context
- Dataset: cdli/ugandan_luganda_nonstandard_speech_v1.0
- Evaluation split: test
- Training speakers: 36
- Validation speakers: 5
- Speaker overlap: No speaker overlap between train and validation/test
Reported metrics
- Normalized WER (JiWER scorer): 72.34%
- Normalized CER (JiWER scorer): 42.98%
- Atypical overall normalized WER: 75.52%
- Atypical overall normalized CER: 43.31%
- Atypical averaged utterance WER: 68.64%
- Atypical averaged utterance CER: 34.64%
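The "overall" and "averaged utterance" figures above can differ because they aggregate errors differently. The following is a minimal sketch of one common reading, assuming "overall" pools errors and reference words across all utterances while "averaged utterance" takes the unweighted mean of per-utterance rates. The function names and the toy transcripts are hypothetical, the scorer here is a plain Levenshtein distance rather than JiWER, and no text normalization is applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def overall_wer(refs, hyps):
    """Pool errors and reference words over all utterances, then divide."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return errors / words


def averaged_utterance_wer(refs, hyps):
    """Compute WER per utterance, then take the unweighted mean."""
    rates = [edit_distance(r.split(), h.split()) / len(r.split())
             for r, h in zip(refs, hyps)]
    return sum(rates) / len(rates)


# Toy data (not from the dataset): one long utterance with errors, one short clean one.
refs = ["a b c d e f g h", "x y"]
hyps = ["a b c d", "x y"]

print(f"overall WER:            {overall_wer(refs, hyps):.2%}")
print(f"averaged utterance WER: {averaged_utterance_wer(refs, hyps):.2%}")
```

Under this reading, long utterances weigh more heavily in the overall figure, so an overall WER above the averaged utterance WER would be consistent with errors concentrating in longer utterances.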
Decode settings used for the reported metrics
Test decode used MAX_NEW_TOKENS=200, NUM_BEAMS=4, REPETITION_PENALTY=2.0, NO_REPEAT_NGRAM_SIZE=2, USE_ENCODER_PEFT=false.
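To reproduce these settings, one option is to collect them in a single place and pass them to the decode script as environment variables. The sketch below assumes the project's decode script reads these names from the environment, which this card does not confirm; `DECODE_SETTINGS` and `build_decode_env` are hypothetical names introduced only for illustration.

```python
import os

# Values as reported for the test decode in this card.
DECODE_SETTINGS = {
    "MAX_NEW_TOKENS": "200",
    "NUM_BEAMS": "4",
    "REPETITION_PENALTY": "2.0",
    "NO_REPEAT_NGRAM_SIZE": "2",
    "USE_ENCODER_PEFT": "false",
}


def build_decode_env(overrides=None):
    """Return a copy of the current environment with the decode settings applied."""
    env = dict(os.environ)
    env.update(DECODE_SETTINGS)
    env.update(overrides or {})  # explicit overrides win over the defaults
    return env


env = build_decode_env()

# Assumption: the decode script picks these up from the environment, e.g.
# subprocess.run(
#     ["bash", "examples/asr_luganda/scripts/decode_luganda_sunflower.sh"],
#     env=env, check=True,
# )
```

Keeping the values in one dictionary makes it easier to confirm that a comparison run differs from this baseline only in the settings that are meant to change.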
Additional results notes
Test WER by severity: Mild 63.20%, Moderate 66.56%, Severe 77.25%. Test WER by disorder: Dysarthria 61.20%, Articulation Disorders 68.74%, Stuttering 70.23%, Voice disorder 81.35%. The average hypothesis-to-reference (hyp/ref) length ratio was 87.85%, which suggests that the model tends to under-generate, particularly on more difficult speakers.
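A hyp/ref ratio of this kind can be computed in a few lines. The sketch below assumes the ratio is measured in words and averaged per utterance; the card does not state the unit or the aggregation, so both are assumptions, and `mean_length_ratio` and the toy transcripts are hypothetical.

```python
def mean_length_ratio(refs, hyps):
    """Mean of per-utterance hypothesis/reference word-count ratios."""
    ratios = [len(h.split()) / len(r.split()) for r, h in zip(refs, hyps)]
    return sum(ratios) / len(ratios)


# Toy data (not from the dataset): the first hypothesis drops one word.
refs = ["a b c d", "e f g h"]
hyps = ["a b c", "e f g h"]

ratio = mean_length_ratio(refs, hyps)
print(f"average hyp/ref ratio: {ratio:.2%}")
```

A value below 100% means hypotheses are shorter than references on average, which is the under-generation pattern noted above.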
Loading notes
Load through SLAM-LLM; this repository stores a partial SLAM-ASR checkpoint, not a standalone Transformers model.
Typical decode flow in this project uses:
- examples/asr_luganda/scripts/decode_luganda_sunflower.sh
- USE_ENCODER_PEFT=true for encoder-LoRA checkpoints
- matching LoRA target modules at decode time
Caveats
- This repository stores SLAM-ASR training artifacts intended for research use.
- The checkpoint must be used with the matching SLAM-LLM model code and base components.
- Results can be sensitive to decode settings and evaluation protocol.