CDLI SLAM-ASR Luganda Atypical Speech Projector-Only Checkpoint (Epoch 1 Step 5000)
Projector-only atypical-speech adaptation checkpoint for SLAM-ASR on the CDLI Luganda atypical speech dataset. The encoder and Sunflower-14B decoder remain frozen; only the linear projector is updated from the ASR-adapted starting checkpoint.
What this repository contains
This Hub repository stores a partial SLAM-ASR checkpoint for use with the
SLAM-LLM codebase. It is not a standalone Transformers checkpoint.
- Checkpoint type: projector_only
- Architecture: Whisper encoder (Sunbird/asr-whisper-large-v3-salt) + linear projector + Sunflower-14B decoder; encoder frozen; LLM frozen; no PEFT adapters.
- Base encoder: Sunbird/asr-whisper-large-v3-salt
- Base LLM: Sunbird/Sunflower-14B
- Exported files: model.pt
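Because the export is projector-only, model.pt would be expected to hold only projector weights. The sketch below is one way to check that. It assumes the file is a plain PyTorch state dict and that projector entries use the key prefix `encoder_projector.`; both are assumptions based on common SLAM-LLM naming, not something this repository states. The helper names are hypothetical.

```python
from collections import Counter


def load_state_dict(path):
    """Load a checkpoint on CPU. Assumes the file is a plain state dict."""
    import torch  # imported lazily so the helpers below work without PyTorch

    return torch.load(path, map_location="cpu")


def summarize_keys(state_dict):
    """Count entries per top-level module name (text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in state_dict)


def non_projector_keys(state_dict, prefix="encoder_projector."):
    """Return keys outside the projector. The default prefix is an assumption."""
    return sorted(key for key in state_dict if not key.startswith(prefix))
```

If the assumptions hold, `non_projector_keys(load_state_dict("model.pt"))` should come back empty, and `summarize_keys` gives a quick view of what the file actually contains if it does not.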
Training / evaluation context
- Dataset: cdli/ugandan_luganda_nonstandard_speech_v1.0
- Evaluation split: validation
- Training speakers: 36
- Validation speakers: 5
- Speaker overlap: none between train and validation/test
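The speaker-disjoint claim can be checked directly from speaker IDs. A minimal sketch, assuming each example is a dict with a hypothetical `speaker_id` field (the dataset's actual column name may differ):

```python
def speaker_overlap(train_examples, eval_examples, key="speaker_id"):
    """Return the set of speakers that appear in both splits."""
    train_speakers = {example[key] for example in train_examples}
    eval_speakers = {example[key] for example in eval_examples}
    return train_speakers & eval_speakers


# Toy data for illustration only; not taken from the CDLI dataset.
train = [{"speaker_id": "spk01"}, {"speaker_id": "spk02"}, {"speaker_id": "spk01"}]
validation = [{"speaker_id": "spk03"}]

overlap = speaker_overlap(train, validation)
print("speaker-disjoint" if not overlap else f"shared speakers: {sorted(overlap)}")
```

An empty result for both train/validation and train/test is what the split description above implies.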
Reported metrics
- Normalized WER (JiWER scorer): not provided
- Normalized CER (JiWER scorer): not provided
- Atypical overall normalized WER: not provided
- Atypical overall normalized CER: not provided
- Atypical averaged utterance WER: not provided
- Atypical averaged utterance CER: not provided
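The list distinguishes "overall" from "averaged utterance" error rates. The sketch below illustrates the usual difference, assuming "overall" means total edits divided by total reference words across the set, and "averaged utterance" means the mean of per-utterance rates. That reading is an assumption, and this pure-Python version is not the JiWER-based scorer used for the project; it also skips text normalization and assumes non-empty references.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def overall_wer(refs, hyps):
    """Corpus-level WER: total word edits over total reference words."""
    edits = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return edits / words


def averaged_utterance_wer(refs, hyps):
    """Mean of per-utterance WERs; short utterances weigh as much as long ones."""
    rates = [
        edit_distance(r.split(), h.split()) / len(r.split())
        for r, h in zip(refs, hyps)
    ]
    return sum(rates) / len(rates)


# Placeholder tokens for illustration only; not dataset transcripts.
refs = ["a b c d", "e"]
hyps = ["a b c d", "x"]
print(f"overall WER: {overall_wer(refs, hyps):.2f}")
print(f"averaged utterance WER: {averaged_utterance_wer(refs, hyps):.2f}")
```

In this toy case one wrong single-word utterance moves the averaged figure far more than the overall one, which is why the two are reported separately. The same functions give character-level rates if `list(text)` replaces `text.split()`.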
Decode settings used for the reported metrics
Final decode metrics for this checkpoint are not uploaded yet. This repository is being published as an earlier projector-only research checkpoint for comparison.
Additional results notes
This is the epoch 1 step 5000 checkpoint from the projector-only atypical adaptation run. The later epoch 2 step 107 checkpoint achieved the stronger reported test result and is published separately.
Loading notes
Load through SLAM-LLM; this repository stores a partial SLAM-ASR checkpoint, not a standalone Transformers model.
Typical decode flow in this project uses:
- examples/asr_luganda/scripts/decode_luganda_sunflower.sh
- USE_ENCODER_PEFT=true for encoder-LoRA checkpoints
- matching LoRA target modules at decode time
Caveats
- This repository stores SLAM-ASR training artifacts intended for research use.
- The checkpoint must be used with the matching SLAM-LLM model code and base components.
- Results can be sensitive to decode settings and evaluation protocol.