CDLI SLAM-ASR English Atypical Speech MEUSLI v1 Projector-Only Checkpoint (Epoch 3 Step 208)
Projector-only atypical-speech adaptation checkpoint for SLAM-ASR on the CDLI Ugandan English atypical speech dataset. This run starts from the SpeechTek multilingual speech LLM linear projector v1 stack and fine-tunes only the linear projector while keeping the Whisper encoder and EuroLLM decoder frozen.
What this repository contains
This Hub repository stores a partial SLAM-ASR checkpoint for use with the
SLAM-LLM codebase. It is not a standalone Transformers checkpoint.
- Checkpoint type: projector_only
- Architecture: Whisper-large-v3-turbo encoder + linear projector (mEUltilingual_speechllm_linear_projector_v1 initialization) + EuroLLM-1.7B decoder; encoder frozen; LLM frozen; no prompt; no PEFT adapters during training or decode
- Base encoder: openai/whisper-large-v3-turbo
- Base LLM: utter-project/EuroLLM-1.7B
- Exported files: model.pt
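Because only the projector is exported, model.pt restores into a small linear module rather than a full model. The sketch below is a hypothetical illustration: the `LinearProjector` class and the 1280 (Whisper-large-v3-turbo encoder) and 2048 (EuroLLM-1.7B hidden size) dimensions are assumptions, not the project's actual SLAM-LLM module code.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the SLAM-LLM linear projector: maps frozen
# Whisper encoder states into the frozen LLM's embedding space.
class LinearProjector(nn.Module):
    def __init__(self, encoder_dim=1280, llm_dim=2048):
        super().__init__()
        self.proj = nn.Linear(encoder_dim, llm_dim)

    def forward(self, encoder_states):
        return self.proj(encoder_states)

projector = LinearProjector()
# state = torch.load("model.pt", map_location="cpu")  # real checkpoint
state = projector.state_dict()                        # stand-in for this demo
projector.load_state_dict(state)

# One batch of 50 encoder frames projected into LLM space.
out = projector(torch.randn(1, 50, 1280))
print(out.shape)  # torch.Size([1, 50, 2048])
```

In the real pipeline the projected states are concatenated with the (empty, no-prompt) text embeddings before the frozen decoder; only the projector's parameters were updated during training.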
Training / evaluation context
- Dataset: cdli/ugandan_english_nonstandard_speech_v1.0
- Evaluation split: test
- Training speakers: 36
- Validation speakers: 5
- Speaker overlap: none between train and the validation/test splits
Reported metrics
- Normalized WER (JiWER scorer): 29.38%
- Normalized CER (JiWER scorer): 19.79%
- Atypical overall normalized WER: 29.67%
- Atypical overall normalized CER: 19.81%
- Atypical averaged utterance WER: 28.05%
- Atypical averaged utterance CER: 18.80%
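The normalized rates above come from the JiWER scorer. As a rough stdlib sketch of what these rates measure (Levenshtein edit distance over reference length; JiWER's text normalization of casing and punctuation is not reproduced here):

```python
# Stdlib sketch of JiWER-style error rates: edit distance between hypothesis
# and reference tokens, divided by the reference length.
def edit_distance(ref, hyp):
    d = list(range(len(hyp) + 1))          # DP row for the empty reference
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i               # prev holds the diagonal cell
        for j, h in enumerate(hyp, 1):
            prev, d[j] = d[j], min(d[j] + 1,         # deletion
                                   d[j - 1] + 1,     # insertion
                                   prev + (r != h))  # substitution / match
    return d[-1]

def wer(reference, hypothesis):
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    return edit_distance(list(reference), list(hypothesis)) / len(reference)

print(round(wer("a b c", "a x c"), 4))  # 0.3333
```

"Overall" WER pools all errors and reference words across the test set before dividing, while "averaged utterance" WER computes a per-utterance rate first and then averages, which is why the two figures differ slightly.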
Decode settings used for the reported metrics
Decode used the English MEUSLI no-prompt configuration with MAX_NEW_TOKENS=200, NUM_BEAMS=4, REPETITION_PENALTY=2.0, NO_REPEAT_NGRAM_SIZE=2, and USE_LLM_PEFT=false.
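These environment-style settings correspond naturally to Hugging Face GenerationConfig kwargs; the mapping below is an assumption about how the decode wrapper forwards them, not the project's actual config keys.

```python
# Assumed mapping of the decode settings above onto Hugging Face
# GenerationConfig-style kwargs; the actual SLAM-LLM plumbing may differ.
decode_kwargs = {
    "max_new_tokens": 200,      # MAX_NEW_TOKENS=200
    "num_beams": 4,             # NUM_BEAMS=4
    "repetition_penalty": 2.0,  # REPETITION_PENALTY=2.0
    "no_repeat_ngram_size": 2,  # NO_REPEAT_NGRAM_SIZE=2
}
# USE_LLM_PEFT=false: no adapters are attached or merged before decoding.
# Sketch of use (llm / projected_speech are placeholders, not real objects):
# output_ids = llm.generate(inputs_embeds=projected_speech, **decode_kwargs)
print(decode_kwargs["num_beams"])  # 4
```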
Additional results notes
This checkpoint improved the English zero-shot MEUSLI v1 baseline from 41.59% to 28.05% averaged utterance WER. Best speakers were UG014 (14.36%), UG021 (20.06%), and UG042 (24.16%). Hardest groups remained voice disorder (39.17%) and acquired hearing impairment (39.17%).
Loading notes
Load through SLAM-LLM with the exact English MEUSLI/OpenAI stack: openai/whisper-large-v3-turbo encoder, utter-project/EuroLLM-1.7B decoder, and no prompt template. This repository stores a partial SLAM-ASR checkpoint, not a standalone Transformers model.
Typical decode flow in this project uses project-specific wrappers such as:
- examples/asr_luganda/scripts/decode_luganda_sunflower.sh for the Sunflower/Luganda stack
- examples/asr_luganda/scripts/decode_english_meusli_openai.sh for the English MEUSLI/OpenAI stack
- matching PEFT settings at decode time when adapters are part of the checkpoint
Caveats
- This repository stores SLAM-ASR training artifacts intended for research use.
- The checkpoint must be used with the matching SLAM-LLM model code and base components.
- Results can be sensitive to decode settings and evaluation protocol.