CDLI SLAM-ASR English Atypical Speech MEUSLI v1 Projector-Only Checkpoint (Epoch 3 Step 208)

Projector-only atypical-speech adaptation checkpoint for SLAM-ASR on the CDLI Ugandan English atypical speech dataset. This run starts from the SpeechTek multilingual speech LLM linear projector v1 stack and fine-tunes only the linear projector while keeping the Whisper encoder and EuroLLM decoder frozen.

What this repository contains

This Hub repository stores a partial SLAM-ASR checkpoint for use with the SLAM-LLM codebase. It is not a standalone transformers checkpoint.

  • Checkpoint type: projector_only
  • Architecture: Whisper-large-v3-turbo encoder + linear projector (mEUltilingual_speechllm_linear_projector_v1 initialization) + EuroLLM-1.7B decoder; encoder frozen; LLM frozen; no prompt; no PEFT adapters during training or decode.
  • Base encoder: openai/whisper-large-v3-turbo
  • Base LLM: utter-project/EuroLLM-1.7B
  • Exported files: model.pt
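
The training setup above (frozen encoder and LLM, trainable linear projector, projector-only export) can be sketched with small stand-in modules. The module sizes and names here are illustrative assumptions, not the actual SLAM-LLM classes:

```python
# Minimal sketch of projector-only fine-tuning; shapes/names are placeholders.
import torch
import torch.nn as nn

encoder = nn.Linear(128, 64)   # stand-in for the frozen Whisper encoder
projector = nn.Linear(64, 32)  # the only trainable component
llm = nn.Linear(32, 16)        # stand-in for the frozen EuroLLM decoder

encoder.requires_grad_(False)
llm.requires_grad_(False)

stack = nn.ModuleDict({"encoder": encoder, "projector": projector, "llm": llm})
trainable = [n for n, p in stack.named_parameters() if p.requires_grad]

# Export only the projector weights, matching the model.pt artifact in this repo.
torch.save(projector.state_dict(), "model.pt")
```

Only the projector parameters receive gradients, so the exported checkpoint contains just the projector's `weight` and `bias` tensors, which is why this repository is not a standalone Transformers model.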

Training / evaluation context

  • Dataset: cdli/ugandan_english_nonstandard_speech_v1.0
  • Evaluation split: test
  • Training speakers: 36
  • Validation speakers: 5
  • Speaker overlap: none — the train, validation, and test speaker sets are disjoint

Reported metrics

  • Normalized WER (JiWER scorer): 29.38%
  • Normalized CER (JiWER scorer): 19.79%
  • Atypical overall normalized WER: 29.67%
  • Atypical overall normalized CER: 19.81%
  • Atypical averaged utterance WER: 28.05%
  • Atypical averaged utterance CER: 18.80%
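
The normalized WER and CER above are word- and character-level edit distances divided by reference length, computed after text normalization. The reported numbers come from the JiWER scorer; a minimal pure-Python equivalent for illustration (JiWER applies additional normalization transforms):

```python
def edit_distance(a, b):
    # Classic Levenshtein distance over two token sequences, rolling-row DP.
    d = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, y in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (x != y))
    return d[-1]

def wer(reference, hypothesis):
    # Word error rate: word-level edits / reference word count.
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    return edit_distance(ref, hyp) / len(ref)

def cer(reference, hypothesis):
    # Character error rate: character-level edits / reference length.
    return edit_distance(list(reference.lower()), list(hypothesis.lower())) / len(reference)
```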

Decode settings used for the reported metrics

Decode used the English MEUSLI no-prompt configuration with MAX_NEW_TOKENS=200, NUM_BEAMS=4, REPETITION_PENALTY=2.0, NO_REPEAT_NGRAM_SIZE=2, and USE_LLM_PEFT=false.
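
These settings correspond to standard Hugging Face `generate()`-style keyword arguments. A sketch of the mapping (the argument names follow the transformers generation API; how SLAM-LLM wires them through is an assumption):

```python
# Decode settings reported above, expressed as generate()-style kwargs.
decode_kwargs = {
    "max_new_tokens": 200,      # MAX_NEW_TOKENS
    "num_beams": 4,             # NUM_BEAMS
    "repetition_penalty": 2.0,  # REPETITION_PENALTY
    "no_repeat_ngram_size": 2,  # NO_REPEAT_NGRAM_SIZE
}
# USE_LLM_PEFT=false: no adapters are attached to the LLM at decode time.
```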

Additional results notes

This checkpoint improved the English zero-shot MEUSLI v1 baseline from 41.59% to 28.05% averaged utterance WER, a relative reduction of about 32.6%. Best speakers were UG014 (14.36%), UG021 (20.06%), and UG042 (24.16%). Hardest groups remained voice disorder (39.17%) and acquired hearing impairment (39.17%).

Loading notes

Load through SLAM-LLM with the exact English MEUSLI/OpenAI stack: openai/whisper-large-v3-turbo encoder, utter-project/EuroLLM-1.7B decoder, and no prompt template. This repository stores a partial SLAM-ASR checkpoint, not a standalone Transformers model.

Typical decode flow in this project uses project-specific wrappers such as:

  • examples/asr_luganda/scripts/decode_luganda_sunflower.sh for the Sunflower/Luganda stack
  • examples/asr_luganda/scripts/decode_english_meusli_openai.sh for the English MEUSLI/OpenAI stack
  • matching PEFT settings at decode time when adapters are part of the checkpoint (not the case for this projector-only export)

Caveats

  • This repository stores SLAM-ASR training artifacts intended for research use.
  • The checkpoint must be used with the matching SLAM-LLM model code and base components.
  • Results can be sensitive to decode settings and evaluation protocol.
Model repository: KasuleTrevor/cdli-slam-asr-english-atypical-meusli-v1-projector-only-e3s208