CDLI Qwen3-ASR 1.7B English Fine-Tune

This repository contains a Qwen3-ASR model fine-tuned from Qwen/Qwen3-ASR-1.7B on the gated cdli/ugandan_english_nonstandard_speech_v1.0 dataset.

The task is English automatic speech recognition for atypical or non-standard speech from Ugandan speakers, including dysarthric speech.

Model Details

Base model: Qwen/Qwen3-ASR-1.7B
Fine-tuning framework: Qwen3-ASR SFT with PEFT/LoRA
Language: English
Checkpoint reported: checkpoint-3882

Dataset

Dataset: cdli/ugandan_english_nonstandard_speech_v1.0
License: cc-by-sa-4.0 (per the dataset card)
Split sizes reported on the source dataset card:
- train: 5176
- validation: 638
- test: 1017

Evaluation artifacts in this run contain 1013 scored rows, all with valid normalized references, compared with the 1017 examples listed for the test split.

Training Configuration

Work root: /jupyter_kernel/qwen3_asr_cdli_en
Base checkpoint: Qwen/Qwen3-ASR-1.7B
Max manifest audio length: 30.0 s
Max training audio length: 30.0 s
Min audio length: 0.2 s
Train batch size: 4
Gradient accumulation steps: 2
Effective train batch size: 8
Learning rate: 2e-5
Scheduler: cosine
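
As a consistency check on the settings above, the following minimal sketch shows how the duration limits and batch settings combine. The `keep_clip` helper and the `NUM_DEVICES` value are hypothetical and not taken from the training code. The sketch also assumes that the duration limits are inclusive and that no training clip was dropped by them.

```python
import math

# Values copied from the configuration and dataset sections of this card.
MIN_AUDIO_S = 0.2
MAX_AUDIO_S = 30.0
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
TRAIN_EXAMPLES = 5176    # train split size from the dataset card
CHECKPOINT_STEP = 3882   # taken from the checkpoint name

# Assumption: a single device, so there is no extra data-parallel factor.
NUM_DEVICES = 1


def keep_clip(duration_s: float) -> bool:
    """Hypothetical filter: keep clips within the configured duration limits."""
    return MIN_AUDIO_S <= duration_s <= MAX_AUDIO_S


effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Assumption: every training example passes the duration filter.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / effective_batch)
epochs_at_checkpoint = CHECKPOINT_STEP / steps_per_epoch

print(effective_batch, steps_per_epoch, epochs_at_checkpoint)  # 8 647 6.0
```

Under these assumptions, checkpoint-3882 falls exactly at the end of six epochs. The card does not report an epoch count, so this is a plausibility check rather than a documented setting.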

Evaluation

Evaluation was run on the held-out test split, scoring predictions against both the raw reference transcripts and their normalized forms.
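
The exact normalizer is not specified in this card; the notes only say it reduces punctuation, casing, and formatting noise. A minimal sketch of such a step, assuming lowercasing, ASCII punctuation removal, and whitespace collapsing (the `normalize_transcript` function is hypothetical and may differ from the one used for the reported metrics):

```python
import re
import string

# Translation table that deletes ASCII punctuation characters.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize_transcript(text: str) -> str:
    """Hypothetical normalizer: lowercase, drop punctuation, collapse whitespace."""
    text = text.lower()
    text = text.translate(_PUNCT_TABLE)
    # Collapse runs of spaces, tabs, and newlines into single spaces.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_transcript("Hello,  World!"))  # hello world
```

Note that this simple version also strips apostrophes, so "It's" becomes "its". A real evaluation normalizer may treat contractions and numbers differently.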

Corpus Metrics

Raw WER: 29.55%
Raw CER: 16.15%
Normalized WER: 24.06%
Normalized CER: 14.93%

Average Utterance Metrics

Average normalized utterance WER (capped at 1.0): 22.72%
Average normalized utterance CER (capped at 1.0): 14.35%
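
The corpus and average-utterance figures differ because they aggregate errors differently. The sketch below illustrates the usual definitions, assuming corpus WER is total word errors divided by total reference words and the average utterance WER is the mean of per-utterance WER values capped at 1.0. It is an illustration with made-up sentences, not the evaluation script used for this card, and it assumes every reference is non-empty.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_and_average_wer(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (corpus WER, mean utterance WER capped at 1.0) for (ref, hyp) pairs."""
    total_errors = 0
    total_ref_words = 0
    capped = []
    for ref, hyp in pairs:
        ref_words, hyp_words = ref.split(), hyp.split()
        errors = edit_distance(ref_words, hyp_words)
        total_errors += errors
        total_ref_words += len(ref_words)
        # Insertions can push utterance WER above 1.0, hence the cap.
        capped.append(min(1.0, errors / len(ref_words)))
    return total_errors / total_ref_words, sum(capped) / len(capped)


# Illustrative sentences only, not drawn from the dataset.
pairs = [
    ("the cat sat on the mat", "the cat sat on a mat"),  # 1 error over 6 words
    ("hello", "hello there my friend"),                  # 3 insertions over 1 word
]
corpus_wer, avg_wer = corpus_and_average_wer(pairs)
print(round(corpus_wer, 4), round(avg_wer, 4))  # 0.5714 0.5833
```

The same functions give CER if each transcript is split into characters with `list(text)` instead of `text.split()`.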

Usage

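from qwen_asr import Qwen3ASRModel

# The repository is gated: accept the access conditions and authenticate with Hugging Face first.
model = Qwen3ASRModel.from_pretrained('KasuleTrevor/cdli-qwen3-asr-en-finetune-1')

# Pass a list of audio paths; one prediction is returned per input.
predictions = model.transcribe(['path/to/audio.wav'])

# A prediction may be an object with a .text attribute or a plain string, so handle both.
print(predictions[0].text if hasattr(predictions[0], 'text') else predictions[0])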

Files

model.safetensors: fine-tuned model weights
tokenizer.json, tokenizer_config.json, preprocessor_config.json, chat_template.json: tokenizer, preprocessor, and chat template configuration
results/checkpoint-3882/test_predictions.csv
results/checkpoint-3882/test_predictions.jsonl
results/checkpoint-3882/test_predictions_scored.csv
results/checkpoint-3882/test_predictions_scored.jsonl
results/checkpoint-3882/test_predictions_grouped_analysis.csv

Notes

Results are stored in a checkpoint-scoped folder under results/ (here, results/checkpoint-3882/).
Normalized metrics use transcript normalization to reduce punctuation, casing, and formatting noise during evaluation.
Access to the source dataset is gated. Review dataset terms before requesting access.

Safetensors

Model size: 2B params
Tensor type: BF16
