CDLI Qwen3-ASR 1.7B English Fine-Tune

This repository contains a Qwen3-ASR model fine-tuned from Qwen/Qwen3-ASR-1.7B on the gated cdli/ugandan_english_nonstandard_speech_v1.0 dataset.

The task is English automatic speech recognition for atypical or non-standard speech from Ugandan speakers, including dysarthric speech.

Model Details

  • Base model: Qwen/Qwen3-ASR-1.7B
  • Fine-tuning framework: Qwen3-ASR SFT with PEFT/LoRA
  • Language: English
  • Checkpoint reported: checkpoint-3882

Dataset

  • Dataset: cdli/ugandan_english_nonstandard_speech_v1.0
  • License: CC BY-SA 4.0 (per the dataset card)
  • Split sizes reported on the source dataset card:
    • train: 5176
    • validation: 638
    • test: 1017

Evaluation artifacts in this run contain 1013 scored rows and 1013 valid normalized references, 4 fewer than the 1017 rows listed for the test split.

Training Configuration

  • Work root: /jupyter_kernel/qwen3_asr_cdli_en
  • Base checkpoint: Qwen/Qwen3-ASR-1.7B
  • Max manifest audio length: 30.0 s
  • Max training audio length: 30.0 s
  • Min audio length: 0.2 s
  • Train batch size: 4
  • Gradient accumulation steps: 2
  • Effective train batch size: 8
  • Learning rate: 2e-5
  • Scheduler: cosine
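
The batch settings above can be checked with a little arithmetic. The sketch below is only a consistency check, not a record of how the run was launched: it assumes a single device and that all 5176 training examples passed the audio-length filters, neither of which is stated in this card.

```python
import math

train_examples = 5176    # train split size from the dataset card
per_device_batch = 4     # "Train batch size"
grad_accum_steps = 2     # "Gradient accumulation steps"
num_devices = 1          # assumption: single-device run (not stated in the card)

# Samples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum_steps * num_devices

# Optimizer steps needed to see every training example once,
# assuming no examples were dropped by the length filters.
steps_per_epoch = math.ceil(train_examples / effective_batch)

# How many passes over the data the reported checkpoint would correspond to.
epochs_at_checkpoint = 3882 / steps_per_epoch

print(effective_batch, steps_per_epoch, epochs_at_checkpoint)  # 8 647 6.0
```

Under those assumptions, checkpoint-3882 lines up with the end of a sixth epoch. If examples were filtered out or more devices were used, the epoch count would differ.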

Evaluation

Evaluation was run on the held-out test split. Predictions were scored twice: once against the raw reference transcripts and once after normalizing both references and predictions.

Corpus Metrics

  • Raw WER: 29.55%
  • Raw CER: 16.15%
  • Normalized WER: 24.06%
  • Normalized CER: 14.93%

Average Utterance Metrics

  • Average normalized utterance WER (capped at 1.0): 22.72%
  • Average normalized utterance CER (capped at 1.0): 14.35%
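
The corpus metrics and the average utterance metrics differ because they aggregate errors differently. The following is a minimal sketch of the two aggregations for WER, assuming the standard word-level edit distance. The function names are hypothetical and this is not the scoring script used for this run.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(pairs):
    """Total word errors divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    words = sum(len(r.split()) for r, _ in pairs)
    return errors / words


def mean_utterance_wer(pairs, cap=1.0):
    """Mean of per-utterance WER, each capped so one bad row cannot dominate."""
    rates = [min(cap, edit_distance(r.split(), h.split()) / len(r.split()))
             for r, h in pairs]
    return sum(rates) / len(rates)


# Toy (reference, prediction) pairs; references are assumed non-empty.
pairs = [
    ("the cat sat on the mat", "the cat sat on a mat"),  # 1 error over 6 words
    ("hello", "hello there my friend"),                  # 3 insertions over 1 word
]
print(corpus_wer(pairs))          # 4 errors / 7 words
print(mean_utterance_wer(pairs))  # mean of 1/6 and the capped value 1.0
```

Replacing `.split()` with `list(...)` over the characters gives the corresponding CER variants.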

Usage

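from qwen_asr import Qwen3ASRModel

# Load the fine-tuned weights from the Hub (accept the repository conditions and log in first).
model = Qwen3ASRModel.from_pretrained('KasuleTrevor/cdli-qwen3-asr-en-finetune-1')

# Transcribe a list of audio file paths.
predictions = model.transcribe(['path/to/audio.wav'])
# Handle both result objects with a .text attribute and plain strings.
print(predictions[0].text if hasattr(predictions[0], 'text') else predictions[0])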

Files

  • model.safetensors: fine-tuned model weights
  • tokenizer.json, tokenizer_config.json, preprocessor_config.json, chat_template.json: tokenizer, audio preprocessor, and chat template configuration
  • results/checkpoint-3882/test_predictions.csv
  • results/checkpoint-3882/test_predictions.jsonl
  • results/checkpoint-3882/test_predictions_scored.csv
  • results/checkpoint-3882/test_predictions_scored.jsonl
  • results/checkpoint-3882/test_predictions_grouped_analysis.csv
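
This card does not document the field names in the prediction files, so it helps to inspect them before writing analysis code. The sketch below assumes a `.jsonl` file from the list above has already been downloaded locally. The helper name `jsonl_field_names` is hypothetical, and no particular schema is assumed.

```python
import json


def jsonl_field_names(path, max_rows=5):
    """Collect the field names seen in the first few rows of a JSONL file."""
    fields = []
    with open(path, encoding="utf-8") as handle:
        for row_index, line in enumerate(handle):
            if row_index >= max_rows:
                break
            if not line.strip():
                continue  # skip blank lines
            for key in json.loads(line):
                if key not in fields:
                    fields.append(key)
    return fields
```

For example, `jsonl_field_names("results/checkpoint-3882/test_predictions_scored.jsonl")` would list whatever keys the scored rows actually contain.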

Notes

  • Results are stored in a checkpoint-scoped folder under results/.
  • Normalized metrics are computed after transcript normalization, which reduces the effect of punctuation, casing, and formatting differences on the error rates.
  • Access to the source dataset is gated. Review dataset terms before requesting access.
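
The exact normalization rules are not spelled out in this card. As an illustration only, a typical normalizer of the kind described in the notes lowercases the text, strips ASCII punctuation, and collapses whitespace. The function below is a hypothetical sketch and may differ from the normalizer used to produce the reported numbers.

```python
import re
import string


def normalize_transcript(text):
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()


print(normalize_transcript("Hello,  World!"))  # hello world
```

Note that this simple version also removes apostrophes, so "It's" becomes "its". A production normalizer might treat contractions and numerals more carefully.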