whisper-large-v3-DASS2019-ct2

Model Description

This repository contains a CTranslate2-converted version of a Whisper large-v3 model fine-tuned on DASS2019 speech data.

The model is intended for inference with faster-whisper or WhisperX, both of which run on the CTranslate2 runtime.

This format is optimized for:

  • fast GPU inference
  • reduced memory footprint
  • production ASR pipelines
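
As a rough illustration of the speed and memory trade-off, faster-whisper lets the caller choose a `compute_type` when loading a CTranslate2 model. The sketch below is one possible way to pick it; the helper names `pick_compute_type` and `load_model` are hypothetical, and the mapping is a suggestion rather than a setting recommended for this model.

```python
def pick_compute_type(device: str, low_memory: bool = False) -> str:
    """Return a CTranslate2 compute type for the device (illustrative defaults)."""
    if device == "cuda":
        # float16 is what the usage examples in this card use; int8_float16
        # keeps weights in 8-bit to save GPU memory, possibly at some cost
        # in accuracy.
        return "int8_float16" if low_memory else "float16"
    # On CPU, int8 is usually the smallest option; float32 is the safe fallback.
    return "int8" if low_memory else "float32"


def load_model(model_dir: str, device: str = "cuda", low_memory: bool = False):
    """Load the converted model with faster-whisper using the chosen type."""
    # Imported lazily so pick_compute_type works without faster-whisper installed.
    from faster_whisper import WhisperModel

    return WhisperModel(
        model_dir,
        device=device,
        compute_type=pick_compute_type(device, low_memory),
    )
```

Whether a quantized type is acceptable for this data has not been evaluated here; compare transcripts against the `float16` output before relying on it.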

Base Model

Fine-tuned from:

openai/whisper-large-v3

Converted to CTranslate2 format for inference acceleration.
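
For readers who want to reproduce a comparable conversion, the following is a minimal sketch using the `TransformersConverter` class from the `ctranslate2` package. It is not necessarily the procedure used for this repository: the helper names, the `float16` quantization, and the list of copied files are assumptions, and a fine-tuned checkpoint may not ship every file listed.

```python
from pathlib import PurePosixPath

# Files faster-whisper expects next to the converted weights (assumed list).
COPY_FILES = ("tokenizer.json", "preprocessor_config.json")


def default_output_dir(checkpoint: str) -> str:
    """Derive an output directory name by appending '-ct2' to the model name."""
    return PurePosixPath(checkpoint.rstrip("/")).name + "-ct2"


def convert_to_ct2(checkpoint: str, quantization: str = "float16") -> str:
    """Convert a Transformers Whisper checkpoint to CTranslate2 format."""
    # Imported lazily; needs the `ctranslate2` and `transformers` packages.
    from ctranslate2.converters import TransformersConverter

    converter = TransformersConverter(checkpoint, copy_files=list(COPY_FILES))
    # force=True overwrites an existing output directory.
    return converter.convert(
        default_output_dir(checkpoint), quantization=quantization, force=True
    )
```

Calling `convert_to_ct2("openai/whisper-large-v3")` would convert the base model; a local path to a fine-tuned checkpoint can be passed instead.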


Intended Use

This model is designed for:

  • Automatic speech recognition (ASR) of historical Southern American English and African-American English
  • Research transcription pipelines
  • Large-scale batch transcription
  • WhisperX alignment / diarization workflows
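
A batch transcription job over a folder of recordings could look roughly like the sketch below. It assumes a model already loaded with faster-whisper; the function names, the set of file extensions, and the one-text-file-per-recording layout are illustrative choices, not part of this repository.

```python
from pathlib import Path

# Assumed set of audio extensions; adjust to the collection at hand.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a"}


def find_audio_files(root: str) -> list[Path]:
    """Recursively collect audio files under root, sorted for a stable order."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def transcribe_folder(model, root: str, out_dir: str) -> int:
    """Transcribe every audio file under root; write one .txt per recording."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in find_audio_files(root):
        # Same call as in the usage example: English, beam size 5.
        segments, _info = model.transcribe(str(path), language="en", beam_size=5)
        lines = [f"[{s.start:.2f}-{s.end:.2f}] {s.text.strip()}" for s in segments]
        (out / f"{path.stem}.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
        count += 1
    return count
```

Output files are named after the recording's stem, so recordings with the same name in different subfolders would overwrite each other; a real pipeline would need a naming scheme that avoids this.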

Out-of-Scope Use

This model is not suitable for:

  • Real-time low-latency streaming without additional engineering
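  • Use via the Hugging Face Transformers API (the weights are stored in CTranslate2 format, which Transformers cannot load)
  • Applications requiring multilingual robustness (the model was fine-tuned primarily on English speech)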

How to Use

With faster-whisper:

from huggingface_hub import snapshot_download
from faster_whisper import WhisperModel

# Download the converted model files from the Hub (cached after the first call).
model_dir = snapshot_download("stcoats/whisper-large-v3-DASS2019-ct2")
model = WhisperModel(model_dir, device="cuda", compute_type="float16")

# `segments` is a generator: decoding happens as it is iterated below.
segments, info = model.transcribe("audio.wav", language="en", beam_size=5)

for s in segments:
    print(f"[{s.start:.2f}-{s.end:.2f}] {s.text}")

Alternatively, with WhisperX:

from huggingface_hub import snapshot_download
import whisperx

model_dir = snapshot_download("stcoats/whisper-large-v3-DASS2019-ct2")
model = whisperx.load_model(model_dir, device="cuda", compute_type="float16")

audio = whisperx.load_audio("audio.wav")
result = model.transcribe(audio, language="en", vad_filter=False)

for s in result["segments"]:
    print(f"[{s['start']:.2f}-{s['end']:.2f}] {s['text']}")
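
For word-level timestamps, a WhisperX transcription result can be passed through WhisperX's forced-alignment step. The sketch below assumes the `result` and `audio` objects from the WhisperX example above; the helper names are hypothetical, and the default English alignment model that WhisperX loads is a general-purpose one whose accuracy on historical dialect recordings is not established here.

```python
def align_words(result: dict, audio, device: str = "cuda") -> dict:
    """Run WhisperX forced alignment on an English transcription result."""
    import whisperx  # imported lazily; requires the whisperx package

    align_model, metadata = whisperx.load_align_model(language_code="en", device=device)
    return whisperx.align(
        result["segments"], align_model, metadata, audio, device,
        return_char_alignments=False,
    )


def timed_words(aligned: dict) -> list[tuple[str, float, float]]:
    """Flatten aligned segments into (word, start, end), skipping untimed words."""
    rows = []
    for segment in aligned["segments"]:
        for w in segment.get("words", []):
            # Some tokens (for example digits) may come back without timestamps.
            if "start" in w and "end" in w:
                rows.append((w["word"], w["start"], w["end"]))
    return rows
```

A typical call would be `timed_words(align_words(result, audio))`, which yields a flat list suitable for export to a tier-based annotation format.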

Details

Training Data

DASS2019_NLP

For details, please see Coats, Steven. (forthcoming). A Fine-tuned ASR Model for Historical American Dialect Recordings. Proceedings of LREC 2026.

Citation

BibTeX:

@inproceedings{coats26_lrec,
  title     = {{A Fine-tuned ASR Model for Historical American Dialect Recordings}},
  author    = {Steven Coats},
  year      = {2026},
  booktitle = {Proceedings of LREC 2026},
  note      = {Forthcoming},
}

APA:

Coats, Steven. (forthcoming). A Fine-tuned ASR Model for Historical American Dialect Recordings. Proceedings of LREC 2026.
