
Whisper Large v3 LoRA for Karakalpak ASR

This repository contains a LoRA adapter fine-tuned on top of openai/whisper-large-v3 for automatic speech recognition (ASR) in Karakalpak.

The model is intended for transcribing Karakalpak speech from audio into text. This repository contains the PEFT/LoRA adapter weights, not the full base model weights.

Model Details

Model Description

This is a parameter-efficient fine-tuned Whisper Large v3 model for Karakalpak speech transcription. It was trained using LoRA on top of the pretrained Whisper encoder-decoder model.

  • Developed by: Quyashbek
  • Funded by: Self-directed / internal project
  • Shared by: Quyashbek
  • Model type: Automatic Speech Recognition (ASR), Whisper LoRA adapter
  • Language(s): Karakalpak
  • License: Apache-2.0
  • Finetuned from model: openai/whisper-large-v3

Model Sources

  • Repository: TODO
  • Base model: openai/whisper-large-v3

Uses

Direct Use

This model is intended for:

  • Karakalpak speech transcription
  • research and experimentation in low-resource ASR
  • evaluating Whisper adaptation to Karakalpak speech

Because this repository contains a LoRA adapter, it should be loaded together with the original Whisper base model.

Downstream Use

Possible downstream uses include:

  • subtitle generation
  • speech-to-text preprocessing
  • transcription pipelines for Karakalpak audio archives
  • ASR benchmarking for Karakalpak speech datasets

Out-of-Scope Use

This model is not intended for:

  • high-stakes transcription where errors may cause harm
  • speaker identification
  • emotion recognition
  • general-purpose multilingual transcription (the adapter is tuned for Karakalpak)
  • heavily noisy, far-field, overlapping-speaker audio without further adaptation

Bias, Risks, and Limitations

This model may perform unevenly depending on:

  • speaker accent and dialect variation
  • recording quality
  • background noise
  • speaking speed
  • domain mismatch between training and test audio

Since Karakalpak is a relatively low-resource language setting, the model may underperform on speech styles or vocabulary not well represented in the training data.

The model may also hallucinate, truncate long audio, or repeat text if used without proper long-form chunking.

Recommendations

Users should:

  • validate transcriptions before production use
  • use chunked inference for long audio
  • test on their own domain data before deployment
  • avoid relying on this model alone for sensitive or high-risk applications
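As a rough illustration of the chunked-inference recommendation above, the helper below splits a long waveform into overlapping fixed-length windows whose transcripts can then be generated one at a time and concatenated. The window and overlap lengths are arbitrary example values, and the helper is not part of this repository; it is a minimal sketch, not the chunking strategy used in training.

```python
import numpy as np

def chunk_audio(audio, sr=16000, chunk_s=30.0, overlap_s=2.0):
    """Split a 1-D waveform into fixed-length chunks with overlap.

    Overlap gives the decoder some context at chunk boundaries; transcripts
    of adjacent chunks can then be merged (here, naively by concatenation).
    """
    chunk = int(chunk_s * sr)
    stride = int((chunk_s - overlap_s) * sr)
    if len(audio) <= chunk:
        return [audio]
    chunks = []
    for start in range(0, len(audio), stride):
        chunks.append(audio[start:start + chunk])
        if start + chunk >= len(audio):
            break
    return chunks
```

Each chunk can then be passed through the processor and model exactly as in the single-file example below.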

How to Get Started with the Model

This repository contains a LoRA adapter. Load it with the Whisper Large v3 base model.

import os

# Cap BLAS/OpenMP thread pools to avoid CPU oversubscription (optional)
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"

import torch
import librosa
from peft import PeftModel
from transformers import WhisperProcessor, WhisperForConditionalGeneration

BASE_MODEL = "openai/whisper-large-v3"
ADAPTER_MODEL = "Quyashbek/whisper-large-v3-lora-karakalpak"  # change if needed
AUDIO_PATH = "sample.wav"

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the processor and full base model, then attach the LoRA adapter on top
processor = WhisperProcessor.from_pretrained(BASE_MODEL, task="transcribe")
base_model = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL)
model = PeftModel.from_pretrained(base_model, ADAPTER_MODEL).to(device)
model.eval()

# Whisper expects 16 kHz mono input
audio, sr = librosa.load(AUDIO_PATH, sr=16000, mono=True)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
input_features = inputs.input_features.to(device)

with torch.no_grad():
    predicted_ids = model.generate(
        input_features,
        max_new_tokens=225,
    )

text = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(text)
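For deployment, it can be convenient to fold the adapter weights into the base model so the result behaves like a plain Whisper checkpoint and no longer needs peft at load time. A minimal sketch using PEFT's merge_and_unload; the output directory name is an arbitrary example:

```python
from peft import PeftModel
from transformers import WhisperProcessor, WhisperForConditionalGeneration

base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")
peft_model = PeftModel.from_pretrained(
    base, "Quyashbek/whisper-large-v3-lora-karakalpak"
)

# merge_and_unload folds the LoRA deltas into the base weights and returns
# a plain WhisperForConditionalGeneration with no PEFT wrapper
merged = peft_model.merge_and_unload()
merged.save_pretrained("whisper-large-v3-karakalpak-merged")

processor = WhisperProcessor.from_pretrained("openai/whisper-large-v3")
processor.save_pretrained("whisper-large-v3-karakalpak-merged")
```

The merged checkpoint trades the small adapter download for a full-size model, but avoids the extra forward-pass indirection of the LoRA layers.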