Whisper Small Uyghur LoRA (Fine-tuned)

چۈشەندۈرۈشى (Description in Uyghur)

بۇ مودېل openai/whisper-small ئاساسىدا ئۇيغۇرچە ئاۋازنى تونۇش ئۈچۈن مەخسۇس تەربىيەلەنگەن. بىز LoRA تېخنىكىسىنى ئىشلىتىپ، ئۇيغۇرچە ئاۋازلارنى يۇقىرى ئېنىقلىقتا تېكىستكە ئايلاندۇرۇش مەقسىتىگە يەتتۇق.

تەربىيەلەش سانلىق مەلۇماتى: Mozilla Common Voice (Uyghur)
قاتتىق دېتال: NVIDIA GeForce RTX 3060 (9 سائەت تەربىيەلەنگەن)

Model Description (English)

This model is a fine-tuned version of OpenAI Whisper Small for Uyghur automatic speech recognition (ASR). It was trained using LoRA (Low-Rank Adaptation), resulting in a lightweight but highly accurate adapter (approx. 13.5 MB) instead of a full copy of the model weights.

Data Source: Mozilla Common Voice / Data Collective
Hardware: Trained on a single NVIDIA RTX 3060 GPU for approximately 9 hours.

⚙️ Training Details

Base Model: openai/whisper-small
Method: PEFT (LoRA)
Training Time: ~9 hours
Optimizer: AdamW
Adapter Size: ~13.5 MB
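
As a rough sanity check on the adapter size listed above, the sketch below estimates how many parameters a LoRA adapter adds to whisper-small. The rank (32), the target modules (q_proj and v_proj), and float32 storage are assumptions, since this card does not state the LoRA configuration. Under those assumptions the estimate comes out close to the stated size, which is consistent with such a setup but does not confirm it.

```python
# Rough LoRA adapter size estimate for whisper-small.
# ASSUMPTIONS (not stated in this model card): rank r = 32, adapters on the
# q_proj and v_proj attention projections only, weights stored as float32.

D_MODEL = 768               # hidden size of whisper-small
ENCODER_LAYERS = 12         # each layer has one self-attention block
DECODER_LAYERS = 12         # each layer has self-attention plus cross-attention
RANK = 32                   # assumed LoRA rank
TARGETS_PER_ATTENTION = 2   # assumed targets: q_proj and v_proj
BYTES_PER_PARAM = 4         # assumed float32 storage


def lora_params_per_module(d_in: int, d_out: int, r: int) -> int:
    """A LoRA pair adds matrix A (r x d_in) and matrix B (d_out x r)."""
    return r * d_in + d_out * r


attention_blocks = ENCODER_LAYERS * 1 + DECODER_LAYERS * 2
adapted_modules = attention_blocks * TARGETS_PER_ATTENTION
total_params = adapted_modules * lora_params_per_module(D_MODEL, D_MODEL, RANK)
size_mib = total_params * BYTES_PER_PARAM / 2**20

print(f"adapted modules: {adapted_modules}")
print(f"trainable LoRA parameters: {total_params:,}")
print(f"approximate adapter size: {size_mib:.1f} MiB")
```

Changing RANK or TARGETS_PER_ATTENTION shows how quickly the adapter grows or shrinks relative to the full model.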

⚠️ Disclaimer (ئاگاھلاندۇرۇش)

English: This model is released for research, educational, and language preservation purposes only. The developer strongly opposes the use of this technology for mass surveillance, human rights violations, or any form of discrimination.

ئۇيغۇرچە: بۇ مودېل پەقەت تەتقىقات، مائارىپ ۋە تىلنى قوغداش مەقسىتىدە ئېلان قىلىندى. بۇ تېخنىكىنى كۆزىتىش، كىشىلىك ھوقۇققا دەخلى-تەرۇز قىلىش ياكى كەمسىتىش خاراكتېرلىك ئىشلارغا ئىشلىتىشكە قەتئىي قارشى تۇرىمىز.

How to use

Option 1: Using Transformers & PEFT (LoRA)

You can load this model using PEFT and Transformers. Since the processor is not included in this adapter-only repo, please load the processor from the base model.

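import torch
import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor
from peft import PeftModel

# 1. Setup Model IDs
base_model_id = "openai/whisper-small"
peft_model_id = "xiwol/whisper-small-uyghur"

# 2. Load Processor from the base model
# Note: We specify language and task for Uyghur ASR
processor = WhisperProcessor.from_pretrained(base_model_id, language="uyghur", task="transcribe")

# 3. Load Base Model (device_map="auto" requires the accelerate package)
base_model = WhisperForConditionalGeneration.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype=torch.float16
)

# 4. Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(base_model, peft_model_id)
model.eval()

# 5. Load audio at 16 kHz and convert it to log-mel input features
audio_array, _ = librosa.load("./common_voice_ug_38346884.mp3", sr=16000)
inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")
input_features = inputs.input_features.to(base_model.device, dtype=torch.float16)

# 6. Generate and decode
# Assumption: language="uzbek" follows the Option 2 example; adjust if needed
with torch.no_grad():
    predicted_ids = model.generate(input_features=input_features, language="uzbek")
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])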

Option 2: Full Merged Model / بىرىكتۈرۈلگەن تولۇق مودېل

The whisper-small-uyghur-merged folder contains the full, standalone model (base model with the PEFT adapter merged in). You can use this folder directly with transformers or other inference tools.

«whisper-small-uyghur-merged» قىسقۇچىدىكى مودېل بىرىكتۈرۈلگەن تولۇق مودېلدۇر. سىز بۇ قىسقۇچتىكى مودېلنى بىۋاسىتە ئىشلىتەلەيسىز.

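import torch
import librosa
from transformers import pipeline

# Build an ASR pipeline from the merged model folder; use the GPU when available
pipe = pipeline(
    "automatic-speech-recognition",
    model=r"./whisper-small-uyghur-merged",
    device="cuda" if torch.cuda.is_available() else "cpu"
)

# Load the audio as a mono float array resampled to 16 kHz, the rate Whisper expects
audio_path = r"./common_voice_ug_38346884.mp3"
audio_array, _ = librosa.load(audio_path, sr=16000)

# Whisper has no Uyghur language token, so "uzbek" is passed as the language
result = pipe(audio_array, generate_kwargs={"language": "uzbek"})
print(f"\nout: {result['text']}")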
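
For readers wondering what "merged" means in this option: merging folds the low-rank LoRA update into the base weights, so no separate adapter is needed at inference time. The toy sketch below illustrates this on a tiny matrix, assuming the standard LoRA formulation W' = W + (lora_alpha / r) * B * A. The matrices and the helper names `matmul` and `merge_lora` are hypothetical illustrations, not the script used to produce the merged folder in this repository.

```python
# Toy illustration of merging a LoRA update into a base weight matrix.
# All values are made up; scale = lora_alpha / r follows the LoRA formulation.


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, lora_alpha, r):
    """Return W + (lora_alpha / r) * (B @ A) without modifying the inputs."""
    scale = lora_alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Base weight W (2x3) and rank-1 factors B (2x1), A (1x3)
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.0, -1.0]]
LORA_ALPHA, R = 2, 1

merged = merge_lora(W, B, A, LORA_ALPHA, R)

# Check: the merged matrix gives the same output as base + scaled adapter path
x = [[1.0], [2.0], [3.0]]  # column vector
merged_out = matmul(merged, x)
base_out = matmul(W, x)
adapter_out = matmul(B, matmul(A, x))
separate_out = [[b[0] + (LORA_ALPHA / R) * a[0]] for b, a in zip(base_out, adapter_out)]

print("merged weight:", merged)
print("outputs match:", merged_out == separate_out)
```

With the real model, the same idea is applied to every adapted projection matrix, after which the result can be saved as an ordinary Transformers checkpoint.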

Option 3: Using GGML / Whisper.cpp (Desktop & Lightweight)

For convenience, I have converted the full model into GGML format (compatible with whisper.cpp). You can directly download the pre-built model files and use them via desktop applications without setting up a Python environment.

ئىشلىتىشكە قولايلىق بولسۇن ئۈچۈن، مەن پۈتۈن مودېلنى GGML مودېلىغا (whisper.cpp) ئايلاندۇرۇپ بولدۇم. سىز ggml--small-model.bin ياكى ggml-base-model.bin مودېلىنى چۈشۈرسىڭىز بولىدۇ، تۆۋەندىكى قوراللارنىڭ بىرىنى تاللاپ مودېلنى ئەكىرسىڭىزلا ئىشلىتەلەيسىز:

1. Download Models

You can download the following model files based on your needs:

ggml--small-model.bin: Recommended, better accuracy.
ggml-base-model.bin: Faster performance, lower resource usage.

2. Recommended Tools

After downloading the model, you can import it into any of the following open-source tools:

WhisperDesktop (Windows): https://github.com/gtppoplp/WhisperDesktop
Vibe (Cross-platform): https://github.com/thewh1teagle/vibe

How to Import and Use

Open the "Models Folder" and place the model files ggml--small-model.bin and ggml-base-model.bin inside.
Select your preferred model to start. Note: Please select "Uzbek" or "Turkish" as the language for Uyghur transcription (due to tool compatibility).

ئۇيغۇرچە يېتەكچى (Uyghur Guide)

1. «Models Folder»نى ئېچىپ، ggml--small-model.bin ۋە ggml-base-model.bin مودېللىرىنى شۇ يەرگە قويۇڭ.
2. ئاندىن بىر مودېلنى تاللاپ ئىشلەتسىڭىز بولىدۇ. تىل تاللاش تىزىملىكىدىن «Turkish» ياكى «Uzbek»نى تاللاڭ (ئۇيغۇرچە يېزىقنى قوللاش ئۈچۈن).

3. Select Language

After loading the model, please select "Turkish" as the transcription language for the best results with Uyghur audio (due to better token compatibility in this tool).

**ئەسكەرتىش:** تىل تاللاشتا **「Turkish」** (تۈركچە) نى تاللىسىڭىز، ئۇيغۇرچە تەلەپپۇزنى تونۇش ئۈنۈمى ئەڭ ياخشى بولىدۇ.
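
For readers who prefer the whisper.cpp command line over the desktop tools, the sketch below builds an equivalent invocation with the language choice described above. The binary name `whisper-cli`, the audio file `sample.wav`, and the helper `build_whisper_cpp_command` are assumptions for illustration. Older whisper.cpp builds name the binary `main`, and the CLI has traditionally expected 16 kHz WAV input, so check your build before running it.

```python
import shlex

# Language codes for the two choices suggested in this card
SUGGESTED_LANGUAGES = {"turkish": "tr", "uzbek": "uz"}


def build_whisper_cpp_command(model_path, audio_path, language="turkish",
                              binary="whisper-cli"):
    """Build a whisper.cpp command line (hypothetical helper).

    -m selects the GGML model file, -f the audio file, -l the language code.
    """
    if language not in SUGGESTED_LANGUAGES:
        raise ValueError(f"expected one of {sorted(SUGGESTED_LANGUAGES)}")
    return [binary, "-m", str(model_path), "-f", str(audio_path),
            "-l", SUGGESTED_LANGUAGES[language]]


cmd = build_whisper_cpp_command("ggml--small-model.bin", "sample.wav")
print(shlex.join(cmd))

# To run it for real (assumes the binary is on PATH and the files exist):
# import subprocess
# subprocess.run(cmd, check=True)
```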


Model tree for Rekipjan/whisper-small-uyghur: this model is an adapter on the base model openai/whisper-small.