Karakalpak ASR
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m for Automatic Speech Recognition (ASR) in the Karakalpak language.
Quyashbek Allanazarov
Evaluation was performed on a held-out test set.
| Metric | Score |
|---|---|
| WER (Word Error Rate) | 21.21% |
| CER (Character Error Rate) | 4.34% |
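The card does not say which tool produced these scores, so the following is only a minimal sketch of the standard definitions: WER is the word-level edit distance divided by the number of reference words, and CER is the same ratio at character level. The helper names (`edit_distance`, `wer`, `cer`) and the English sample sentences are illustrative assumptions, not part of the original evaluation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: minimum substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for a single utterance."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for a single utterance (spaces count as characters)."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Placeholder sentences, chosen only to make the arithmetic easy to follow
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER: {wer(reference, hypothesis):.2%}")
print(f"CER: {cer(reference, hypothesis):.2%}")
```

Corpus-level scores are usually computed by summing edits over all utterances and dividing by the total reference length, rather than averaging per-utterance rates; libraries such as `jiwer` handle that aggregation.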
```python
import torch
import librosa
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Load the fine-tuned processor (feature extractor + tokenizer) and model
model_id = "Quyashbek/wav2vec2-xls-r-300m-karakalpak-asr"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load the audio as mono and resample it to 16 kHz, the rate the model expects
speech, sr = librosa.load("your_audio.wav", sr=16000)
inputs = processor(speech, sampling_rate=16000, return_tensors="pt", padding=True)

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding: take the most likely token per frame, then decode to text
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```
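The `argmax` followed by `batch_decode` in the snippet above is greedy CTC decoding: pick the best token for each audio frame, merge consecutive repeats, then drop the blank token. The sketch below reproduces that collapse step on a toy example. The vocabulary, the frame IDs and `blank_id=0` are all hypothetical; the real model's vocabulary and blank (pad) token ID come from its tokenizer.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Merge consecutive duplicate IDs, then remove blanks (greedy CTC collapse)."""
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    return [token_id for token_id in merged if token_id != blank_id]


# Hypothetical toy vocabulary; ID 0 plays the role of the CTC blank
vocab = {0: "<pad>", 1: "a", 2: "l", 3: "b"}

# One predicted ID per audio frame, as torch.argmax(logits, dim=-1) would give
frame_ids = [0, 3, 3, 0, 1, 1, 2, 0, 2, 0]

token_ids = ctc_greedy_collapse(frame_ids)
text = "".join(vocab[i] for i in token_ids)
print(text)  # ball
```

The blank between the two `l` frames is what lets CTC emit a doubled letter; without it, the repeated IDs would be merged into one.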
Base model
facebook/wav2vec2-xls-r-300m