Uzbek STT Parakeet TDT 0.6B

The first open-source Parakeet-based Uzbek automatic speech recognition model.

Model Details

Property      Value
Architecture  FastConformer encoder + TDT decoder
Parameters    609M
Base Model    nvidia/parakeet-tdt-0.6b-v3
License       CC-BY-4.0

Usage

Basic Usage (TDT Decoder)

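import nemo.collections.asr as nemo_asr
from huggingface_hub import hf_hub_download

# Download the .nemo checkpoint from the Hugging Face Hub (cached locally after the first call)
model_path = hf_hub_download(
    repo_id="zafarrr/uzbek-stt-parakeet-ctc-0.6b",
    filename="parakeet_tdt_uzbek.nemo"
)

# Restore the model from the checkpoint and switch to inference mode
model = nemo_asr.models.ASRModel.restore_from(model_path)
model.eval()

# Transcribe a list of audio file paths; one result is returned per file
transcriptions = model.transcribe(["audio.wav"])
print(transcriptions[0])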
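
Depending on the installed NeMo version, `transcribe` may return plain strings or hypothesis objects that carry the transcript in a `.text` attribute. The helper below is a minimal sketch that accepts either form. The names `to_text` and `FakeHypothesis` are hypothetical and not part of NeMo; the stand-in class exists only so the example runs without a model.

```python
def to_text(result):
    """Return the transcript string from one item of a transcribe() result."""
    if isinstance(result, str):
        # Older NeMo versions return plain strings.
        return result
    # Newer versions return objects exposing the transcript as `.text`.
    text = getattr(result, "text", None)
    if text is None:
        raise TypeError(f"unsupported result type: {type(result).__name__}")
    return text


class FakeHypothesis:
    """Illustrative stand-in for a hypothesis object (not a NeMo class)."""

    def __init__(self, text):
        self.text = text


# Both result shapes yield the same string.
print(to_text("salom dunyo"))
print(to_text(FakeHypothesis("salom dunyo")))
```

With the model loaded as in the basic usage example, `to_text(transcriptions[0])` would give the transcript as a string in either case.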

Alternative: CTC Decoder

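from omegaconf import OmegaConf

# Assumes `model` has already been loaded as in the basic usage example.
# Switch decoding from the TDT decoder to the CTC decoder with batched greedy search.
model.change_decoding_strategy(
    decoder_type="ctc",
    decoding_cfg=OmegaConf.create({"strategy": "greedy_batch"}),
)
transcriptions = model.transcribe(["audio.wav"])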
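
With either decoder, `transcribe` accepts a list of paths, so a folder of recordings can be processed in one call. The following is a rough sketch under stated assumptions: `transcribe_directory` is a hypothetical helper, the files are assumed to be `.wav`, and the `batch_size` value is an arbitrary starting point to tune for the available memory.

```python
from pathlib import Path


def transcribe_directory(model, audio_dir, batch_size=4):
    """Transcribe every .wav file in audio_dir and map each path to its text."""
    # Sort for a deterministic order; transcribe() returns results in input order.
    paths = sorted(str(p) for p in Path(audio_dir).glob("*.wav"))
    if not paths:
        return {}
    results = model.transcribe(paths, batch_size=batch_size)
    # Results may be plain strings or objects with a `.text` attribute,
    # depending on the NeMo version, so fall back to the item itself.
    return {path: getattr(r, "text", r) for path, r in zip(paths, results)}
```

A call such as `transcribe_directory(model, "recordings")` would return a dictionary keyed by file path; the directory name here is only a placeholder.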

Limitations

  • Performance may degrade on noisy telephone audio
  • Uzbek language only
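
Telephone recordings are often sampled at 8 kHz, whereas NVIDIA documents its Parakeet models as taking 16 kHz mono audio. This card does not say whether the fine-tune shares that expectation, so the threshold below is an assumption. The sketch uses Python's standard `wave` module to inspect a file before transcription; `describe_wav` and `EXPECTED_RATE` are illustrative names.

```python
import wave

# Assumption: the fine-tuned model inherits the base model's 16 kHz mono input format.
EXPECTED_RATE = 16000


def describe_wav(path):
    """Report the sample rate and channel count of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    return {
        "sample_rate": rate,
        "channels": channels,
        # True when the file differs from the assumed 16 kHz mono format.
        "needs_conversion": rate != EXPECTED_RATE or channels != 1,
    }
```

Files flagged with `needs_conversion` could be resampled with an audio tool of choice before being passed to `model.transcribe`.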

Attribution

Based on nvidia/parakeet-tdt-0.6b-v3 by NVIDIA, licensed under CC-BY-4.0.
