Uzbek STT Parakeet TDT 0.6B
The first open-source Parakeet-based Uzbek automatic speech recognition model.
Model Details
| Property | Value |
|---|---|
| Architecture | FastConformer encoder + TDT decoder |
| Parameters | 609M |
| Base Model | nvidia/parakeet-tdt-0.6b-v3 |
| License | CC-BY-4.0 |
Usage
Basic Usage (TDT Decoder)
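```python
import nemo.collections.asr as nemo_asr
from huggingface_hub import hf_hub_download

# Download the .nemo checkpoint from the Hugging Face Hub (cached locally after the first call)
model_path = hf_hub_download(
    repo_id="zafarrr/uzbek-stt-parakeet-ctc-0.6b",
    filename="parakeet_tdt_uzbek.nemo",
)

# Restore the model and switch to inference mode
model = nemo_asr.models.ASRModel.restore_from(model_path)
model.eval()

# transcribe() takes a list of audio file paths and returns one result per file
transcriptions = model.transcribe(["audio.wav"])
# Depending on the NeMo version, a result may be a plain string or a hypothesis object with a .text attribute
print(transcriptions[0])
```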
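Depending on the installed NeMo version, `model.transcribe` may return plain strings or hypothesis objects that carry the transcript in a `.text` attribute. The helper below is a minimal sketch that normalizes both cases under that assumption. `to_text` and `FakeHypothesis` are hypothetical names used only for illustration and are not part of NeMo.

```python
def to_text(result):
    """Return the transcript string from a single transcribe() result."""
    if isinstance(result, str):
        return result
    # Assumption: non-string results expose the transcript as a `.text` attribute
    text = getattr(result, "text", None)
    if text is None:
        raise TypeError(f"Unsupported result type: {type(result).__name__}")
    return text


class FakeHypothesis:
    """Hypothetical stand-in for a hypothesis object, for illustration only."""

    def __init__(self, text):
        self.text = text


# Both result shapes normalize to the same plain string
texts = [to_text(r) for r in ["salom dunyo", FakeHypothesis("salom dunyo")]]
print(texts)
```

With a loaded model, the same helper could be applied as `[to_text(r) for r in model.transcribe(["audio.wav"])]`.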
Alternative: CTC Decoder
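```python
from omegaconf import OmegaConf

# Reuses the `model` object loaded in the basic usage example.
# Switch decoding from the TDT decoder to the CTC decoder with batched greedy decoding.
model.change_decoding_strategy(
    decoder_type="ctc",
    decoding_cfg=OmegaConf.create({"strategy": "greedy_batch"}),
)

transcriptions = model.transcribe(["audio.wav"])
```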
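To judge whether the CTC decoder is an acceptable alternative on your own recordings, one option is to compare both decoders' outputs against a reference transcript using word error rate (WER). The following is a minimal, dependency-free sketch and not the evaluation method used for this model. The function name `word_error_rate` and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one word of a four-word reference is missing
reference = "men bugun maktabga bordim"
wer = word_error_rate(reference, "men bugun bordim")
print(wer)
```

Running this once on the TDT output and once on the CTC output for the same audio gives a rough, like-for-like comparison. Text normalization (casing, punctuation) is left out of this sketch and would affect the numbers.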
Limitations
- Performance may degrade on noisy telephone audio
- Uzbek language only
Attribution
Based on nvidia/parakeet-tdt-0.6b-v3 by NVIDIA, licensed under CC-BY-4.0.