swecha-gonthuka-asr (ONNX)

This is an ONNX version of viswamaicoe/swecha-gonthuka-asr. It was converted automatically and uploaded using a Hugging Face Space.

Usage with Transformers.js

See the pipeline documentation for automatic-speech-recognition: https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.AutomaticSpeechRecognitionPipeline

Swecha Gonthuka ASR (Telugu)

Telugu automatic speech recognition model (wav2vec2-based), trained on the Swecha Gonthuka dataset. It is evaluated on Telugu-only test sets with Character Error Rate (CER).

Model details

Training data: Swecha Gonthuka dataset
Language: Telugu (te)
Metric: Character Error Rate (CER), computed after normalizing text to Telugu script characters and spaces.
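
The metric described above can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation script used for this model: it assumes that "Telugu script + spaces" means keeping only code points in the Telugu Unicode block (U+0C00 to U+0C7F) plus whitespace, and it counts characters as Unicode code points rather than grapheme clusters. The function names `normalize_telugu`, `edit_distance`, and `cer` are hypothetical.

```python
import re

# Assumption: normalization keeps only Telugu-block code points (U+0C00-U+0C7F)
# and whitespace. The exact rules used for the reported scores are not stated.
_NON_TELUGU = re.compile(r"[^\u0C00-\u0C7F\s]")


def normalize_telugu(text: str) -> str:
    """Drop non-Telugu characters and collapse whitespace runs to one space."""
    kept = _NON_TELUGU.sub("", text)
    return " ".join(kept.split())


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per code point."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate in percent for one reference/hypothesis pair."""
    ref = normalize_telugu(reference)
    hyp = normalize_telugu(hypothesis)
    if not ref:
        raise ValueError("reference is empty after normalization")
    return 100.0 * edit_distance(ref, hyp) / len(ref)
```

A corpus-level score is usually obtained by summing edit distances and reference lengths over all samples before dividing; whether the reported numbers were aggregated that way is not stated here.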

Evaluation results

Dataset	Test samples	CER (%)
FLEURS (te_in)	304	6.32
OpenSLR66	420	9.00
Common Voice 22 (te)	58	11.92

Note: For each dataset, we evaluated only on samples whose transcripts contain Telugu text with no English words, so that the scores reflect the model's Telugu recognition capability.
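
A filter of this kind might look like the sketch below. It is an assumption about how such a selection could be done, not the procedure actually used: a transcript is treated as containing English if it has any basic Latin letter, so English words transliterated into Telugu script would not be caught. The names `is_telugu_only`, `filter_telugu_only`, and the `text` key are hypothetical.

```python
import re

# Assumption: any basic Latin letter marks the transcript as containing English.
_LATIN = re.compile(r"[A-Za-z]")


def is_telugu_only(transcript: str) -> bool:
    """Return True when the transcript has no Latin letters."""
    return _LATIN.search(transcript) is None


def filter_telugu_only(samples, text_key="text"):
    """Keep the samples (dicts) whose transcript passes is_telugu_only."""
    return [s for s in samples if is_telugu_only(s[text_key])]
```

For example, `filter_telugu_only([{"text": "నమస్తే"}, {"text": "నమస్తే world"}])` keeps only the first sample.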

Usage

Python (Transformers)

pip install transformers torch librosa

from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="viswamaicoe/swecha-gonthuka-asr",
    feature_extractor="viswamaicoe/swecha-gonthuka-asr",
)

# From file (16 kHz mono WAV preferred)
text = pipe("audio.wav")
print(text)  # {"text": "..."}
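
If the recording is not already 16 kHz mono, it can be converted before transcription. The sketch below uses librosa (already in the install line) to resample on load, then passes the waveform to the pipeline as a dict with `raw` and `sampling_rate` keys. The helper names `to_pipeline_input` and `load_16k_mono` are hypothetical, and the code assumes multi-channel audio is shaped `(channels, samples)`, as librosa returns it.

```python
import numpy as np

TARGET_SR = 16_000  # 16 kHz, the preferred input rate noted above


def to_pipeline_input(samples, sampling_rate: int) -> dict:
    """Downmix (channels, samples) audio to mono float32 and wrap it in a dict."""
    audio = np.asarray(samples, dtype=np.float32)
    if audio.ndim == 2:
        audio = audio.mean(axis=0)  # average the channels
    return {"raw": audio, "sampling_rate": sampling_rate}


def load_16k_mono(path: str) -> dict:
    """Load an audio file with librosa, resampled to 16 kHz mono."""
    import librosa  # imported here so the helper above works without it

    audio, sr = librosa.load(path, sr=TARGET_SR, mono=True)
    return to_pipeline_input(audio, sr)


# Usage with the `pipe` object created above:
# result = pipe(load_16k_mono("audio.wav"))
# print(result["text"])
```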

Responsible and ethical use

Intended use: This model is intended for Telugu automatic speech recognition in applications such as transcription, accessibility, and language preservation. Use it in accordance with applicable laws and platform policies.
Limitations: Performance may vary with accent, dialect, noise, and recording quality. Do not rely on it as the sole source for critical or legal transcriptions without human review.
Misuse: Do not use this model to transcribe private conversations without consent, to create misleading or harmful content, or for any purpose that violates privacy, consent, or local regulations.
Bias and fairness: As with any ASR system, outputs can reflect biases present in training data. Evaluate outputs in context and consider human review for high-stakes use cases.

Citation

If you use this model in your work, please cite the Swecha Gonthuka dataset and this model:

@misc{swecha-gonthuka-asr,
  title        = {Swecha Gonthuka ASR: Telugu Speech Recognition},
  author       = {Viswam AI COE},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/viswamaicoe/swecha-gonthuka-asr}},
  note         = {Trained on Swecha Gonthuka dataset; wav2vec2-based Telugu ASR}
}

Model tree for therajasekhar/swecha-gonthuka-asr-ONNX

Base model: Harveenchadha/wav2vec2-pretrained-clsril-23-10k
Finetuned: viswamaicoe/swecha-gonthuka-asr
Quantized: this model