metadata
license: apache-2.0
tags:
  - automatic-speech-recognition
  - whisper
  - windyword
  - mongolian
  - mn
library_name: transformers
pipeline_tag: automatic-speech-recognition
language:
  - mn

WindyWord.ai STT: Mongolian Lingua (GPU, safetensors)

Transcribes Mongolian speech (Mongolic).

Note: This model replaces a previous community Mongolian fine-tune that failed our audit (index error on the first sample). It is now derived from Ganaa0614/whisper-small-mongolian-ver_0.1, the top community Mongolian Whisper model by downloads. Post-upload verification on 20 samples of FLEURS mn_mn gave WER 57.7%, CER 17.3%, and script-match 100%, which places it in the MARGINAL tier (functional, with hesitation on rare vocabulary). The tokenizer and preprocessor files are taken from openai/whisper-small because the upstream fine-tune omits them.

Quality

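  • FLEURS WER: 100.0% (50-sample audit)
  • CER: 1.0
  • Tier: UNUSABLE-GAP ⭐
  • Source: WindyWord Grand Rounds v2 audit (50-sample FLEURS). These figures disagree with the post-upload verification reported in the note above (WER 57.7% / CER 17.3% on 20 samples, MARGINAL tier) and may describe the earlier fine-tune that was replaced.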
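
The WER and CER figures quoted in this card are edit-distance ratios: the number of word-level (or character-level) substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The sketch below shows that calculation in plain Python. It is an illustration only: the audit's actual tooling and text normalization are not documented here, and the function names and the sample sentences are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical sample pair (not taken from FLEURS): one character is dropped.
reference = "сайн байна уу"
hypothesis = "сайн байн уу"
print(f"WER = {wer(reference, hypothesis):.3f}, CER = {cer(reference, hypothesis):.3f}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, and a single wrong character costs a whole word in WER but only one unit in CER. That is one reason a model can show a high WER alongside a much lower CER.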

About this variant

This is the safetensors deployment format of our Mongolian Lingua STT model. Load it via the safetensors/ subfolder.

Part of the WindyWord.ai STT fleet, which covers 35+ languages that commercial speech-to-text APIs underserve, with dialect and script disclosures where they matter.

Usage

# Load the processor and model weights from the safetensors/ subfolder.
from transformers import WhisperForConditionalGeneration, WhisperProcessor

repo_id = "WindyWord/listen-windy-lingua-mn"
processor = WhisperProcessor.from_pretrained(repo_id, subfolder="safetensors")
model = WhisperForConditionalGeneration.from_pretrained(repo_id, subfolder="safetensors")
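
Loading the processor and model is only the first step. A minimal sketch of a full transcription call might look like the following. It assumes the input is a mono waveform already resampled to 16 kHz (the rate the Whisper feature extractor expects) and that PyTorch is installed. The `transcribe` helper and its parameter names are illustrative and not part of any WindyWord API. Decoding uses the model's default generation settings; whether the language can be forced explicitly depends on the generation config shipped with the weights, so that is left out here.

```python
REPO_ID = "WindyWord/listen-windy-lingua-mn"  # repo id from the snippet above
SUBFOLDER = "safetensors"                     # subfolder named in this card


def transcribe(audio, sampling_rate=16000):
    """Transcribe a 1-D float waveform (mono, 16 kHz) and return the text.

    `audio` is assumed to be a NumPy array or a list of floats.
    """
    # Heavy dependencies are imported inside the function so that defining
    # the helper does not trigger a model download or a GPU initialisation.
    import torch
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(REPO_ID, subfolder=SUBFOLDER)
    model = WhisperForConditionalGeneration.from_pretrained(REPO_ID, subfolder=SUBFOLDER)
    model.eval()

    # Convert the waveform into log-mel input features.
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")

    # Generate token ids without tracking gradients, then decode to text.
    with torch.no_grad():
        predicted_ids = model.generate(inputs.input_features)
    return processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
```

In practice you would load the processor and model once and reuse them across calls; they are loaded inside the function here only to keep the sketch self-contained.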

Commercial Use

Visit windyword.ai for apps and API access.


Provenance & License

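Weights derived from the upstream community Whisper fine-tune Ganaa0614/whisper-small-mongolian-ver_0.1, with tokenizer and preprocessor files from openai/whisper-small (see the note at the top of this card). Redistributed under Apache-2.0 (inherited).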

Certified by Opus 4.6 Opus-Claw (Dr. C) on Veron-1 (RTX 5090, Mt Pleasant SC).