---
title: Mega-ASR — pure browser ASR
emoji: 🎙️
colorFrom: red
colorTo: blue
sdk: static
pinned: false
license: apache-2.0
short_description: Robust in-the-wild ASR running entirely in the browser
models:
  - zhifeixie/Mega-ASR
  - Reza2kn/mega-asr-onnx
datasets:
  - xzf-thu/Voices-in-the-Wild-Bench
tags:
  - automatic-speech-recognition
  - robust-asr
  - mega-asr
  - onnxruntime-web
  - webgpu
  - browser
  - benchmark
  - wer
---

# Mega-ASR — pure browser ASR

Live demo of [Mega-ASR](https://huggingface.co/zhifeixie/Mega-ASR) (1.7B-param robust multilingual ASR) running **entirely in your browser** via `onnxruntime-web` and WebGPU. No server-side inference: your audio never leaves the device.

The INT4 ONNX deployment artifacts (~2 GB total: audio encoder + decoder prefill + decoder step + INT8 embedding table) ship at [Reza2kn/mega-asr-onnx](https://huggingface.co/Reza2kn/mega-asr-onnx). They are downloaded on the first visit, then cached by the browser for subsequent runs.

Pre-loaded examples come from [Voices-in-the-Wild-Bench](https://github.com/xzf-thu/Voices-in-the-Wild-Bench): eight noisy clips covering noise, far-field speech, obstruction, distortion, recording artifacts, echo, dropout, and a mixed condition. Each example ships with its reference transcript, so the agreement score is computed automatically.

**Agreement bands** (word-level, 1 - WER):

- 🟢 ≥70 %
- 🟠 50-70 %
- 🟡 25-50 %
- 🔴 <25 %
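
For readers who want to reproduce the score by hand, here is a minimal sketch of a word-level agreement computation (1 - WER) and its mapping to the bands above. It is an illustration, not the demo's actual code. The function names (`tokenize`, `wordErrorRate`, `agreement`, `agreementBand`) are hypothetical. The text normalization (lowercasing and whitespace splitting), the clamping of agreement at 0, the handling of an empty reference, and the assignment of exact boundary values (50 % and 25 %) to the higher band are all assumptions.

```typescript
// Assumed normalization: lowercase, then split on whitespace.
function tokenize(text: string): string[] {
  return text.toLowerCase().trim().split(/\s+/).filter((w) => w.length > 0);
}

// Word error rate: word-level Levenshtein distance divided by the
// number of reference words.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) {
    // Assumption: an empty reference scores 0 only against an empty hypothesis.
    return hyp.length === 0 ? 0 : 1;
  }
  // prev[j] = edit distance between the first i-1 reference words
  // and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j]! + 1, // deletion
          curr[j - 1]! + 1, // insertion
          prev[j - 1]! + cost, // substitution or match
        ),
      );
    }
    prev = curr;
  }
  return prev[hyp.length]! / ref.length;
}

// Agreement = 1 - WER, clamped at 0 because WER can exceed 1
// when the hypothesis contains many inserted words (assumption).
function agreement(reference: string, hypothesis: string): number {
  return Math.max(0, 1 - wordErrorRate(reference, hypothesis));
}

// Map an agreement score in [0, 1] to the bands listed above.
function agreementBand(score: number): string {
  const pct = score * 100;
  if (pct >= 70) return "🟢";
  if (pct >= 50) return "🟠";
  if (pct >= 25) return "🟡";
  return "🔴";
}

// One substituted word out of four: WER = 0.25, agreement = 0.75.
console.log(agreementBand(agreement("turn the lights off", "turn the light off"))); // prints "🟢"
```

The two-row dynamic-programming table keeps memory proportional to the hypothesis length, which is enough for short clips like the bundled examples.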