---
title: Mega-ASR — pure browser ASR
emoji: 🎙️
colorFrom: red
colorTo: blue
sdk: static
pinned: false
license: apache-2.0
short_description: Robust in-the-wild ASR running entirely in the browser
models:
- zhifeixie/Mega-ASR
- Reza2kn/mega-asr-onnx
datasets:
- xzf-thu/Voices-in-the-Wild-Bench
tags:
- automatic-speech-recognition
- robust-asr
- mega-asr
- onnxruntime-web
- webgpu
- browser
- benchmark
- wer
---

# Mega-ASR — pure browser ASR

Live demo of [Mega-ASR](https://huggingface.co/zhifeixie/Mega-ASR), a
1.7B-parameter robust multilingual ASR model, running **entirely in your
browser** via `onnxruntime-web` and WebGPU. No inference runs server-side, so
your audio never leaves the device.

The INT4 ONNX deployment artifacts (~2 GB total: audio encoder, decoder
prefill, decoder step, and an INT8 embedding table) are hosted at
[Reza2kn/mega-asr-onnx](https://huggingface.co/Reza2kn/mega-asr-onnx). They are
downloaded on the first visit and then cached by the browser, so subsequent
runs do not download them again.
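
The download-once-then-cache behaviour described above can be sketched as a small cache-first loader. This is an illustrative sketch only, not the demo's actual code: `BlobStore`, `Downloader`, `loadModelFile`, and `MemoryStore` are hypothetical names, and the storage backend is left abstract. In a browser it might wrap the Cache API or IndexedDB, and the demo may rely on a different mechanism altogether.

```typescript
// Hypothetical async key-value store. In a browser this could be backed by
// the Cache API or IndexedDB; here it is only an interface.
interface BlobStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, bytes: Uint8Array): Promise<void>;
}

// Hypothetical downloader: turns a URL into the file's bytes.
type Downloader = (url: string) => Promise<Uint8Array>;

// Cache-first load: return the stored copy if one exists, otherwise
// download the file once and store it for later visits.
async function loadModelFile(
  url: string,
  store: BlobStore,
  download: Downloader
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return cached; // later visits: no network traffic
  }
  const bytes = await download(url); // first visit: fetch over the network
  await store.put(url, bytes);
  return bytes;
}

// In-memory store, used here only to make the sketch self-contained.
class MemoryStore implements BlobStore {
  private items: { [key: string]: Uint8Array } = {};

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.items[key];
  }

  async put(key: string, bytes: Uint8Array): Promise<void> {
    this.items[key] = bytes;
  }
}
```

Keeping the store behind an interface means the same loader can be exercised with the in-memory store in tests and with a persistent browser store in production.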

The pre-loaded examples come from
[Voices-in-the-Wild-Bench](https://github.com/xzf-thu/Voices-in-the-Wild-Bench):
eight clips covering noise, far-field speech, obstruction, distortion,
recording artifacts, echo, dropout, and a mixed condition. Each example ships
with its reference transcript, so the agreement score is computed automatically.

**Agreement bands** (word-level agreement, defined as 1 − WER, where WER is the
word error rate against the reference transcript):

- 🟢 ≥ 70%
- 🟠 50–70%
- 🟡 25–50%
- 🔴 < 25%
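
As a rough illustration of how a score like this can be computed, the sketch below derives word-level agreement as 1 − WER from a word-level Levenshtein distance and maps it to the four bands. Several details are assumptions rather than facts about the demo: the text normalisation (lowercasing, stripping common punctuation), clamping the score at 0 when WER exceeds 1, and assigning the shared boundaries (50% and 25%) to the higher band. The function names are hypothetical.

```typescript
// Assumed normalisation: lowercase, drop common punctuation, split on whitespace.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[.,!?;:"()\[\]{}]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

// Word-level Levenshtein distance (substitutions, insertions, deletions),
// computed with a rolling row of the dynamic-programming table.
function wordEditDistance(ref: string[], hyp: string[]): number {
  let prev: number[] = [];
  for (let j = 0; j <= hyp.length; j++) prev.push(j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,       // deletion
        curr[j - 1] + 1,   // insertion
        prev[j - 1] + cost // match or substitution
      );
    }
    prev = curr;
  }
  return prev[hyp.length];
}

// Agreement = 1 - WER, clamped at 0 (WER can exceed 1 when the
// hypothesis is much longer than the reference).
function agreement(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) return hyp.length === 0 ? 1 : 0;
  const wer = wordEditDistance(ref, hyp) / ref.length;
  return Math.max(0, 1 - wer);
}

type Band = "green" | "orange" | "yellow" | "red";

// Boundary values go to the higher band (an assumption).
function band(score: number): Band {
  if (score >= 0.7) return "green";
  if (score >= 0.5) return "orange";
  if (score >= 0.25) return "yellow";
  return "red";
}
```

For example, a four-word reference with one substituted word gives WER = 1/4, so the agreement is 0.75 and the clip lands in the green band.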