---
license: cc-by-4.0
language:
- en
- es
- it
- fr
- de
- nl
- ru
- pl
- uk
- sk
- bg
- fi
- ro
- hr
- cs
- sv
- et
- hu
- lt
- da
- mt
- sl
- lv
- el
pipeline_tag: automatic-speech-recognition
thumbnail: null
tags:
- automatic-speech-recognition
- speech
- audio
- Transducer
- TDT
- FastConformer
- Conformer
- multilingual
- NeMo
- OpenVINO
base_model:
- nvidia/parakeet-tdt-1.1b
---

# Parakeet TDT 1.1B V3 - OpenVINO

[Discord](https://discord.gg/WNsvaCtmDe)
[GitHub](https://github.com/FluidInference/eddy)

OpenVINO-optimized version of NVIDIA's Parakeet TDT 1.1B V3 model for high-performance multilingual automatic speech recognition on Intel NPUs and CPUs.

## Benchmark Results

**Hardware**: Intel Core Ultra 7 155H (Meteor Lake) with Intel AI Boost NPU
**Software**: OpenVINO 2025.x

### LibriSpeech test-clean (English)

| | Metric | Value | |
| |--------|-------| |
| | **Average WER** | 3.7% | |
| | **Median WER** | 0.0% | |
| | **Average CER** | 1.9% | |
| | **RTFx (NPU)** | 25.7× | |
| | **RTFx (CPU)** | 5-8× | |
| | **Files processed** | 2,620 (5.4 hours) | |

### FLEURS Multilingual (24 Languages)

| | Metric | Value | |
| |--------|-------| |
| | **Average WER** | 17.0% | |
| | **Average CER** | 5.4% | |
| | **Average RTFx** | 41.1× | |
| | **Total samples** | ~15,000+ | |

**Best performing languages** (WER): Italian 4.3%, Spanish 5.4%, English 6.1%, German 7.4%, French 7.7%

See [BENCHMARK_RESULTS.md](https://github.com/FluidInference/eddy/blob/main/BENCHMARK_RESULTS.md) for complete per-language results.
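
For readers unfamiliar with the metrics in the tables above, the following is a minimal sketch of how WER, CER, and RTFx are commonly computed. It assumes per-file scores that are then averaged, and it skips the text normalization a real benchmark would apply; the function names are illustrative and are not taken from the eddy codebase.

```python
from statistics import mean, median


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def rtfx(audio_seconds, processing_seconds):
    """Inverse real-time factor: seconds of audio processed per second of compute."""
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    # Hypothetical (reference, hypothesis) pairs, for illustration only.
    pairs = [
        ("the cat sat", "the cat sat"),
        ("hello world", "hello world"),
        ("a quick brown fox", "a quick brown dog"),
    ]
    scores = [wer(ref, hyp) for ref, hyp in pairs]
    print("average WER:", mean(scores))
    print("median WER:", median(scores))
```

When most files are transcribed without any error, the median per-file WER is zero even though the average is not, which is how a median of 0.0% can sit alongside a non-zero average.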

## Performance Comparison

| | Implementation | Device | RTFx (Avg) | WER (LibriSpeech) | |
| |----------------|--------|------------|-------------------| |
| | **eddy (OpenVINO)** | Intel Core Ultra 7 155H NPU | **25.7×** | 3.7% | |
| | Parakeet (PyTorch) | Intel Arc 140V GPU | ~20×* | ~2.5%* | |
| | **eddy (OpenVINO)** | Intel Core Ultra 7 155H CPU | **5-8×** | 3.7% | |

> **Note**: Benchmarked on an HP EliteBook Ultra G1i. The eddy NPU path is ~1.3× faster than PyTorch on the Intel Arc GPU (25.7× vs. ~20× RTFx), with lower power consumption. \*PyTorch figures for V3 are estimated from a V2 benchmark.

## Supported Languages

**24 European languages**: English, Spanish, Italian, French, German, Dutch, Russian, Polish, Ukrainian, Slovak, Bulgarian, Finnish, Romanian, Croatian, Czech, Swedish, Estonian, Hungarian, Lithuanian, Danish, Maltese, Slovenian, Latvian, Greek

## Usage

Python usage is available via ctypes. See the [eddy repository](https://github.com/FluidInference/eddy) for details.
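
Before loading the model, it can help to check which OpenVINO devices are visible on the machine. The sketch below is not the eddy API; it only uses OpenVINO's own `Core().available_devices` property, and the helper names `detect_devices` and `pick_device` are hypothetical. It prefers the NPU and falls back to the CPU, matching the two targets this model card describes.

```python
def detect_devices():
    """Return OpenVINO device names, or [] if the openvino package is missing."""
    try:
        import openvino as ov
    except ImportError:
        return []
    return list(ov.Core().available_devices)


def pick_device(available, preferred=("NPU", "CPU")):
    """Return the first preferred device family present in `available`.

    OpenVINO may report indexed names such as "GPU.0", so a name matches
    if it equals the family or starts with the family followed by a dot.
    """
    for family in preferred:
        if any(dev == family or dev.startswith(family + ".") for dev in available):
            return family
    raise RuntimeError(f"none of {preferred} found in {list(available)}")


if __name__ == "__main__":
    devices = detect_devices()
    if devices:
        print("available:", devices)
        print("selected:", pick_device(devices))
    else:
        print("OpenVINO is not installed or reported no devices")
```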

## Model Details

- **Parameters**: 1.1B
- **Architecture**: FastConformer-TDT (Token-and-Duration Transducer, an RNN-T variant; 4-model pipeline)
- **Languages**: 24 European languages
- **Blank token ID**: 8192
- **Context window**: 10 s chunks with 3 s overlap
- **Features**: LSTM state continuity, token deduplication, per-token timestamps
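
To make the chunking parameters above concrete, here is a minimal sketch of how 10 s windows with 3 s overlap tile a longer recording, plus one simple way to drop duplicate tokens in the overlap using per-token timestamps. Both functions are illustrative assumptions, not the eddy implementation: the actual windowing and deduplication logic may differ.

```python
def chunk_windows(total_seconds, chunk_seconds=10.0, overlap_seconds=3.0):
    """Return (start, end) windows in seconds covering the whole recording."""
    if overlap_seconds >= chunk_seconds:
        raise ValueError("overlap must be shorter than the chunk length")
    if total_seconds <= 0:
        return []
    stride = chunk_seconds - overlap_seconds  # 7 s with the defaults
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_seconds, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break
        start += stride
    return windows


def merge_by_timestamp(chunks):
    """Merge per-chunk token lists, skipping tokens already covered.

    `chunks` is a list of lists of (absolute_time_seconds, token_id),
    in chunk order. A token from a later chunk is kept only if its
    timestamp is after the last token kept from earlier chunks.
    """
    merged = []
    for tokens in chunks:
        cutoff = merged[-1][0] if merged else float("-inf")
        merged.extend((t, tok) for t, tok in tokens if t > cutoff)
    return merged


if __name__ == "__main__":
    for window in chunk_windows(25.0):
        print(window)
```

With the default parameters each new window starts 7 s after the previous one, so consecutive windows share 3 s of audio and tokens in that shared region are decoded twice, which is why a deduplication step is needed.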

## License

CC-BY-4.0 - See [LICENSE](LICENSE) for details.

## Links

- **GitHub**: [FluidInference/eddy](https://github.com/FluidInference/eddy)
- **Base Model**: [nvidia/parakeet-tdt-1.1b](https://huggingface.co/nvidia/parakeet-tdt-1.1b)
- **Documentation**: [Benchmark Results](https://github.com/FluidInference/eddy/blob/main/BENCHMARK_RESULTS.md)

## Acknowledgments

Based on NVIDIA's Parakeet TDT model. OpenVINO conversion and optimization by the FluidInference team.