---
license: cc-by-4.0
language:
- en
- es
- it
- fr
- de
- nl
- ru
- pl
- uk
- sk
- bg
- fi
- ro
- hr
- cs
- sv
- et
- hu
- lt
- da
- mt
- sl
- lv
- el
pipeline_tag: automatic-speech-recognition
thumbnail: null
tags:
- automatic-speech-recognition
- speech
- audio
- Transducer
- TDT
- FastConformer
- Conformer
- multilingual
- NeMo
- OpenVINO
base_model:
- nvidia/parakeet-tdt-0.6b-v3
---
# Parakeet TDT 0.6B V3 - OpenVINO
[![Discord](https://img.shields.io/badge/Discord-Join%20Chat-7289da.svg)](https://discord.gg/WNsvaCtmDe)
[![GitHub Repo stars](https://img.shields.io/github/stars/FluidInference/eddy?style=flat&logo=github)](https://github.com/FluidInference/eddy)
OpenVINO-optimized version of NVIDIA's Parakeet TDT 0.6B V3 model for high-performance multilingual automatic speech recognition on Intel NPUs and CPUs.
## Benchmark Results
- **Hardware**: Intel Core Ultra 7 155H (Meteor Lake) with Intel AI Boost NPU
- **Software**: OpenVINO 2025.x
### LibriSpeech test-clean (English)
| Metric | Value |
|--------|-------|
| **Average WER** | 3.7% |
| **Median WER** | 0.0% |
| **Average CER** | 1.9% |
| **RTFx (NPU)** | 25.7× |
| **RTFx (CPU)** | 5-8× |
| **Files processed** | 2,620 (5.4 hours) |
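
The metrics in this table can be illustrated with a minimal sketch. It assumes WER and CER are edit distances normalized by reference length, computed per file and then averaged, and that RTFx is audio duration divided by processing time. The function names and sample transcripts are hypothetical and are not taken from the eddy benchmark code, which may also normalize text differently.

```python
from statistics import mean, median


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def rtfx(audio_seconds, processing_seconds):
    """Inverse real-time factor: seconds of audio handled per second of compute."""
    return audio_seconds / processing_seconds


# Hypothetical (reference, hypothesis) pairs: two perfect files, one with an error.
pairs = [
    ("the cat sat on the mat", "the cat sat on the mat"),
    ("hello world", "hello word"),
    ("a b c d", "a b c d"),
]
per_file = [wer(ref, hyp) for ref, hyp in pairs]
print(f"average WER: {mean(per_file):.1%}")   # 16.7%
print(f"median WER:  {median(per_file):.1%}")  # 0.0%
```

The toy data shows why a median WER of 0.0% can coexist with a non-zero average: when more than half of the files are transcribed perfectly, the median is zero and the average is driven by the remaining files.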
### FLEURS Multilingual (24 Languages)
| Metric | Value |
|--------|-------|
| **Average WER** | 17.0% |
| **Average CER** | 5.4% |
| **Average RTFx** | 41.1× |
| **Total samples** | ~15,000+ |

**Best performing languages** (WER): Italian 4.3%, Spanish 5.4%, English 6.1%, German 7.4%, French 7.7%

See [BENCHMARK_RESULTS.md](https://github.com/FluidInference/eddy/blob/main/BENCHMARK_RESULTS.md) for complete per-language results.
## Performance Comparison
| Implementation | Device | RTFx (Avg) | WER (LibriSpeech) |
|----------------|--------|------------|-------------------|
| **eddy (OpenVINO)** | Intel Core Ultra 7 155H NPU | **25.7×** | 3.7% |
| Parakeet (PyTorch) | Intel Arc 140V GPU | ~20×* | ~2.5%* |
| **eddy (OpenVINO)** | Intel Core Ultra 7 155H CPU | **5-8×** | 3.7% |
> **Note**: Benchmarked on an HP EliteBook Ultra G1i. On the NPU, eddy is ~1.3× faster than PyTorch on the Intel Arc GPU (25.7× vs. ~20×), with lower power consumption. Values marked \* are V3 estimates derived from the V2 benchmark.
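
To make the RTFx figures concrete, the short sketch below converts them into a relative speedup and into wall-clock time for the 5.4-hour LibriSpeech test-clean set. It only does arithmetic on the numbers quoted in this card; the helper names are hypothetical.

```python
def speedup(rtfx_a, rtfx_b):
    """How many times faster implementation A is than implementation B."""
    return rtfx_a / rtfx_b


def wall_clock_minutes(audio_hours, rtfx):
    """Minutes needed to transcribe `audio_hours` of audio at a given RTFx."""
    return audio_hours * 60.0 / rtfx


# NPU (25.7x) versus the estimated PyTorch GPU figure (~20x).
print(round(speedup(25.7, 20.0), 3))            # 1.285, i.e. roughly 1.3x

# Time to process 5.4 hours of audio on each device.
print(round(wall_clock_minutes(5.4, 25.7), 1))  # 12.6 minutes on the NPU
print(round(wall_clock_minutes(5.4, 8.0), 1))   # 40.5 minutes at the CPU's best case
print(round(wall_clock_minutes(5.4, 5.0), 1))   # 64.8 minutes at the CPU's worst case
```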
## Supported Languages
**24 European languages**: English, Spanish, Italian, French, German, Dutch, Russian, Polish, Ukrainian, Slovak, Bulgarian, Finnish, Romanian, Croatian, Czech, Swedish, Estonian, Hungarian, Lithuanian, Danish, Maltese, Slovenian, Latvian, Greek
## Usage
Python usage is available via ctypes; see the [eddy repository](https://github.com/FluidInference/eddy) for details.
## Model Details
- **Parameters**: 0.6B
- **Architecture**: FastConformer-RNNT (4-model pipeline)
- **Languages**: 24 European languages
- **Blank token ID**: 8192
- **Context window**: 10s chunks with 3s overlap
- **Features**: LSTM state continuity, token deduplication, per-token timestamps
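
The chunking and deduplication items above can be sketched as follows. This is a minimal illustration, assuming a fixed 7 s stride (10 s chunks minus 3 s overlap) and a simple rule in which each chunk keeps only the tokens that fall in its own half of every overlap region. The function names and the midpoint rule are assumptions made for illustration; eddy's actual deduplication logic may differ.

```python
BLANK_ID = 8192          # blank token ID from the model details
CHUNK_SECONDS = 10.0     # chunk length from the model details
OVERLAP_SECONDS = 3.0    # overlap between consecutive chunks


def chunk_spans(total_seconds, chunk=CHUNK_SECONDS, overlap=OVERLAP_SECONDS):
    """Return (start, end) times in seconds of overlapping chunks covering the audio."""
    if overlap >= chunk:
        raise ValueError("overlap must be shorter than the chunk length")
    if total_seconds <= 0:
        return []
    stride = chunk - overlap
    spans = []
    start = 0.0
    while True:
        end = min(start + chunk, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            break
        start += stride
    return spans


def merge_chunk_tokens(chunk_tokens, spans, overlap=OVERLAP_SECONDS):
    """Merge per-chunk tokens, dropping blanks and duplicates from overlaps.

    chunk_tokens[i] is a list of (token_id, absolute_time_seconds) for spans[i].
    Each chunk "owns" the interval between the midpoints of its overlaps
    (an illustrative rule, not necessarily the one eddy uses).
    """
    merged = []
    last = len(spans) - 1
    for i, ((start, end), tokens) in enumerate(zip(spans, chunk_tokens)):
        lo = start + overlap / 2 if i > 0 else start
        hi = end - overlap / 2 if i < last else float("inf")
        merged.extend(
            (tok, t) for tok, t in tokens if tok != BLANK_ID and lo <= t < hi
        )
    return merged


print(chunk_spans(25.0))
# [(0.0, 10.0), (7.0, 17.0), (14.0, 24.0), (21.0, 25.0)]

spans = chunk_spans(17.0)  # two chunks: (0, 10) and (7, 17)
tokens = [
    [(1, 1.0), (2, 8.0), (3, 9.0)],                  # chunk 0
    [(2, 8.0), (3, 9.0), (BLANK_ID, 9.5), (4, 12.0)],  # chunk 1 re-emits 2 and 3
]
print(merge_chunk_tokens(tokens, spans))
# [(1, 1.0), (2, 8.0), (3, 9.0), (4, 12.0)]
```

Splitting each overlap at its midpoint keeps tokens away from chunk edges, where the encoder has the least context, while guaranteeing that every timestamp belongs to exactly one chunk.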
## License
CC-BY-4.0 - See [LICENSE](LICENSE) for details.
## Links
- **GitHub**: [FluidInference/eddy](https://github.com/FluidInference/eddy)
- **Base Model**: [nvidia/parakeet-tdt-0.6b-v3](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3)
- **Documentation**: [Benchmark Results](https://github.com/FluidInference/eddy/blob/main/BENCHMARK_RESULTS.md)
## Acknowledgments
Based on NVIDIA's Parakeet TDT model. OpenVINO conversion and optimization by the FluidInference team.