FunASR Paraformer ASR Model
Paraformer-large model for automatic speech recognition (ASR). It supports Mandarin, Cantonese, and English, and expects 16 kHz audio input.
Original Source: ModelScope
License: Apache 2.0
Usage: This model is used by Step Audio EditX for VQ02 audio tokenization.
Model Details
- Architecture: Paraformer (Non-autoregressive Transformer)
- Languages: Mandarin, Cantonese, English
- Sample Rate: 16 kHz
- Size: ~881 MB
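Because the model expects 16 kHz audio, it can help to check a file's sample rate before passing it to the recognizer. The following is a minimal sketch using only the Python standard library; it assumes PCM WAV input, and the helper names `wav_sample_rate` and `matches_model_rate` are hypothetical illustrations, not part of FunASR or Step Audio EditX.

```python
import wave

# Sample rate stated in this model card (16 kHz).
EXPECTED_RATE_HZ = 16_000


def wav_sample_rate(path: str) -> int:
    """Return the sample rate in Hz recorded in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_model_rate(path: str, expected_hz: int = EXPECTED_RATE_HZ) -> bool:
    """Return True if the file's sample rate equals the expected rate."""
    return wav_sample_rate(path) == expected_hz
```

A file for which `matches_model_rate` returns False would likely need resampling to 16 kHz with an audio tool of your choice before recognition.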
Citation
@inproceedings{gao2022paraformer,
title={Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition},
author={Gao, Zhifu and Zhang, Shiliang and McLoughlin, Ian and Yan, Zhijie},
booktitle={INTERSPEECH},
year={2022}
}