---
library_name: pytorch
license: other
tags:
- real_time
- android
pipeline_tag: text-to-audio
---

![](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/web-assets/model_demo.png)

# MeloTTS-ES: Optimized for Qualcomm Devices

MeloTTS is a high-quality multilingual text-to-speech library supporting English, Chinese, and Spanish. This model is based on the MeloTTS-ES implementation found [here](https://github.com/myshell-ai/MeloTTS).

This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/blob/main/src/qai_hub_models/models/melotts_es) library to export with custom configurations. More details on model performance across various devices can be found [here](#performance-summary).

Qualcomm AI Hub Models uses [Qualcomm AI Hub Workbench](https://workbench.aihub.qualcomm.com) to compile, profile, and evaluate this model. [Sign up](https://myaccount.qualcomm.com/signup) to run these models on a hosted Qualcomm® device.

## Getting Started

There are two ways to deploy this model on your device:

### Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

| Runtime | Precision | Chipset | SDK Versions | Download |
|---|---|---|---|---|
| VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_snapdragon_8_elite_gen5.zip) |
| VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_snapdragon_x2_elite.zip) |
| VOICE_AI | mixed_with_float | Snapdragon® X Elite | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_snapdragon_x_elite.zip) |
| VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_snapdragon_8gen3.zip) |
| VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_qcs8275.zip) |
| VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_qcs8550_proxy.zip) |
| VOICE_AI | mixed_with_float | Qualcomm® SA8775P | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_sa8775p.zip) |
| VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_snapdragon_8_elite_for_galaxy.zip) |
| VOICE_AI | mixed_with_float | Qualcomm® SA7255P | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_sa7255p.zip) |
| VOICE_AI | mixed_with_float | Qualcomm® SA8295P | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_sa8295p.zip) |
| VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_qcs9075.zip) |
| VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | QAIRT 2.45 | [Download](https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models/melotts_es/releases/v0.51.0/melotts_es-voice_ai-mixed_with_float-qualcomm_qcs8450_proxy.zip) |

For more device-specific assets and performance metrics, visit **[MeloTTS-ES on Qualcomm® AI Hub](https://aihub.qualcomm.com/models/melotts_es)**.

### Option 2: Export with Custom Configurations

Use the [Qualcomm® AI Hub Models](https://github.com/qualcomm/ai-hub-models/blob/main/src/qai_hub_models/models/melotts_es) Python library to compile and export the model with your own:
- Custom weights (e.g., fine-tuned checkpoints)
- Custom input shapes
- Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See [MeloTTS-ES on GitHub](https://github.com/qualcomm/ai-hub-models/blob/main/src/qai_hub_models/models/melotts_es) for usage instructions.
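
The download links listed under Option 1 all follow one URL pattern (model ID, release tag, runtime, precision, chipset slug), so fetching an asset can be scripted. The following is a minimal sketch using only the Python standard library. The function names `asset_url` and `download_and_extract` and the output directory layout are illustrative assumptions, not part of the Qualcomm® AI Hub Models library, and the contents of each archive are not described here.

```python
import urllib.request
import zipfile
from pathlib import Path

# Values copied from the Option 1 download links; constant names are illustrative.
BASE_URL = "https://qaihub-public-assets.s3.us-west-2.amazonaws.com/qai-hub-models/models"
MODEL_ID = "melotts_es"
RELEASE = "v0.51.0"
RUNTIME = "voice_ai"
PRECISION = "mixed_with_float"


def asset_url(chipset_slug: str) -> str:
    """Build the download URL for one chipset, following the pattern in the table."""
    filename = f"{MODEL_ID}-{RUNTIME}-{PRECISION}-{chipset_slug}.zip"
    return f"{BASE_URL}/{MODEL_ID}/releases/{RELEASE}/{filename}"


def download_and_extract(chipset_slug: str, dest: Path) -> Path:
    """Download the archive for a chipset and unpack it under dest/<chipset_slug>."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{chipset_slug}.zip"
    urllib.request.urlretrieve(asset_url(chipset_slug), archive)
    target = dest / chipset_slug
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target
```

For example, `download_and_extract("qualcomm_snapdragon_8gen3", Path("melotts_es_assets"))` would fetch the Snapdragon® 8 Gen 3 Mobile archive. Chipset slugs should be copied from the download links rather than guessed, since some carry a `_proxy` suffix and others do not.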

## Model Details

**Model Type:** Audio generation

**Model Stats:**
- Model checkpoint: myshell-ai/MeloTTS-Spanish
- Max decoded sequence length: 512 tokens
- Number of parameters (encoder): 8.36M
- Model size (encoder) (float): 32.0 MB
- Number of parameters (flow): 20.1M
- Model size (flow) (float): 76.9 MB
- Number of parameters (decoder): 14.5M
- Model size (decoder) (float): 55.5 MB
- Number of parameters (t5_encoder): 15.1M
- Model size (t5_encoder) (float): 57.5 MB
- Number of parameters (t5_decoder): 5.72M
- Model size (t5_decoder) (float): 21.8 MB

## Performance Summary

| Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit |
|---|---|---|---|---|---|---|
| decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 43.603 | 0 - 10 | NPU |
| decoder | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 40.761 | 0 - 0 | NPU |
| decoder | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 82.709 | 0 - 0 | NPU |
| decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 61.145 | 0 - 8 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 134.528 | 0 - 9 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 85.581 | 0 - 2 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 83.845 | 0 - 9 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 83.281 | 0 - 2 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 113.955 | 1 - 9 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 134.528 | 0 - 9 | NPU |
| decoder | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 100.024 | 0 - 6 | NPU |
| decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 47.962 | 0 - 9 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 19.517 | 4 - 13 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 20.57 | 4 - 4 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 32.975 | 4 - 4 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 25.204 | 4 - 11 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 52.553 | 2 - 10 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 34.626 | 4 - 5 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 36.213 | 2 - 11 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 35.679 | 4 - 9 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 44.062 | 4 - 13 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 52.553 | 2 - 10 | NPU |
| encoder | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 40.778 | 0 - 5 | NPU |
| encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 20.302 | 2 - 15 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 64.613 | 2 - 11 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 61.925 | 2 - 2 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 122.177 | 2 - 2 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 90.362 | 2 - 10 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 235.037 | 2 - 11 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 123.207 | 3 - 5 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 121.083 | 2 - 11 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 120.456 | 4 - 8 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 205.615 | 2 - 12 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 235.037 | 2 - 11 | NPU |
| flow | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 150.571 | 0 - 5 | NPU |
| flow | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 79.131 | 2 - 11 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 0.256 | 0 - 9 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 0.328 | 1 - 1 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 0.429 | 1 - 1 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 0.305 | 0 - 8 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 0.983 | 0 - 9 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 0.401 | 1 - 2 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 0.667 | 0 - 10 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 0.512 | 1 - 3 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 0.578 | 1 - 10 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 0.983 | 0 - 9 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 0.796 | 0 - 5 | NPU |
| t5_decoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 0.271 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite Gen 5 Mobile | 0.482 | 0 - 10 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® X2 Elite | 0.654 | 0 - 0 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® X Elite | 1.052 | 0 - 0 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Gen 3 Mobile | 0.636 | 0 - 7 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8275 (Proxy) | 2.839 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8550 (Proxy) | 0.877 | 0 - 1 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® SA8775P | 1.275 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS9075 | 1.118 | 0 - 2 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® QCS8450 (Proxy) | 1.358 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® SA7255P | 2.839 | 0 - 9 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Qualcomm® SA8295P | 1.721 | 0 - 5 | NPU |
| t5_encoder | VOICE_AI | mixed_with_float | Snapdragon® 8 Elite For Galaxy Mobile | 0.523 | 0 - 9 | NPU |

## License
* The license for the original implementation of MeloTTS-ES can be found [here](https://github.com/myshell-ai/MeloTTS/blob/main/LICENSE).

## References
* [MeloTTS High-quality Multi-lingual Multi-accent Text-to-Speech](https://github.com/myshell-ai/MeloTTS)
* [Source Model Implementation](https://github.com/myshell-ai/MeloTTS)

## Community
* Join [our AI Hub Slack community](https://aihub.qualcomm.com/community/slack) to collaborate, post questions, and learn more about on-device AI.
* For questions or feedback, please [reach out to us](mailto:ai-hub-support@qti.qualcomm.com).