| --- |
| language: |
| - en |
| pipeline_tag: text-to-speech |
| tags: |
| - tts |
| - flare |
| - open |
| - open-source |
| - small |
| - speech |
| - text-to-speech |
| - tiny |
| - cpu |
| datasets: |
| - keithito/lj_speech |
| --- |
| |
| # 🎙️ Flare-TTS v1.5 28M |
| Welcome to Flare-TTS **v1.5** 28M, an open-source text-to-speech model with 28 million parameters trained on LJSpeech. |
| <br> |
| This is an improved version of Flare-TTS 28M (v1) that now uses a vocoder to remove the robotic sound of the first version! |
|
|
| ## Quality and results |
| The audio quality is much better than in v1: the output no longer sounds robotic, and you can clearly understand what the model says. |
| <br> |
| Example: |
|
|
|
|
| <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/qohVONCxMMh-68Z2U7r46.wav"></audio> |
|
|
| ## Training process |
| We trained the vocoder for 72 epochs on a single A6000 GPU, which took ~10 hours. Note that this model is based on the first version, Flare-TTS 28M, and now adds a vocoder. |
| The full training code for the vocoder can be found in this repo as `prepare.sh` and `train_vocoder.py`. |
| <br> |
| The full pretraining code is here: https://huggingface.co/LH-Tech-AI/Flare-TTS-28M/tree/main |
|
|
| ## Architecture |
| This model was trained using CoquiTTS. For the architecture we chose GlowTTS. |
|
|
| ## Training dataset |
| We trained on the full LJSpeech dataset. Thanks to keithito for this :-) |
|
|
| ## How to use |
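| Once you have the model checkpoint (`model.pth`), `config.json`, the vocoder checkpoint (`vocoder_15000_checkpoint.pth`) and `vocoder_config.json` on your device, you can generate a sample with the Coqui TTS `tts` command: |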
| ```bash |
| tts --text "Hello, world! This is the second version of Flare-TTS - now with a vocoder. The robot sounds are finally gone!" \ |
| --model_path ./model.pth \ |
| --config_path ./config.json \ |
| --vocoder_path ./vocoder_15000_checkpoint.pth \ |
| --vocoder_config_path ./vocoder_config.json \ |
| --out_path output_1.wav |
| ``` |
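
Before running the command above, it can help to confirm that all four files it references are in place. The following is a minimal sketch, assuming the files sit in one directory under the names used above. `check_flare_files` is a hypothetical helper name chosen for illustration; it is not part of Coqui TTS or this repo.

```shell
#!/bin/sh
# Hypothetical helper (not part of Coqui TTS): report every file the
# tts command above expects but that cannot be found.
# Assumption: all four files live in the directory given as $1
# (defaults to the current directory).
check_flare_files() {
    dir="${1:-.}"
    missing=0
    for f in model.pth config.json vocoder_15000_checkpoint.pth vocoder_config.json; do
        if [ ! -f "$dir/$f" ]; then
            # Print the missing path to stderr and remember the failure.
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    # Exit status 0 if everything is present, 1 otherwise.
    return "$missing"
}
```

For example, `check_flare_files . && echo "ready to run tts"` prints the message only when all four files are present in the current directory.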
|
|
| ## Final thoughts |
| This model's audio quality is much better than that of the first version of Flare-TTS 28M. |
| <br> |
| But stay tuned for a third version with more features! :D |