---
language:
- en
pipeline_tag: text-to-speech
tags:
- tts
- flare
- open
- open-source
- small
- speech
- text-to-speech
- tiny
- cpu
datasets:
- keithito/lj_speech
new_version: LH-Tech-AI/Flare-TTS-v1.5
---
# 🎙️ Flare-TTS 28M
Welcome to Flare-TTS 28M, an open-source text-to-speech model with 28 million parameters trained on LJSpeech.
## Quality and results
The model produces acceptable quality: the output still sounds somewhat robotic, but the speech is clearly intelligible.
Treat this model as a proof of concept or a first beta.
Example:
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/vluuHSnp9Ietk7Uk1-hvG.mpga"></audio>
## Training process
We trained this model for ~300 epochs (~24 hours) on a single A6000 GPU.
The full training code is included in this repo as `start.sh` and `train.py`; run `start.sh` to train the model yourself.
## Architecture
This model was trained with Coqui TTS, using the GlowTTS architecture.
## Training dataset
We trained on the full LJSpeech dataset. Thanks to keithito for this :-)
## How to use
Once you have the model checkpoint (`model.pth`) and `config.json` on your device, you can generate a sample with:
```bash
tts --text "Hello world, this is my first trained TTS model." \
    --model_path model.pth \
    --config_path config.json \
    --out_path output_1.wav
```
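If you want to synthesize many sentences, you can drive the same `tts` CLI from Python. This is a minimal sketch, not part of the official repo: the helper `build_tts_cmd` and all file names are hypothetical, and the actual synthesis call is left commented out because it requires Coqui TTS to be installed plus `model.pth` and `config.json` on disk.

```python
# Sketch: build the same tts CLI invocation from Python, e.g. to batch
# over many sentences. build_tts_cmd and the file names are hypothetical
# examples; uncomment the subprocess call to actually synthesize.
import shlex
# import subprocess

def build_tts_cmd(text, model_path="model.pth",
                  config_path="config.json", out_path="output.wav"):
    """Return the tts CLI command as an argument list."""
    return [
        "tts",
        "--text", text,
        "--model_path", model_path,
        "--config_path", config_path,
        "--out_path", out_path,
    ]

sentences = [
    "Hello world, this is my first trained TTS model.",
    "This is a second test sentence.",
]
for i, sentence in enumerate(sentences, start=1):
    cmd = build_tts_cmd(sentence, out_path=f"output_{i}.wav")
    print(shlex.join(cmd))  # show the command that would run
    # subprocess.run(cmd, check=True)  # run the actual synthesis
```

Passing the command as an argument list (rather than one shell string) avoids quoting problems when a sentence contains apostrophes or other special characters.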
## Final thoughts
We don't think it's perfect; it's more of a proof of concept. Please use this model for experiments rather than production use cases.
We are happy to share more soon. Stay tuned for Flare-TTS v2 :D