LH-Tech-AI/Flare-TTS-v1.5


LH-Tech-AI committed 2 days ago

Commit d3a78a3 · verified · 1 parent: 4d2d805

Update README.md

Files changed (1)

README.md +57 -1

README.md

	@@ -1 +1,57 @@
- ~~NOW WITH VOCODER!~~

+---
+language:
+- en
+pipeline_tag: text-to-speech
+tags:
+- tts
+- flare
+- open
+- open-source
+- small
+- speech
+- text-to-speech
+- tiny
+- cpu
+datasets:
+- keithito/lj_speech
+---
+# 🎙️ Flare-TTS v1.5 28M
+Welcome to Flare-TTS **v1.5** 28M, an open-source text-to-speech model with 28 million parameters trained on LJSpeech.
+<br>
+This is an improved version of Flare-TTS 28M (v1) that now uses a vocoder to remove the robotic sound of the first version!
+## Quality and results
+The audio quality is much better now: the model no longer sounds robotic, and you can clearly understand what it says.
+<br>
+Example:
+<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/vluuHSnp9Ietk7Uk1-hvG.mpga"></audio>
+## Training process
+We trained this model for ~300 epochs (~24 hours) on a single A6000 GPU. Note that this model is based on the first version, Flare-TTS 28M.
+This model now also uses a vocoder. The full training code for the vocoder can be found in this repo as `prepare.sh` and `train_vocoder.py`.
+<br>
+The full pretraining code is here: https://huggingface.co/LH-Tech-AI/Flare-TTS-28M/tree/main
+## Architecture
+This model was trained using Coqui TTS. For the architecture, we chose Glow-TTS.
+## Training dataset
+We trained on the full LJSpeech dataset. Thanks to keithito for this :-)
+## How to use
+As soon as you have the model checkpoint (`model.pth`), its `config.json`, the vocoder checkpoint (`vocoder.pth`), and `vocoder_config.json` on your device, you can generate a sample using:
+```bash
+tts --text "Hello, world! This is the second version of Flare-TTS - now with a vocoder. The robot sounds are finally gone!" \
+    --model_path ./model.pth \
+    --config_path ./config.json \
+    --vocoder_path ./vocoder.pth \
+    --vocoder_config_path ./vocoder_config.json \
+    --out_path output_1.wav
+```
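The command above needs four files to be present. As a minimal sketch (not part of the Flare-TTS repo), the hypothetical helpers `missing_files` and `synthesize` below check for those files first and only then call `tts` with the same flags as above. It assumes all four files sit in one directory under the names used in the command.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of the Flare-TTS repo: verify that every file
# the `tts` command expects is present before starting synthesis.

# Print each missing path on its own line; return non-zero if any is missing.
missing_files() {
    local dir="$1" status=0 name
    for name in model.pth config.json vocoder.pth vocoder_config.json; do
        if [ ! -f "$dir/$name" ]; then
            echo "$dir/$name"
            status=1
        fi
    done
    return "$status"
}

# Run synthesis only if nothing is missing (same flags as the command above).
synthesize() {
    local dir="$1" text="$2" out="$3"
    if ! missing_files "$dir" >&2; then
        echo "error: the files listed above are missing" >&2
        return 1
    fi
    tts --text "$text" \
        --model_path "$dir/model.pth" \
        --config_path "$dir/config.json" \
        --vocoder_path "$dir/vocoder.pth" \
        --vocoder_config_path "$dir/vocoder_config.json" \
        --out_path "$out"
}
```

After sourcing the script, a call such as `synthesize . "Hello, world!" output_1.wav` would run the same synthesis as the command above, provided the `tts` CLI from Coqui TTS is installed.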
+## Final thoughts
+This model has much better audio quality than the first version of Flare-TTS 28M.
+<br>
+But stay tuned for a third version with more features! :D