LH-Tech-AI committed on
Commit
d3a78a3
·
verified ·
1 Parent(s): 4d2d805

Update README.md

Files changed (1)
  1. README.md +57 -1
README.md CHANGED
@@ -1 +1,57 @@
- NOW WITH VOCODER!
+ ---
+ language:
+ - en
+ pipeline_tag: text-to-speech
+ tags:
+ - tts
+ - flare
+ - open
+ - open-source
+ - small
+ - speech
+ - text-to-speech
+ - tiny
+ - cpu
+ datasets:
+ - keithito/lj_speech
+ ---
+
+ # 🎙️ Flare-TTS v1.5 28M
+ Welcome to Flare-TTS **v1.5** 28M, an open-source text-to-speech model with 28 million parameters, trained on LJSpeech.
+ <br>
+ This is an improved version of Flare-TTS 28M (v1) that now uses a vocoder to remove the robotic sound of the first version!
+
+ ## Quality and results
+ The audio quality is much better now: the output no longer sounds robotic, and you can clearly understand what the model says.
+ <br>
+ Example:
+ <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/vluuHSnp9Ietk7Uk1-hvG.mpga"></audio>
+
+ ## Training process
+ We trained this model for ~300 epochs (~24 hours) on a single A6000 GPU. Note that this model is based on the first version, Flare-TTS 28M.
+ Furthermore, this model now uses a vocoder; see `train_vocoder.py` for more information.
+ The full training code for the vocoder can be found in this repo as `prepare.sh` and `train_vocoder.py`.
+ <br>
+ The full pretraining code is here: https://huggingface.co/LH-Tech-AI/Flare-TTS-28M/tree/main
+
+ ## Architecture
+ This model was trained with Coqui TTS. For the architecture, we chose Glow-TTS.
+
+ ## Training dataset
+ We trained on the full LJSpeech dataset. Thanks to keithito for this :-)
+
+ ## How to use
+ Once you have the model checkpoint (`model.pth`), its `config.json`, and the vocoder files (`vocoder.pth` and `vocoder_config.json`) on your device, you can generate a sample using:
+ ```bash
+ tts --text "Hello, world! This is the second version of Flare-TTS - now with a vocoder. The robot sounds are finally gone!" \
+   --model_path ./model.pth \
+   --config_path ./config.json \
+   --vocoder_path ./vocoder.pth \
+   --vocoder_config_path ./vocoder_config.json \
+   --out_path output_1.wav
+ ```
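
The `tts` call above expects four files in the working directory, and it is easy to forget the two vocoder files. As a rough sketch (not part of the Flare-TTS repo), a small pre-flight check can report anything missing before synthesis starts. The helper names `check_flare_files` and `flare_say` are hypothetical; the file names and `tts` flags are the ones used in the command above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: check_flare_files and flare_say are hypothetical
# helpers, not shipped with Flare-TTS. File names match the command above.

# Return 0 if all four required files exist in the given directory,
# otherwise list the missing ones on stderr and return 1.
check_flare_files() {
  local dir="${1:-.}"
  local missing=0
  local f
  for f in model.pth config.json vocoder.pth vocoder_config.json; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Synthesize text to a WAV file, but only after the check has passed.
# Usage: flare_say "some text" [output.wav] [model_dir]
flare_say() {
  local text="$1"
  local out="${2:-output_1.wav}"
  local dir="${3:-.}"
  check_flare_files "$dir" || return 1
  tts --text "$text" \
    --model_path "$dir/model.pth" \
    --config_path "$dir/config.json" \
    --vocoder_path "$dir/vocoder.pth" \
    --vocoder_config_path "$dir/vocoder_config.json" \
    --out_path "$out"
}
```

After sourcing the script, `flare_say "Hello, world!" hello.wav` would run the same command as above, assuming the `tts` command from Coqui TTS is installed and on your `PATH`.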
+
+ ## Final thoughts
+ This model's audio quality is much better than that of the first version of Flare-TTS 28M.
+ <br>
+ But stay tuned for a third version with more features! :D