	---
	language:
	- en
	pipeline_tag: text-to-speech
	tags:
	- tts
	- flare
	- open
	- open-source
	- small
	- speech
	- text-to-speech
	- tiny
	- cpu
	datasets:
	- keithito/lj_speech
	---

	# 🎙️ Flare-TTS v1.5 28M
	Welcome to Flare-TTS v1.5 28M, an open-source text-to-speech model with 28 million parameters trained on LJSpeech.
	<br>
	This is an improved version of Flare-TTS 28M (v1) that now uses a vocoder to remove the robotic sound of the first version!

	## Quality and results
	The audio quality is much better than in v1: the output no longer sounds robotic, and you can clearly understand what the model says.
	<br>
	Example:


	<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/qohVONCxMMh-68Z2U7r46.wav"></audio>

	## Training process
	We trained the vocoder for 72 epochs on a single A6000 GPU, which took about 10 hours. This model is based on the first version, Flare-TTS 28M, and adds a vocoder on top of it.
	The full training code for the vocoder is in this repo as `prepare.sh` and `train_vocoder.py`.
	<br>
	The full pretraining code is here: https://huggingface.co/LH-Tech-AI/Flare-TTS-28M/tree/main

	## Architecture
	This model was trained with Coqui TTS, using the GlowTTS architecture.

	## Training dataset
	We trained on the full LJSpeech dataset. Thanks to keithito for this :-)

	## How to use
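	Once you have the model checkpoint (`model.pth`), its `config.json`, and the vocoder files (`vocoder_15000_checkpoint.pth` and `vocoder_config.json`) on your device, you can generate a sample with the Coqui TTS `tts` command: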
	```bash
	tts --text "Hello, world! This is the second version of Flare-TTS - now with a vocoder. The robot sounds are finally gone!" \
	--model_path ./model.pth \
	--config_path ./config.json \
	--vocoder_path ./vocoder_15000_checkpoint.pth \
	--vocoder_config_path ./vocoder_config.json \
	--out_path output_1.wav
	```
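
If you run the command often, it may help to wrap it in a small shell function that first checks that all four files are present. The following is only a minimal sketch: `flare_missing_files` and `flare_say` are hypothetical helper names (they are not part of Coqui TTS or of this repo), and the sketch assumes all four files sit in one directory under the names used in the command above.

```bash
# Hypothetical helpers, not part of Coqui TTS or the Flare-TTS repo.
# Assumption: all four files live in one directory, named as in the command above.

# Print every required file that is missing from directory $1 (default: current dir).
# Returns 0 if all files are present, 1 otherwise.
flare_missing_files() {
  local dir="${1:-.}"
  local missing=0
  local f
  for f in model.pth config.json vocoder_15000_checkpoint.pth vocoder_config.json; do
    if [ ! -f "$dir/$f" ]; then
      echo "$f"
      missing=1
    fi
  done
  return "$missing"
}

# Synthesize text $1 into WAV file $2 using the model files in directory $3.
flare_say() {
  local text="$1"
  local out="$2"
  local dir="${3:-.}"
  # Missing file names go to stderr; stop before calling tts if any are absent.
  if ! flare_missing_files "$dir" >&2; then
    echo "error: the files listed above are missing from $dir" >&2
    return 1
  fi
  tts --text "$text" \
    --model_path "$dir/model.pth" \
    --config_path "$dir/config.json" \
    --vocoder_path "$dir/vocoder_15000_checkpoint.pth" \
    --vocoder_config_path "$dir/vocoder_config.json" \
    --out_path "$out"
}
```

After sourcing these functions, a call such as `flare_say "Hello from Flare-TTS." hello.wav .` runs the same `tts` command as above, and stops early with the list of missing files if the checkpoint or vocoder is not in place.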

	## Final thoughts
	The audio quality of this model is much better than that of the first version, Flare-TTS 28M.
	<br>
	But stay tuned for a third version with more features! :D