---
language:
- en
pipeline_tag: text-to-speech
tags:
- tts
- flare
- open
- open-source
- small
- speech
- text-to-speech
- tiny
- cpu
datasets:
- keithito/lj_speech
new_version: LH-Tech-AI/Flare-TTS-v1.5
---

# 🎙️ Flare-TTS 28M
Welcome to Flare-TTS 28M, an open-source text-to-speech model with 28 million parameters, trained on LJSpeech.


## Quality and results
The quality is okay: the output still sounds a bit robotic, but you can clearly understand what the model is saying.
Think of this model as a proof of concept or a first beta.
Example:
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/vluuHSnp9Ietk7Uk1-hvG.mpga"></audio>


## Training process
We trained this model for ~300 epochs on a single A6000 GPU for ~24 hours.
The full training code can be found in this repo as `start.sh` and `train.py`. Just run `start.sh` to train this model yourself.
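The repo's `train.py` is the source of truth for the exact settings; as a rough sketch, a GlowTTS training config in Coqui TTS typically looks like the following. The hyperparameter values and paths here are illustrative assumptions, not the values we actually used — check `config.json` in this repo for those.

```python
# Illustrative sketch of a Coqui TTS GlowTTS training config.
# Values are assumptions; see train.py / config.json in this repo for the real ones.
from TTS.tts.configs.glow_tts_config import GlowTTSConfig
from TTS.tts.configs.shared_configs import BaseDatasetConfig

dataset_config = BaseDatasetConfig(
    formatter="ljspeech",             # built-in LJSpeech metadata formatter
    meta_file_train="metadata.csv",
    path="LJSpeech-1.1/",             # assumed local path to the dataset
)

config = GlowTTSConfig(
    batch_size=32,                    # assumption
    eval_batch_size=16,               # assumption
    epochs=300,                       # matches the ~300 epochs reported above
    text_cleaner="phoneme_cleaners",
    use_phonemes=True,
    phoneme_language="en-us",
    output_path="output/",
    datasets=[dataset_config],
)
```

This config is then handed to Coqui's `Trainer` together with the `GlowTTS` model, which is what `start.sh` kicks off.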


## Architecture
This model was trained with Coqui TTS. For the architecture, we chose GlowTTS.


## Training dataset
We trained on the full LJSpeech dataset. Thanks to keithito for this :-)


## How to use
As soon as you have the model checkpoint (`model.pth`) and `config.json` on your device, you can generate a sample using:
```bash
tts --text "Hello world, this is my first trained TTS model." \
    --model_path model.pth \
    --config_path config.json \
    --out_path output_1.wav
```


## Final thoughts
This model is not perfect - it's more of a proof of concept. Please don't use it for production use cases; treat it as something to experiment with.
We are happy to share more soon - stay tuned for Flare-TTS v2 :D