---
language:
- en
pipeline_tag: text-to-speech
tags:
- tts
- flare
- open
- open-source
- small
- speech
- text-to-speech
- tiny
- cpu
datasets:
- keithito/lj_speech
new_version: LH-Tech-AI/Flare-TTS-v1.5
---

# 🎙️ Flare-TTS 28M
Welcome to Flare-TTS 28M, an open-source text-to-speech model with 28 million parameters trained on LJSpeech.

## Quality and results
The output quality is acceptable: the voice still sounds somewhat robotic, but the speech is clearly intelligible.
Treat this model as a proof of concept or a first beta.
Example:
<audio controls src="https://cdn-uploads.huggingface.co/production/uploads/697f2832c2c5e4daa93cece7/vluuHSnp9Ietk7Uk1-hvG.mpga"></audio>

## Training process
We trained this model for ~300 epochs, which took ~24 hours on a single A6000 GPU.
The full training code can be found in this repo as `start.sh` and `train.py`. Just run `start.sh` to train this model yourself.

## Architecture
This model was trained with Coqui TTS, using the Glow-TTS architecture.

## Training dataset
We trained on the full LJSpeech dataset. Thanks to keithito for this :-)

## How to use
As soon as you have the model checkpoint (`model.pth`) and `config.json` on your device, you can generate a sample using:
```bash
tts --text "Hello world, this is my first trained TTS model." \
    --model_path model.pth \
    --config_path config.json \
    --out_path output_1.wav
```
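
To synthesize several sentences in one go, the same command can be wrapped in a small loop. The following is only a sketch, not part of this repo: it assumes the `tts` command from Coqui TTS is installed and on your `PATH`, and that `model.pth` and `config.json` are in the current directory. The names `synth_batch`, `TTS_CMD` and `sentences.txt` are made up for this example.

```bash
#!/usr/bin/env bash
# Sketch: write one WAV file per non-empty line of a text file.
# Assumption: `tts` is the Coqui TTS CLI; override TTS_CMD to use another binary.
# Assumption: model.pth and config.json are in the current directory.
TTS_CMD="${TTS_CMD:-tts}"

synth_batch() {
    local input_file="$1"      # text file, one sentence per line
    local out_dir="${2:-.}"    # output directory (defaults to the current one)
    local i=0
    local line
    mkdir -p "$out_dir"
    # `|| [ -n "$line" ]` also handles a last line without a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue   # skip empty lines
        i=$((i + 1))
        # stdin is redirected so the command cannot consume the input file.
        "$TTS_CMD" --text "$line" \
            --model_path model.pth \
            --config_path config.json \
            --out_path "$out_dir/output_${i}.wav" < /dev/null
    done < "$input_file"
}
```

After sourcing the script, `synth_batch sentences.txt samples` would write `samples/output_1.wav`, `samples/output_2.wav`, and so on, one file per sentence.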

## Final thoughts
We don't think this model is perfect; it is a proof of concept. Please use it for experiments rather than for production use cases.
We are happy to share more of this soon - stay tuned for Flare-TTS v2 :D