Inference speed
#10
by banank1989 - opened
Can it convert a 10-15 word sentence within 1 or 2 seconds? I tried, but it seemed too slow (10-15 seconds).
@banank1989 did you find any solution for speeding up the TTS? I am using a G5 instance and it takes around 5-6 seconds for 10 tokens. Please suggest anything you found for real-time TTS.
You can run the model with flash attention in streaming mode.
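A minimal sketch of that suggestion, assuming the `parler_tts` package, the public `parler-tts/parler-tts-mini-v1` checkpoint, and a GPU with `flash-attn` installed; `chunk_text` is a hypothetical helper for producing audio in short pieces so playback can start early:

```python
def chunk_text(text, max_words=15):
    # Hypothetical helper: split long input into short chunks so each
    # generate() call is fast and audio can be played incrementally,
    # instead of waiting for one long generation to finish.
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def load_model(device="cuda"):
    # Imports kept local so chunk_text works without torch installed.
    import torch
    from parler_tts import ParlerTTSForConditionalGeneration

    # fp16 plus flash attention 2 is the main latency lever suggested in
    # the Parler-TTS inference guide; requires an Ampere-or-newer GPU.
    return ParlerTTSForConditionalGeneration.from_pretrained(
        "parler-tts/parler-tts-mini-v1",
        torch_dtype=torch.float16,
        attn_implementation="flash_attention_2",
    ).to(device)
```

You would then call `model.generate(...)` once per chunk and play each waveform as it arrives; the library also ships a streamer class for token-level streaming, documented in its INFERENCE.md.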
How do I do that? I have tried all the optimization methods they suggested. Is there any other way?
I tried applying https://github.com/huggingface/parler-tts/blob/main/INFERENCE.md to this model and it is not working at all.
Could you please point out where we are going wrong?