I created an API server version of s2.cpp

#4
by mach9243 - opened

This is a fork of https://github.com/rodrigomatta/s2.cpp. It isn't command-line compatible with the original; instead it runs as a server compatible with the Fish Audio API for TTS generation. It also takes advantage of Rodrigo's quantized versions. Generation speed should exceed real time on RTX 40xx cards.
https://github.com/mach92432/s2.cpp
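For illustration, here is a minimal client sketch for talking to a Fish-Audio-style TTS server. The endpoint path (`/v1/tts`), host/port, and payload field names are assumptions modeled on the Fish Audio API convention; check the fork's README for the actual interface.

```python
import json
import urllib.request

def build_tts_request(text, server="http://127.0.0.1:8080"):
    """Build a POST request for a Fish-Audio-style TTS endpoint.

    The /v1/tts path and the payload fields are assumptions; adjust to
    whatever the server actually exposes.
    """
    payload = {"text": text, "format": "wav"}
    return urllib.request.Request(
        f"{server}/v1/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Usage (against a running server):
#   with urllib.request.urlopen(build_tts_request("Hello!")) as resp:
#       open("out.wav", "wb").write(resp.read())
```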

Sounds nice. Does it support streaming (e.g. generating ~2 seconds of audio at a time and starting an HTTP stream for almost immediate playback)?

I was experimenting with the full model, and with torch.compile my old 3090 Ti FE seems to keep up at slightly faster than real-time inference...

This was nice on Kokoro: live streaming with just a second of latency before the first chunk gets going...
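The latency win comes from pipelining: playback of chunk *i* overlaps with generation of chunk *i+1*, so sound starts after one chunk of work instead of after the whole utterance. A toy sketch of that interleaving (the `synthesize_chunks` stub stands in for a real TTS step; s2.cpp exposes no such hook today):

```python
def synthesize_chunks(text, words_per_chunk=2):
    """Yield text in small pieces, the way a streaming TTS engine would
    yield ~2-second audio chunks (stub for illustration)."""
    words = text.split()
    for i in range(0, len(words), words_per_chunk):
        yield " ".join(words[i:i + words_per_chunk])

def stream(text):
    """Interleave generation and 'playback'; return the event order."""
    events = []
    for i, _chunk in enumerate(synthesize_chunks(text)):
        events.append(("gen", i))
        events.append(("play", i))  # chunk i plays before i+1 exists
    return events
```

Because the generator is lazy, each `play` event happens before the next `gen`, which is exactly the property that makes first-chunk latency independent of utterance length.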

EDIT

No streaming output: the WAV is returned to the client only after full generation completes.

haha i read the docs ;p


I managed to vibe-code a streaming version of the original repo, but there are glitchy artifacts at the seams when the chunks are combined...
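Those seam glitches usually come from butt-joining chunks, which leaves amplitude/phase discontinuities at the boundaries. A common mitigation is a short crossfade at each join. A minimal sketch, assuming chunks arrive as NumPy float arrays at a known sample rate (a real fix should also synthesize chunks with a little overlap so the fade doesn't eat content):

```python
import numpy as np

def crossfade_concat(chunks, sr=44100, fade_ms=20):
    """Concatenate audio chunks with a short linear crossfade at each
    boundary to hide the discontinuities that cause audible clicks when
    chunks are simply butt-joined."""
    fade = int(sr * fade_ms / 1000)
    out = chunks[0].astype(np.float32)
    for nxt in chunks[1:]:
        nxt = nxt.astype(np.float32)
        n = min(fade, len(out), len(nxt))
        if n > 0:
            # Ramp the tail of `out` down while ramping the head of
            # `nxt` up, then append the rest of `nxt`.
            ramp = np.linspace(0.0, 1.0, n, dtype=np.float32)
            out[-n:] = out[-n:] * (1.0 - ramp) + nxt[:n] * ramp
            out = np.concatenate([out, nxt[n:]])
        else:
            out = np.concatenate([out, nxt])
    return out
```

A linear ramp is the simplest choice; an equal-power (sine/cosine) fade keeps perceived loudness more constant across the join if the linear version still sounds dippy.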

uuh, does that work on the quantized models?
