I created an API server version of s2.cpp

#4
by mach9243 - opened

This is a fork of https://github.com/rodrigomatta/s2.cpp. It isn't command-line compatible with the original; instead it runs as a server compatible with the Fish Audio API for TTS generation. It also takes advantage of Rodrigo's quantized versions. Generation speed should exceed real time on RTX 40xx cards.
https://github.com/mach92432/s2.cpp
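For illustration, here is a minimal client sketch for talking to a Fish-Audio-style TTS server. The endpoint path (`/v1/tts`), host/port, and payload field names are assumptions modeled on the Fish Audio API convention; check the fork's README for the actual interface.

```python
import json
import urllib.request

def build_tts_request(text, server="http://127.0.0.1:8080"):
    """Build a POST request for a Fish-Audio-style TTS endpoint.

    The /v1/tts path and the payload fields are assumptions; adjust to
    whatever the server actually exposes.
    """
    payload = {"text": text, "format": "wav"}
    return urllib.request.Request(
        f"{server}/v1/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Usage (against a running server):
#   with urllib.request.urlopen(build_tts_request("Hello!")) as resp:
#       open("out.wav", "wb").write(resp.read())
```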

Sounds nice. Does it support streaming (e.g. generating ~2 seconds of audio at a time and starting an HTTP stream for almost immediate playback)?

I was experimenting with the full model, and with torch.compile my old 3090 Ti FE seems to keep up at slightly faster than real-time inference...

This was nice on Kokoro: live streaming with just a second of latency before the first chunk gets going...
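The latency win comes from pipelining: playback of chunk *i* overlaps with generation of chunk *i+1*, so sound starts after one chunk of work instead of after the whole utterance. A toy sketch of that interleaving (the `synthesize_chunks` stub stands in for a real TTS step; s2.cpp exposes no such hook today):

```python
def synthesize_chunks(text, words_per_chunk=2):
    """Yield text in small pieces, the way a streaming TTS engine would
    yield ~2-second audio chunks (stub for illustration)."""
    words = text.split()
    for i in range(0, len(words), words_per_chunk):
        yield " ".join(words[i:i + words_per_chunk])

def stream(text):
    """Interleave generation and 'playback'; return the event order."""
    events = []
    for i, _chunk in enumerate(synthesize_chunks(text)):
        events.append(("gen", i))
        events.append(("play", i))  # chunk i plays before i+1 exists
    return events
```

Because the generator is lazy, each `play` event happens before the next `gen`, which is exactly the property that makes first-chunk latency independent of utterance length.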

EDIT

No streaming output: the WAV is returned to the client only after full generation completes.

haha i read the docs ;p


I managed to vibe-code a streaming version of the original repo, but there are glitchy artifacts at the seams when the chunks are combined...
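Those seam glitches usually come from butt-joining chunks, which leaves amplitude/phase discontinuities at the boundaries. A common mitigation is a short crossfade at each join. A minimal sketch, assuming chunks arrive as NumPy float arrays at a known sample rate (a real fix should also synthesize chunks with a little overlap so the fade doesn't eat content):

```python
import numpy as np

def crossfade_concat(chunks, sr=44100, fade_ms=20):
    """Concatenate audio chunks with a short linear crossfade at each
    boundary to hide the discontinuities that cause audible clicks when
    chunks are simply butt-joined."""
    fade = int(sr * fade_ms / 1000)
    out = chunks[0].astype(np.float32)
    for nxt in chunks[1:]:
        nxt = nxt.astype(np.float32)
        n = min(fade, len(out), len(nxt))
        if n > 0:
            # Ramp the tail of `out` down while ramping the head of
            # `nxt` up, then append the rest of `nxt`.
            ramp = np.linspace(0.0, 1.0, n, dtype=np.float32)
            out[-n:] = out[-n:] * (1.0 - ramp) + nxt[:n] * ramp
            out = np.concatenate([out, nxt[n:]])
        else:
            out = np.concatenate([out, nxt])
    return out
```

A linear ramp is the simplest choice; an equal-power (sine/cosine) fade keeps perceived loudness more constant across the join if the linear version still sounds dippy.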

uuh, does that work on the quantized models?
