QORA-TTS 1.7B - Pure Rust Text-to-Speech with Voice Cloning Based on Qwen3-TTS-12Hz-1.7B-Base (Apache 2.0).

#41
by drdraq - opened

Pure Rust TTS engine with voice cloning and 25 high-quality included voices (you can add as many of your own as you like). No Python, no CUDA, no safetensors needed. A single executable plus a Q4 binary gives you a portable TTS.

Based on Qwen3-TTS-12Hz-1.7B-Base (Apache 2.0).

Try: https://huggingface.co/qoranet/QORA-TTS

Voice cloning with included voice

qora-tts.exe --model-path . --ref-audio voices/luna.wav --text "Hello, how are you?" --language english

Different voice

qora-tts.exe --model-path . --ref-audio voices/adam.wav --text "Good morning!" --language english

Clone your own voice (any 24kHz WAV)

qora-tts.exe --model-path . --ref-audio my_recording.wav --text "Custom voice" --language english

Control length (codes = seconds x 12.5)

qora-tts.exe --model-path . --ref-audio voices/luna.wav --text "Short" --max-codes 100
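
To make the "codes = seconds x 12.5" rule concrete, here is a minimal Rust sketch of the arithmetic. The helper names are hypothetical and are not part of qora-tts. It also assumes that --max-codes acts as an upper bound on the generated audio, which the post implies but does not spell out.

```rust
/// Codec frame rate taken from the post: codes = seconds x 12.5.
const CODES_PER_SECOND: f64 = 12.5;

/// Hypothetical helper: convert a target duration in seconds into a
/// value for --max-codes, rounding up so the audio is not cut short.
fn max_codes_for_seconds(seconds: f64) -> u32 {
    (seconds * CODES_PER_SECOND).ceil() as u32
}

/// Hypothetical helper: approximate audio length in seconds for a
/// given code budget (the inverse of the function above).
fn seconds_for_codes(codes: u32) -> f64 {
    f64::from(codes) / CODES_PER_SECOND
}
```

Under that rule, the --max-codes 100 used in the command above corresponds to roughly 8 seconds of audio, and a 1-second clip would need a budget of 13 codes after rounding up.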

Custom output path

qora-tts.exe --model-path . --ref-audio voices/sagar.wav --text "Hi there" --language english --output greeting.wav

This is very interesting. When you say control length, do you mean literally controlling the output length (as in allowing the voice line to be really slow or really fast), or is it more to do with accuracy?

Yes, it is controlled automatically, so you don't need to worry about it. If you feed it 2000 tokens and your system is small or slow, it will cut that down to 200.

I see - so it is similar to batch control in CUDA, but specifically for CPUs. Does the above model also run on a GPU?
Also, if I have a powerful system, does using this model provide any advantages?

Only CPU for now, but I am making it smarter, like I did with the LLM, which can detect a GPU if one is available and otherwise runs fine on the CPU. You can test that feature in the new LLMs: https://huggingface.co/qoranet/QORA-4B
