QORA-TTS 1.7B - Pure Rust Text-to-Speech with Voice Cloning

Pure Rust TTS engine with voice cloning and 25 high-quality bundled voices (add unlimited voices of your own). No Python, no CUDA, no safetensors needed. A single executable plus a Q4 binary gives you a portable TTS.

Based on Qwen3-TTS-12Hz-1.7B-Base (Apache 2.0).
Try: https://huggingface.co/qoranet/QORA-TTS
```shell
# Voice cloning with an included voice
qora-tts.exe --model-path . --ref-audio voices/luna.wav --text "Hello, how are you?" --language english

# A different voice
qora-tts.exe --model-path . --ref-audio voices/adam.wav --text "Good morning!" --language english

# Clone your own voice (any 24kHz WAV)
qora-tts.exe --model-path . --ref-audio my_recording.wav --text "Custom voice" --language english

# Control length (codes = seconds x 12.5)
qora-tts.exe --model-path . --ref-audio voices/luna.wav --text "Short" --max-codes 100

# Custom output path
qora-tts.exe --model-path . --ref-audio voices/sagar.wav --text "Hi there" --language english --output greeting.wav
```
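The `--max-codes` relationship above (codes = seconds × 12.5, i.e. a 12.5 codes-per-second rate) can be sketched as a small helper. The function names here are illustrative, not part of the CLI:

```rust
/// Convert a target duration in seconds into a `--max-codes` value,
/// assuming the model's 12.5 codes-per-second rate.
fn seconds_to_codes(seconds: f64) -> u32 {
    (seconds * 12.5).round() as u32
}

/// Inverse: the maximum audio length (in seconds) a code budget allows.
fn codes_to_seconds(codes: u32) -> f64 {
    codes as f64 / 12.5
}

fn main() {
    // --max-codes 100 caps the output at 8 seconds of audio.
    println!("100 codes = {} s", codes_to_seconds(100));
    // A 10-second clip needs a budget of 125 codes.
    println!("10 s = {} codes", seconds_to_codes(10.0));
}
```

So for the `--max-codes 100` example above, the generated clip is capped at roughly 8 seconds.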
This is very interesting. When you say "control length", do you mean literally controlling the output length (as in allowing the voice line to be really slow or really fast), or is it more to do with accuracy?
Yes, it's controlled automatically, so you don't need to worry about it. If you feed in 2,000 tokens and your system is small or slow, the engine will cut the budget down to something like 200 on its own.
I see - so similar to batch control in CUDA, but aimed at CPUs in particular. Does this model also run on a GPU?
Also, if I have a powerful system, does using this model provide any advantages?
CPU only for now, but I'm making these models smarter, like I did with the LLMs, which can detect a GPU if one is available and fall back to CPU otherwise. You can test that feature in the new LLMs: https://huggingface.co/qoranet/QORA-4B