No emotions or instructions on cloned voices?
Is there no support for adding emotions or instructions to cloned voices?
Only pre-trained voices?
How is this not answered yet?
I believe that is correct. We spent many hours experimenting with workarounds, but it appears that only the custom voice models support emotion instructions. The voice cloning (base) model does not.
We plan to support the Qwen3 series of models in our application. We came very close to building a workaround for emotion support on cloned voices, but dropped it because the results were inconsistent and unpredictable: voice quality got worse, and the speaker's voice often changed. We have now fallen back to the Chatterbox models, which are much more predictable and reliable and have broader language support. Chatterbox output quality is fairly decent, and it avoids the unpredictability we saw with the Qwen3 models.