Supports voice cloning?

#10

by nijatzeynalov - opened 10 days ago

Discussion

nijatzeynalov

10 days ago

Hi, nice work!

Does this model support voice cloning?

juheon2

Supertone org 10 days ago

Hi, thanks for your interest!

The open-weight Supertonic model does not support voice cloning directly. It includes a fixed set of pre-defined voice styles.

For Supertonic 2, we previously provided a Voice Builder service where users could purchase zero-shot voice cloning embeddings:
https://supertonic.supertone.ai/voice-builder

Supertonic 3 is not supported in Voice Builder yet. If we add Voice Builder or voice cloning support for Supertonic 3 in the future, we’ll share an update.

juheon2 changed discussion status to closed 10 days ago

juheon2

Supertone org 4 days ago

Good news: Voice Builder support for Supertonic 3 is now available.

The open-weight model itself still includes fixed preset voice styles and does not perform voice cloning directly from audio inside the model package. However, you can now use our Voice Builder service to create a custom Supertonic 3 voice style from a reference voice.

Also, if you previously purchased a Voice Builder style for Supertonic 2, we are providing the corresponding Supertonic 3 version free of charge.

You can access Voice Builder here:
https://supertonic.supertone.ai/voice-builder

Thanks again for your interest!

juheon2 changed discussion status to open 4 days ago

blallo27

4 days ago

so you locked the most interesting feature behind a paywall? is there no intention of sharing a local version?

juheon2

Supertone org 4 days ago

edited 4 days ago

Hi @blallo27,

I understand the concern. Local voice creation is a very reasonable thing to ask for, especially since Supertonic itself is designed to run on-device.

For this release, though, the open-weight part of Supertonic 3 is the local inference model with the provided preset voice styles. Custom voice creation is currently handled through Voice Builder as a separate product. The resulting voice style can be used with Supertonic 3, but the extraction pipeline itself is not included in the open-weight release.

So the direct answer is: we do not currently plan to release a local voice-cloning / voice-style extraction pipeline.

I know that may not be the answer everyone wants, but I wanted to be clear about the current scope rather than give a vague maybe. We’ll keep listening to feedback here as we think about future releases.

Thanks for raising it.
