How to use Supertone/supertonic-3 with Supertonic:
```python
from supertonic import TTS

tts = TTS(auto_download=True)
style = tts.get_voice_style(voice_name="M1")

text = "The train delay was announced at 4:45 PM on Wed, Apr 3, 2024 due to track maintenance."
wav, duration = tts.synthesize(text, voice_style=style)
tts.save_audio(wav, "output.wav")
```
Polish numeral inflection
The TTS model incorrectly reads Polish hour expressions using cardinal numbers instead of ordinal numbers.
For example, the sentence:
“Wybiła godzina 13”
is synthesized as:
“wybiła godzina trzynaście”
instead of the grammatically correct:
“wybiła godzina trzynasta”.
In Polish, hour expressions require an ordinal numeral that agrees with the feminine noun “godzina”, so 13 must be read as “trzynasta” rather than the cardinal “trzynaście”. The model currently fails to apply this morphological transformation.
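Until the model handles this itself, one possible workaround is to spell the hour out in the input text before synthesis. Below is a minimal sketch, assuming only the nominative pattern “godzina N” (N from 0 to 24, no minutes) needs covering. The helper names `hour_ordinal` and `normalize_polish_hours` are hypothetical and not part of Supertonic.

```python
import re

# Feminine nominative ordinals for 0-19, agreeing with the noun "godzina".
_ORDINALS = [
    "zerowa", "pierwsza", "druga", "trzecia", "czwarta",
    "piąta", "szósta", "siódma", "ósma", "dziewiąta",
    "dziesiąta", "jedenasta", "dwunasta", "trzynasta", "czternasta",
    "piętnasta", "szesnasta", "siedemnasta", "osiemnasta", "dziewiętnasta",
]

# Matches "godzina" followed by a 1-2 digit number that is not the start
# of a clock time such as "13:30" or "13.30" (those are left untouched).
_HOUR_PATTERN = re.compile(r"\b(godzina)\s+(\d{1,2})\b(?![:.]\d)", re.IGNORECASE)


def hour_ordinal(hour: int) -> str:
    """Return the feminine nominative ordinal for an hour in 0..24."""
    if 0 <= hour <= 19:
        return _ORDINALS[hour]
    if hour == 20:
        return "dwudziesta"
    if 21 <= hour <= 24:
        return "dwudziesta " + _ORDINALS[hour - 20]
    raise ValueError(f"not a valid hour: {hour}")


def normalize_polish_hours(text: str) -> str:
    """Spell out 'godzina N' with an ordinal; leave everything else as-is."""
    def _replace(match: re.Match) -> str:
        hour = int(match.group(2))
        if hour > 24:
            return match.group(0)  # not an hour, keep the original text
        return f"{match.group(1)} {hour_ordinal(hour)}"

    return _HOUR_PATTERN.sub(_replace, text)


print(normalize_polish_hours("Wybiła godzina 13"))
```

The normalized string can then be passed to the synthesizer in place of the raw text. Oblique cases such as “o godzinie 13”, which needs “trzynastej”, are deliberately left unchanged by this sketch and would need their own declension handling.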
Hi,
Thank you for the clear example and explanation.
For Supertonic 3, we tried to rely less on language-specific text normalization rules and instead let the model learn these reading patterns from diverse training data. However, the amount and coverage of data differ by language, and cases like Polish hour expressions can still fail when the model has not seen enough of the relevant grammatical patterns.
Your example is very helpful:
“Wybiła godzina 13” should be read as “wybiła godzina trzynasta”, not “wybiła godzina trzynaście”.
We’ll document this as a Polish text normalization and morphology issue and use it as a reference case for future updates. In future versions, we’ll try to improve this kind of language-specific reading behavior so that Polish time expressions and similar constructions are handled more naturally.
Thank you again for reporting it.