Tip: Improve pronunciation of numbers & abbreviations with a text normalizer
If you've noticed the model struggling with numbers, dates, or abbreviations, this is a
known limitation of every Bulgarian TTS model I'm aware of.
A simple and effective workaround is to pre-process your input text using bg_text_normalizer
before passing it to the model. This library converts Bulgarian numbers, abbreviations, and
other non-standard words into their spoken form, which the model handles much more
reliably.
Example:
from bg_text_normalizer import normalize
text = "Цената е 1500 лв. за м² в кв. Лозенец."
normalized = normalize(text)
# → "Цената е хиляда и петстотин лева за квадратен метър в квартал Лозенец."
# Then pass `normalized` to the TTS model
This small pre-processing step can significantly improve the naturalness and accuracy of the
synthesized speech, especially for texts containing numerals, currency, units of
measurement, or common abbreviations.
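To make sure the step is never skipped, one option is a small wrapper that always normalizes before synthesis. This is only a minimal sketch: `speak` and its parameter names are my own, and the two stand-in functions are hypothetical placeholders so the snippet runs without either library installed. In practice you would pass the `normalize` function imported above and whatever callable your TTS model exposes.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def speak(
    text: str,
    normalize: Callable[[str], str],
    synthesize: Callable[[str], T],
) -> T:
    """Normalize `text`, then hand the spoken-form string to the TTS callable."""
    spoken = normalize(text)
    return synthesize(spoken)


# Hypothetical stand-ins, for illustration only (not the real libraries).
def fake_normalize(text: str) -> str:
    # Expands a single abbreviation; the real normalizer covers far more.
    return text.replace("кв.", "квартал")


def fake_synthesize(text: str) -> str:
    # A real model would return audio; here we just echo what it received.
    return f"<audio for: {text}>"


result = speak("в кв. Лозенец", fake_normalize, fake_synthesize)
print(result)  # → <audio for: в квартал Лозенец>
```

Passing both functions in as arguments keeps the wrapper independent of any particular model, so the same code works if you swap the TTS backend later.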
Hope this helps others get the most out of this model!
Thanks for the suggestion, bg_text_normalizer will definitely improve things!