Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation
Abstract
The submission details an end-to-end Irish-to-English speech translation system using Whisper, incorporating data augmentation techniques like speech back-translation and noise augmentation, and evaluates the use of synthetic audio data.
This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2024) for Irish-to-English speech translation. We built end-to-end systems based on Whisper, and employed a number of data augmentation techniques, such as speech back-translation and noise augmentation. We investigate the effect of using synthetic audio data and discuss several methods for enriching signal diversity.
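To make the idea of noise augmentation concrete, the following is a minimal sketch of one common variant: mixing white Gaussian noise into a waveform at a chosen signal-to-noise ratio (SNR). The abstract does not specify the noise type, SNR range, or tooling, so everything here is an illustrative assumption rather than the submission's actual pipeline; the function names `add_noise` and `measured_snr_db` are hypothetical.

```python
import math
import random


def add_noise(signal, snr_db, rng=None):
    """Return a copy of `signal` with white Gaussian noise mixed in at `snr_db` dB.

    Assumption: `signal` is a non-empty list of float samples. This is a generic
    illustration, not the augmentation recipe used in the paper.
    """
    rng = rng or random.Random(0)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    # Choose the noise power so that signal_power / noise_power == 10^(snr_db / 10).
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """Compute the SNR in dB of `noisy` relative to `clean` (must differ from it)."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((y - x) ** 2 for x, y in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Usage: a one-second 440 Hz tone at 16 kHz, corrupted at 10 dB SNR.
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy_tone = add_noise(tone, snr_db=10.0)
```

In practice the SNR would typically be sampled from a range per utterance so that the model sees varied signal conditions, and recorded background noise could replace the Gaussian noise used here.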