Text-to-Audio
Large-Scale Automatic Audiobook Creation
Paper
• 2309.03926
• Published • 55
FoleyGen: Visually-Guided Audio Generation
Paper
• 2309.10537
• Published • 8
MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models
Paper
• 2310.11954
• Published • 24
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
Paper
• 2310.00704
• Published • 20
E3 TTS: Easy End-to-End Diffusion-based Text to Speech
Paper
• 2311.00945
• Published • 16
In-Context Prompt Editing For Conditional Audio Generation
Paper
• 2311.00895
• Published • 10
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
Paper
• 2312.03491
• Published • 34
PicoAudio: Enabling Precise Timestamp and Frequency Controllability of Audio Events in Text-to-audio Generation
Paper
• 2407.02869
• Published • 21
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Paper
• 2407.04051
• Published • 40
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation
Paper
• 2407.15060
• Published • 9
Improving Text-To-Audio Models with Synthetic Captions
Paper
• 2406.15487
• Published