Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 21 days ago • 123
mistralai/Voxtral-Mini-4B-Realtime-2602 Automatic Speech Recognition • 4B • Updated Mar 11 • 878k • 812