MARS: Enabling Autoregressive Models Multi-Token Generation • Paper • 2604.07023 • Published 8 days ago • 38
FastVLM • Collection • Efficient Vision Encoding for Vision Language Models • 8 items • Updated Mar 2 • 111
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models • Paper • 2604.00688 • Published 15 days ago • 9