Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models Paper • 2603.25750 • Published 28 days ago • 36
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published Feb 13 • 58
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published Feb 15 • 53
nvidia/diar_sortformer_4spk-v1 Automatic Speech Recognition • 0.1B • Updated Dec 15, 2025 • 5.45k • 137
Why LLMs Aren't Scientists Yet: Lessons from Four Autonomous Research Attempts Paper • 2601.03315 • Published Jan 6 • 6
OmniPSD: Layered PSD Generation with Diffusion Transformer Paper • 2512.09247 • Published Dec 10, 2025 • 51