Expert-Choice Routing Enables Adaptive Computation in Diffusion Language Models Paper • 2604.01622 • Published 16 days ago • 7
Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States Paper • 2603.19987 • Published 28 days ago • 9
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset Paper • 2412.02595 • Published Dec 3, 2024 • 8