Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 27 days ago • 123
Article Beyond Semantic Similarity: Introducing NVIDIA NeMo Retriever’s Generalizable Agentic Retrieval Pipeline Mar 13 • 40
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 4 days ago • 139
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published Mar 6 • 119
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation Paper • 2602.24286 • Published Feb 27 • 98
Article Community Evals: Because we're done trusting black-box leaderboards over the community Feb 4 • 89
Zooming without Zooming: Region-to-Image Distillation for Fine-Grained Multimodal Perception Paper • 2602.11858 • Published Feb 12 • 62
WorldCompass: Reinforcement Learning for Long-Horizon World Models Paper • 2602.09022 • Published Feb 9 • 21
What Drives Success in Physical Planning with Joint-Embedding Predictive World Models? Paper • 2512.24497 • Published Dec 30, 2025 • 7
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models Paper • 2510.05034 • Published Oct 6, 2025 • 51
Article Introducing Daggr: Chain apps programmatically, inspect visually Jan 29 • 106
Article Scaling OpenEnv: From Free Usage to Thousands of Concurrent Environments Jan 20 • 12
Transition Matching Distillation for Fast Video Generation Paper • 2601.09881 • Published Jan 14 • 34