NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 15 items • Updated 8 days ago • 267
Nemotron-Cascade 2 Collection Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • 4 items • Updated 8 days ago • 48
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published 26 days ago • 66
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 230
Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models Paper • 2512.13607 • Published Dec 15, 2025 • 38
Nemotron-Cascade Collection Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 14 items • Updated 8 days ago • 54
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 126
Perception-Aware Policy Optimization for Multimodal Reasoning Paper • 2507.06448 • Published Jul 8, 2025 • 48
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30, 2025 • 146
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17, 2025 • 97
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks Paper • 2501.11733 • Published Jan 20, 2025 • 28
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis Paper • 2501.04561 • Published Jan 8, 2025 • 16