- nvidia/Nemotron-Cascade-2-30B-A3B
  Text Generation • 32B • Updated • 317k • 476
- nvidia/Nemotron-Cascade-2-RL-data
  Viewer • Updated • 55.7k • 1.41k • 47
- nvidia/Nemotron-Cascade-2-SFT-Data
  Viewer • Updated • 15.9M • 18.6k • 54
- Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation
  Paper • 2603.19220 • Published • 66
Collections
Collections including paper arxiv:2603.19220
- MapTrace: Scalable Data Generation for Route Tracing on Maps
  Paper • 2512.19609 • Published • 3
- Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation
  Paper • 2603.19220 • Published • 66
- nyu-mll/glue
  Viewer • Updated • 1.49M • 403k • 489
- Grounding World Simulation Models in a Real-World Metropolis
  Paper • 2603.15583 • Published • 153

- microsoft/bitnet-b1.58-2B-4T
  Text Generation • 0.8B • Updated • 15.5k • 1.43k
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
  Paper • 2504.10449 • Published • 15
- nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
  Text Generation • 8B • Updated • 96 • 17
- ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
  Paper • 2504.11536 • Published • 63