Nemotron-Cascade 2 Collection Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • 4 items • Updated 3 days ago • 48
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published Mar 6 • 119
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 15 items • Updated 3 days ago • 268
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano and Super v3. • 28 items • Updated 3 days ago • 123
MAS-Orchestra: Understanding and Improving Multi-Agent Reasoning Through Holistic Orchestration and Controlled Benchmarks Paper • 2601.14652 • Published Jan 21 • 4
ToolComp: A Multi-Tool Reasoning & Process Supervision Benchmark Paper • 2501.01290 • Published Jan 2, 2025 • 1
Nemotron-Terminal Collection We are releasing Nemotron-Terminal models and training datasets. • 5 items • Updated 3 days ago • 34
Endless Terminals: Scaling RL Environments for Terminal Agents Paper • 2601.16443 • Published Jan 23 • 18
TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents Paper • 2602.07274 • Published Feb 6 • 208
Decoding as Optimisation on the Probability Simplex: From Top-K to Top-P (Nucleus) to Best-of-K Samplers Paper • 2602.18292 • Published Feb 20 • 13
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28, 2025 • 179
view article Article Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective Jan 27 • 71
Enterprise Agents and Benchmarks Collection Enterprise agent ecosystem featuring AssetOpsBench (industrial) and ITBench (SRE, FinOps, CISO), CUGA to accelerate AI Automation • 16 items • Updated 9 days ago • 15
Toward Efficient Agents: Memory, Tool learning, and Planning Paper • 2601.14192 • Published Jan 20 • 57
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering Paper • 2507.11527 • Published Jul 15, 2025 • 35