SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding Paper • 2604.09557 • Published Feb 10 • 10
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Paper • 2604.12374 • Published 2 days ago • 17
Introducing SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding Article • 27 days ago • 45
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs Paper • 2411.19146 • Published Nov 28, 2024 • 20