NVIDIA-Nemotron-3-Nano-30B-A3B Quantized Models
Updated Mar 2

FP8-dynamic, FP8-block, NVFP4, and INT4 versions of nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B.
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 • Text Generation • 32B • Updated Jan 9 • 3
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 • 18B • Updated Jan 15 • 6
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-quantized.w4a16 • 6B • Updated Jan 7 • 6
inference-optimization/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8-dynamic • 32B • Updated Jan 6 • 5
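The formats in this collection differ mainly in the number of bits used per weight (16-bit baseline, 8-bit FP8 variants, 4-bit NVFP4 and INT4 w4a16). As a rough, illustrative sketch only: the back-of-the-envelope arithmetic below estimates the weight footprint for a ~30B-parameter model, ignoring quantization scales, the KV cache, and activations, so real checkpoint sizes will differ.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weight tensors alone, in gigabytes.

    Ignores quantization scale/zero-point metadata, KV cache, and
    activation memory, so this is a lower bound on real usage.
    """
    return n_params * bits_per_weight / 8 / 1e9

# ~30B parameters, as in NVIDIA-Nemotron-3-Nano-30B-A3B.
base = 30e9
for name, bits in [
    ("BF16 baseline", 16),
    ("FP8 / FP8-dynamic", 8),
    ("NVFP4", 4),
    ("INT4 (w4a16)", 4),
]:
    print(f"{name:18s} ~ {weight_footprint_gb(base, bits):.0f} GB")
```

By this estimate the 8-bit variants roughly halve the BF16 weight footprint and the 4-bit variants quarter it, which is the usual motivation for publishing multiple quantized versions of one base model.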