This is google/gemma-4-31B-it quantized to NVFP4 with AutoRound. It is compatible with vLLM (tested with v0.19) and was tested on an RTX Pro 6000. Evaluation is still in progress.

Instructions

uv pip install vllm
uv pip install git+https://github.com/huggingface/transformers.git
vllm serve [this model ID] --max-model-len 262144 --reasoning-parser gemma4
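Once the server is up, vLLM exposes an OpenAI-compatible API. Below is a minimal Python sketch of a chat-completion request; the port (vLLM's default, 8000) and the served model name are assumptions, so adjust both to match your `vllm serve` invocation:

```python
import json
import urllib.request

# Assumed endpoint: vLLM's OpenAI-compatible server on its default port 8000.
URL = "http://localhost:8000/v1/chat/completions"

# "model" must match the model ID passed to `vllm serve` (assumed here).
payload = {
    "model": "kaitchup/gemma-4-31B-it-autoround-nvfp4-all",
    "messages": [
        {"role": "user", "content": "Summarize NVFP4 quantization in one sentence."}
    ],
    "max_tokens": 256,
}

request = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment to send the request against a running server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client (e.g. the `openai` Python package pointed at `http://localhost:8000/v1`) works the same way.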

Model tree for kaitchup/gemma-4-31B-it-autoround-nvfp4-all
