This is google/gemma-4-31B-it quantized to NVFP4 with AutoRound. The model is compatible with vLLM (tested with v0.19 on an RTX Pro 6000). It is currently under evaluation.
- Developed by: The Kaitchup
Instructions

```shell
uv pip install vllm
uv pip install git+https://github.com/huggingface/transformers.git
vllm serve [this model ID] --max-model-len 262144 --reasoning-parser gemma4
```
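Once the server is up, it exposes an OpenAI-compatible endpoint that can be queried over HTTP. A minimal sketch, assuming the default port 8000 and that the served model name matches this repository ID (adjust both if you launched `vllm serve` differently):

```python
import json
import urllib.request

# Default endpoint for `vllm serve`; change host/port if you override them.
URL = "http://localhost:8000/v1/chat/completions"

# "model" must match the ID passed to `vllm serve` (assumed here).
payload = {
    "model": "kaitchup/gemma-4-31B-it-autoround-nvfp4-all",
    "messages": [
        {"role": "user", "content": "Explain NVFP4 quantization in one sentence."}
    ],
    "max_tokens": 256,
}

def query(url: str = URL) -> dict:
    """POST the chat request and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example usage (requires the server to be running):
# print(query()["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client (e.g. the `openai` Python package pointed at `http://localhost:8000/v1`) works the same way.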
Model tree for kaitchup/gemma-4-31B-it-autoround-nvfp4-all
- Base model: google/gemma-4-31B-it