Gemma 4 26B-A4B-IT GPTQ Int4

GPTQ INT4 quantization of google/gemma-4-26B-A4B-it with group_size=16 for TP=4 compatibility.

Key: group_size=16

Gemma 4 has intermediate_size=2112. Tensor parallelism splits that dimension across GPUs, so with TP=4 each shard gets 2112/4 = 528 columns, and the GPTQ group size must divide the shard evenly:

  • group_size=128: 528/128 = 4.125 (FAILS: not an integer)
  • group_size=32: 528/32 = 16.5 (FAILS)
  • group_size=16: 528/16 = 33 (WORKS)
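The divisibility check above can be sketched in a few lines; the shard size of 528 follows from intermediate_size=2112 at TP=4:

```python
# Check which GPTQ group sizes evenly divide the per-GPU shard of the
# MLP intermediate dimension. Gemma 4's intermediate_size is 2112 (from
# the card above); TP=4 splits it into shards of 2112 / 4 = 528 columns.
INTERMEDIATE_SIZE = 2112
TP_DEGREE = 4

shard = INTERMEDIATE_SIZE // TP_DEGREE  # 528 columns per GPU

for group_size in (128, 32, 16):
    ok = shard % group_size == 0
    print(f"group_size={group_size}: {shard}/{group_size}={shard / group_size} "
          f"({'WORKS' if ok else 'FAILS'})")
```

Only group_size=16 yields an integer number of groups (33) per shard, which is why it was chosen here despite the extra quantization overhead of smaller groups.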

Model Details

Spec          Value
Base Model    google/gemma-4-26B-A4B-it
Architecture  Gemma4ForConditionalGeneration (MoE)
Total Params  26B (3.8B active per token)
Experts       128 routed + 1 shared
Quantization  GPTQ INT4, group_size=16, sym=True
Tool          GPTQModel v6.1.0-dev
Calibration   128 samples from allenai/c4

Protected Layers (BF16)

  • Vision tower (191 layers)
  • Router gates (30 layers)
  • lm_head

All expert MLPs (including the shared expert) are quantized to INT4 uniformly.
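The protect list amounts to name matching over the checkpoint's modules. A minimal sketch of that logic follows; the pattern strings and module paths below are illustrative, not the exact names from the Gemma 4 checkpoint:

```python
# Hypothetical sketch of the protect-list logic: any module whose name
# matches a protected pattern stays in BF16; the remaining expert MLP
# projections are quantized to INT4. Pattern strings and example module
# paths are assumptions, not taken from the real checkpoint.
PROTECTED_PATTERNS = ("vision_tower", "router", "gate", "lm_head")

def is_protected(module_name: str) -> bool:
    """Return True if the module should be kept in BF16."""
    return any(p in module_name for p in PROTECTED_PATTERNS)

for name in (
    "model.layers.0.mlp.experts.7.up_proj",  # expert MLP -> quantize
    "model.layers.0.mlp.router",             # router gate -> keep BF16
    "vision_tower.blocks.3.attn.qkv",        # vision tower -> keep BF16
    "lm_head",                               # output head -> keep BF16
):
    print(name, "-> BF16" if is_protected(name) else "-> INT4")
```

Keeping routers and the vision tower in BF16 is common practice for MoE quantization, since routing logits and vision features are sensitive to low-bit error.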

Serving
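As a hedged sketch (not confirmed by the original card), a GPTQ checkpoint sharded for TP=4 like this one would typically be launched with vLLM along these lines:

```shell
# Assumed vLLM launch: --tensor-parallel-size 4 matches the TP=4
# sharding that group_size=16 was chosen for; vLLM detects GPTQ
# quantization from the checkpoint's quantization config.
vllm serve raydelossantos/gemma-4-26B-A4B-it-GPTQ-Int4 \
    --tensor-parallel-size 4
```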

