Gemma4-26b-Super-Abliterated-TQ3_4S

Gemma4-26b-Super-Abliterated-TQ3_4S is a GGUF quantization of Jiunsong/supergemma4-26b-abliterated-multimodal using TQ3_4S, a 4.0 bpw Walsh-Hadamard-transform weight format with four per-8 scales.
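To illustrate the general idea behind transform-domain quantization, here is a minimal sketch: rotate each 8-weight block with a Walsh-Hadamard transform, quantize the transformed coefficients to 4-bit integers with one scale per block, and undo the transform on dequantization. This is only a conceptual toy, not the actual TQ3_4S codec (its exact scale layout and bit packing are not documented in this card).

```python
# Toy sketch of WHT-domain block quantization (NOT the real TQ3_4S codec):
# transform an 8-weight block, quantize to symmetric int4, invert on decode.

def hadamard(vec):
    """Fast Walsh-Hadamard transform; length must be a power of two."""
    v = list(vec)
    n, h = len(v), 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return v

def quantize_block(block):
    """4-bit symmetric quantization of one 8-weight block in the WHT domain."""
    t = hadamard(block)
    amax = max(abs(x) for x in t) or 1.0
    scale = amax / 7.0                                  # int4 range [-7, 7]
    q = [max(-7, min(7, round(x / scale))) for x in t]
    return q, scale

def dequantize_block(q, scale):
    t = [x * scale for x in q]
    n = len(q)
    # The Hadamard transform is its own inverse up to a factor of n.
    return [x / n for x in hadamard(t)]

weights = [0.12, -0.55, 0.31, 0.07, -0.2, 0.9, -0.44, 0.05]
q, s = quantize_block(weights)
recon = dequantize_block(q, s)
err = max(abs(a - b) for a, b in zip(weights, recon))
print(q, round(s, 4), round(err, 4))
```

The point of the rotation is that outlier weights get spread across all coefficients of the block before quantization, which tends to reduce worst-case rounding error.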

Files

  • Gemma4-26b-Super-Abliterated-TQ3_4S.gguf (12.0 GiB)
  • mmproj-f16.gguf (1.2 GiB)
  • chat_template.jinja

Runtime Requirement

This model requires the public TurboQuant runtime fork of llama.cpp; it will not load on stock llama.cpp or any other runtime that lacks TQ3_4S support.

Text-Only Run

./build/bin/llama-server \
  -m /path/to/Gemma4-26b-Super-Abliterated-TQ3_4S.gguf \
  -ngl 99 -c 4096 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --cache-ram 0 --no-warmup --jinja \
  --chat-template-file /path/to/chat_template.jinja
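Once the server is up, it can be smoke-tested from a client. The sketch below assumes the TurboQuant fork keeps stock llama-server's OpenAI-compatible /v1/chat/completions endpoint and the default port 8080; both are assumptions, not something this card states.

```python
# Smoke test against a running llama-server instance. Endpoint path and
# port 8080 are assumed from stock llama-server, not stated in this card.
import json
import urllib.request
import urllib.error

def build_payload(prompt):
    return {
        "model": "Gemma4-26b-Super-Abliterated-TQ3_4S",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }

def chat(prompt, host="http://127.0.0.1:8080"):
    req = urllib.request.Request(
        host + "/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    try:
        print(chat("Translate 'hello' to Spanish."))
    except (urllib.error.URLError, OSError) as e:
        print("server not reachable:", e)
```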

Vision / Image Input

./build/bin/llama-server \
  -m /path/to/Gemma4-26b-Super-Abliterated-TQ3_4S.gguf \
  -mm /path/to/mmproj-f16.gguf \
  -ngl 99 -c 4096 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --cache-ram 0 --no-warmup --jinja \
  --chat-template-file /path/to/chat_template.jinja \
  --no-mmproj-offload
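For image input, stock llama-server accepts images embedded as base64 data URLs in the OpenAI-style message format; assuming the fork behaves the same (unverified), a request body can be built like this:

```python
# Sketch of a vision request body. The data-URL image format follows stock
# llama-server's OpenAI-compatible multimodal API; unverified for this fork.
import base64
import json

def build_vision_payload(prompt, image_bytes, mime="image/png"):
    data_url = "data:%s;base64,%s" % (
        mime, base64.b64encode(image_bytes).decode())
    return {
        "model": "Gemma4-26b-Super-Abliterated-TQ3_4S",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": data_url}},
            ],
        }],
        "max_tokens": 128,
    }

# POST this JSON to the server's /v1/chat/completions endpoint, as in the
# text-only case.
payload = build_vision_payload("Describe this image.", b"\x89PNG fake bytes")
print(json.dumps(payload)[:80])
```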

Performance (RTX 5060 Ti 16GB)

Metric   Value
PP512    2154 tok/s
TG128    91.3 tok/s
Size     12.01 GiB
BPW      4.09
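The size and BPW figures are mutually consistent with the stated parameter count; a quick back-of-the-envelope check:

```python
# Sanity-check the reported bits-per-weight: 12.01 GiB at 4.09 bpw should
# imply roughly 25 billion parameters.
size_bits = 12.01 * 2**30 * 8      # file size in bits
params = size_bits / 4.09          # implied parameter count
print(round(params / 1e9, 1))      # ≈ 25.2 billion
```

The effective 4.09 bpw is slightly above the nominal 4.0 because the per-block scales add overhead on top of the packed weight bits.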

Quality

Scored 10/10 on a basic QA sanity check (capital of France, 2 + 2, Python string reversal, gravity, WW2, prime numbers, boiling point, Shakespeare, Jupiter, "hello" → "Hola"). Note this is a quick smoke test, not a formal benchmark.

Base Model

Credits

  • Jiunsong for the SuperGemma4 abliterated multimodal model
  • Google DeepMind for Gemma 4
  • huihui-ai for the abliteration technique

License

Same license terms as the base model apply (Gemma license).
