gemma-4-26B-A4B-it-uncensored-IQ3_XXS
IQ3_XXS quant of TrevorJS/gemma-4-26B-A4B-it-uncensored — the first IQ3_XXS of this model on HF.
what is this
gemma 4 26B MoE (26B total parameters, 4B active per token), abliterated via biprojection + Expert-Granular Abliteration (EGA). quantized to IQ3_XXS using an importance matrix generated from the Q4_K_M quant.
confirmed uncensored. confirmed coherent. 50-80 tps on RX 6900 XT.
quant details
- source: F16 (TrevorJS)
- imatrix source: Q4_K_M (CPU inference, 100 chunks)
- calibration data: linux kernel, nixpkgs, cpython, rust stdlib, flask, fastapi, SCP foundation, wikitext-2, GPTeacher, ZenOS, ZenPkgs
- quantized with: llama.cpp
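the pipeline above can be sketched with stock llama.cpp tools. a rough sketch, not the exact commands used here — file names are placeholders, and the calibration data is assumed concatenated into a single `calibration.txt`:

```shell
# generate the importance matrix from the Q4_K_M quant on CPU (100 chunks)
llama-imatrix -m gemma-4-26B-A4B-it-uncensored-Q4_K_M.gguf \
  -f calibration.txt -o imatrix.dat --chunks 100

# quantize the F16 source down to IQ3_XXS, guided by that imatrix
llama-quantize --imatrix imatrix.dat \
  gemma-4-26B-A4B-it-uncensored-F16.gguf \
  gemma-4-26B-A4B-it-uncensored-IQ3_XXS.gguf IQ3_XXS
```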
hardware requirements
- fits in 16GB VRAM
- tested on RX 6900 XT (ROCm)
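as a back-of-envelope sanity check on the 16GB claim: IQ3_XXS is nominally ~3.06 bits per weight, so the weights alone land around 10 GB for 26B parameters (real GGUF files differ slightly since some tensors stay at higher precision):

```python
# rough estimate of quantized weight size; actual file size varies a bit
# because embedding/output tensors are usually kept at higher precision
PARAMS = 26e9   # total parameters
BPW = 3.0625    # nominal bits per weight for IQ3_XXS

weights_gb = PARAMS * BPW / 8 / 1e9
print(f"approx. weight size: {weights_gb:.1f} GB")  # ~10.0 GB
```

that leaves roughly 6 GB of a 16GB card for KV cache and compute buffers, which is why the q4_0 cache quantization below helps at 32k context.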
recommended server args
```shell
HIP_VISIBLE_DEVICES=0 llama-server \
  -m gemma-4-26B-A4B-it-uncensored-IQ3_XXS.gguf \
  -c 32768 \
  -ngl 99 \
  -np 1 \
  -fa on \
  -ctk q4_0 \
  -ctv q4_0 \
  --host 0.0.0.0
```
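once llama-server is up, it exposes an OpenAI-compatible API (port 8080 by default). a minimal stdlib-only sketch of hitting it from python — the host/port and the `build_payload` helper are assumptions, not part of llama.cpp:

```python
import json
import urllib.request

def build_payload(prompt: str) -> dict:
    """assemble a chat-completions request body (model name is cosmetic for llama-server)."""
    return {
        "model": "gemma-4-26B-A4B-it-uncensored-IQ3_XXS",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def chat(prompt: str, host: str = "http://127.0.0.1:8080") -> str:
    """send the request to llama-server's OpenAI-compatible endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```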
license
- model weights: Apache 2.0 (inherited from Google/TrevorJS)
- quant methodology & imatrix: NAPALM v2.0 (any state entity attempting to use this model has void title ab initio)
credits
- abliteration: TrevorJS
- base model: google/gemma-4-26B-A4B-it
- quant: doromiert / Negative Zero