Huihui-Qwen3.5-9B-abliterated GPTQ-Pro 4bit (g64)
This is a GPTQ-Pro 4-bit quantization of huihui-ai/Huihui-Qwen3.5-9B-abliterated.
It was quantized with group size 64 and evaluated against the original model on Wikitext-2 using a strided perplexity setup, plus KL and token-agreement checks.
Highlights
- Base model: huihui-ai/Huihui-Qwen3.5-9B-abliterated
- Quantization: GPTQ-Pro, 4-bit, group size 64
- Calibration samples: 128
- Quantization time: about 11.1 minutes
- Quantized strided perplexity: 9.6579
- Original strided perplexity: 9.5234
- Perplexity degradation: 1.41%
- Average KL divergence vs original: 0.03423
- Top-1 agreement vs original: 91.96%
- Top-5 agreement vs original: 99.98%
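The degradation figure above is simply the relative increase in strided perplexity; a minimal sketch using the values reported on this card:

```python
# Values copied from the Highlights section above.
quant_ppl = 9.6579
orig_ppl = 9.5234

# Degradation = relative increase of the quantized model's perplexity.
degradation = (quant_ppl - orig_ppl) / orig_ppl * 100
print(f"{degradation:.2f}%")  # matches the 1.41% reported above
```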
Quality Notes
This quantized build stays very close to the source model in language modeling quality.
- Perplexity regression is small (about 1.4%).
- Average KL divergence from the original model is low (≈0.034).
- Top-5 next-token agreement is near-perfect (99.98%).
- In practice, this should preserve most of the original model's behavior while reducing memory use substantially.
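For intuition, a minimal sketch of how the KL-divergence and top-k agreement checks can be computed from the two models' next-token logits. The helper names and the toy logit values are illustrative, not taken from the actual evaluation harness:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p_logits, q_logits):
    # KL(P || Q) between next-token distributions;
    # P = original model, Q = quantized model.
    p, q = softmax(p_logits), softmax(q_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def topk_agree(p_logits, q_logits, k):
    # True if the original model's argmax token is in the quantized top-k.
    top1 = max(range(len(p_logits)), key=p_logits.__getitem__)
    topk = sorted(range(len(q_logits)), key=q_logits.__getitem__, reverse=True)[:k]
    return top1 in topk

# Toy logits for a 3-token vocabulary; real runs average over a corpus.
orig = [2.0, 1.0, 0.1]
quant = [1.9, 1.1, 0.2]
print(kl_divergence(orig, quant), topk_agree(orig, quant, 1))
```

Averaging these per-position quantities over an evaluation corpus yields the KL and agreement figures reported above.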
Files
- model-00001-of-00002.safetensors
- model-00002-of-00002.safetensors
- quantize_config.json
- tokenizer and config files
Load With Transformers / GPTQModel
```python
from gptqmodel import GPTQModel

model = GPTQModel.load(
    "groxaxo/Huihui-Qwen3.5-9B-abliterated-GPTQ-Pro-4bit-g64",
    device_map="auto",
    trust_remote_code=True,
)
```
Evaluation Summary
Measured locally:
- Quantized strided PPL: 9.6579304371
- Original strided PPL: 9.5233634665
- Quantized chunked PPL: 11.6689118281
- Original chunked PPL: 11.5080707440
- KL divergence: 0.0342324856
- Logit cosine similarity: 0.9935612157
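Both PPL variants reduce to exponentiating the mean per-token negative log-likelihood; they differ only in how context windows are chosen (strided evaluation slides overlapping windows so each scored token sees more context, chunked evaluation uses disjoint windows, which is why its numbers are higher). A minimal sketch with made-up per-token NLLs:

```python
import math

def perplexity(token_nlls):
    # Perplexity = exp of the mean per-token negative log-likelihood (nats).
    return math.exp(sum(token_nlls) / len(token_nlls))

# Toy per-token NLLs; the real numbers above come from scoring Wikitext-2.
nlls = [1.8, 2.0, 2.2]
print(perplexity(nlls))
```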
Prompting
Use the same prompting and chat template behavior as the base model.
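In practice that means rendering conversations with `tokenizer.apply_chat_template(...)`, which uses the template shipped in this repo. For reference, Qwen-family models use a ChatML-style layout; the sketch below hand-builds that layout for illustration only, and the literal special tokens are an assumption — always prefer the tokenizer's own template:

```python
# Illustrative only: ChatML-style prompt layout as used by Qwen-family models.
# The <|im_start|>/<|im_end|> tokens are assumptions; use
# tokenizer.apply_chat_template(messages, add_generation_prompt=True) in practice.
def render_chatml(messages):
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")  # generation prompt
    return "".join(parts)

prompt = render_chatml([{"role": "user", "content": "Hello!"}])
print(prompt)
```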
Disclaimer
This repo contains only the quantized checkpoint. Please review the base model card for intended use, limitations, and licensing details.
Model tree for groxaxo/Huihui-Qwen3.5-9B-abliterated-GPTQ-Pro-4bit-g64
- Base model: Qwen/Qwen3.5-9B-Base
- Finetuned: Qwen/Qwen3.5-9B
- Finetuned: huihui-ai/Huihui-Qwen3.5-9B-abliterated