Huihui-Qwen3.5-9B-abliterated GPTQ-Pro 4bit (g64)

This is a GPTQ-Pro 4-bit quantization of huihui-ai/Huihui-Qwen3.5-9B-abliterated.

It was quantized with group size 64 and evaluated against the original model on Wikitext-2 using a strided perplexity setup, plus KL and token-agreement checks.
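In a strided setup, evaluation windows overlap so that each scored token sees more left context than in naive fixed-size chunking, which is why the strided and chunked PPL figures reported below differ. A minimal sketch of the windowing logic (the function name and signature are illustrative, not taken from the actual evaluation harness):

```python
# Hedged sketch of strided-window bookkeeping for perplexity evaluation.
# Each window covers max_len tokens, but only the tokens not covered by an
# earlier window are scored; perplexity is exp(mean NLL over scored tokens).
def window_spans(seq_len, max_len, stride):
    """Return (begin, end, n_scored) spans over a token sequence."""
    spans = []
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_len, seq_len)
        n_scored = end - prev_end  # tokens not yet scored by earlier windows
        spans.append((begin, end, n_scored))
        prev_end = end
        if end == seq_len:
            break
    return spans
```

Every token is scored exactly once; with `stride < max_len` the windows overlap, so most tokens are conditioned on up to `max_len - stride` extra context tokens.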

Highlights

  • Base model: huihui-ai/Huihui-Qwen3.5-9B-abliterated
  • Quantization: GPTQ-Pro, 4-bit, group size 64
  • Calibration samples: 128
  • Quantization time: about 11.1 minutes
  • Quantized strided perplexity: 9.6579
  • Original strided perplexity: 9.5234
  • Perplexity degradation: 1.41%
  • Average KL divergence vs original: 0.03423
  • Top-1 agreement vs original: 91.96%
  • Top-5 agreement vs original: 99.98%

Quality Notes

This quantized build stays very close to the source model in language modeling quality.

  • Perplexity regression is small.
  • KL divergence is low.
  • Top-5 next-token agreement is effectively perfect.
  • In practice, this should preserve most of the original model's behavior while reducing memory use substantially.
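The KL and agreement figures above can be reproduced in spirit with a simple per-position comparison of the two models' logits. This is an illustrative sketch, not the card's actual measurement harness; in particular, "top-5 agreement" is implemented here as one common definition (the original model's top-1 token appearing in the quantized model's top 5):

```python
import numpy as np

# Compare original vs. quantized logits at each position:
#   - mean KL(original || quantized) over softmax distributions
#   - top-1 agreement of the argmax tokens
#   - top-k agreement: original's argmax inside quantized top-k
def compare_logits(orig_logits, quant_logits, k=5):
    def softmax(x):
        x = x - x.max(axis=-1, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=-1, keepdims=True)

    p, q = softmax(orig_logits), softmax(quant_logits)
    kl = float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean())
    top1 = float((p.argmax(-1) == q.argmax(-1)).mean())
    topk = np.argsort(q, axis=-1)[:, -k:]
    topk_agree = float((topk == p.argmax(-1, keepdims=True)).any(axis=-1).mean())
    return kl, top1, topk_agree
```

For identical logits this returns a KL of 0 and agreement of 1.0; the card's 0.034 KL and ~92% top-1 agreement indicate the quantized distribution stays close to the original.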

Files

  • model-00001-of-00002.safetensors
  • model-00002-of-00002.safetensors
  • quantize_config.json
  • tokenizer and config files
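For orientation, a quantize_config.json for a GPTQ checkpoint typically looks like the sketch below. Only `bits` and `group_size` are stated in this card; the remaining keys are common GPTQ config fields shown for illustration and may differ from the actual file:

```json
{
  "bits": 4,
  "group_size": 64,
  "desc_act": true,
  "sym": true
}
```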

Load With Transformers / GPTQModel

from gptqmodel import GPTQModel  # pip install gptqmodel

model = GPTQModel.load(
    "groxaxo/Huihui-Qwen3.5-9B-abliterated-GPTQ-Pro-4bit-g64",
    device_map="auto",
    trust_remote_code=True,
)

Evaluation Summary

Measured locally:

  • Quantized strided PPL: 9.6579304371
  • Original strided PPL: 9.5233634665
  • Quantized chunked PPL: 11.6689118281
  • Original chunked PPL: 11.5080707440
  • KL divergence: 0.0342324856
  • Logit cosine similarity: 0.9935612157

Prompting

Use the same prompting and chat template behavior as the base model.

Disclaimer

This repo contains only the quantized checkpoint. Please review the base model card for intended use, limitations, and licensing details.

Model size: 9B params (safetensors; tensor types BF16 and I32)
Model tree for groxaxo/Huihui-Qwen3.5-9B-abliterated-GPTQ-Pro-4bit-g64

  • Finetuned from: Qwen/Qwen3.5-9B
  • This model is one of 11 quantized variants of the base model.