Auto-Quantized GGUF Model

This repository contains automated GGUF quantization files for huihui-ai/Huihui-Qwen3.5-9B-Claude-4.6-Opus-abliterated.

  • Q5_K_M: Moderately compressed; the imatrix is used effectively, making it nearly indistinguishable from the original precision. (recommended)
  • IQ4_XS: Compressed to the smallest footprint with only slight quality degradation.
The calibration data for the imatrix is targeted at Chinese novels and role-playing (RP), while preserving logic and common sense.
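As a rough intuition for why the imatrix helps: it weights each weight's quantization error by how much that weight matters to the model's output, so the quantizer spends its limited precision where it counts. A toy pure-Python sketch of importance-weighted scale selection — an illustration of the idea only, not llama.cpp's actual algorithm (the candidate grid and importance values here are made up):

```python
import random

QMAX = 2 ** (5 - 1) - 1  # 5-bit symmetric grid: integers in [-15, 15]

def quantize(w, scale):
    # round each weight to the nearest representable grid point
    return [max(-QMAX, min(QMAX, round(x / scale))) * scale for x in w]

def weighted_err(w, imp, scale):
    # importance-weighted squared reconstruction error
    return sum(i * (x - q) ** 2 for x, q, i in zip(w, quantize(w, scale), imp))

def best_scale(w, imp):
    # start from the naive max-abs scale, then search nearby candidates
    # for the one minimizing the importance-weighted error
    base = max(abs(x) for x in w) / QMAX
    candidates = [base * (0.5 + 0.7 * k / 49) for k in range(50)] + [base]
    return min(candidates, key=lambda s: weighted_err(w, imp, s))

random.seed(0)
w = [random.gauss(0, 1) for _ in range(256)]
imp = [random.uniform(0.1, 10.0) for _ in range(256)]  # stand-in for imatrix values

naive = max(abs(x) for x in w) / QMAX
s = best_scale(w, imp)
# the searched scale is never worse than the plain max-abs scale
print(weighted_err(w, imp, s) <= weighted_err(w, imp, naive))  # True
```

Without importance weighting, every channel's error counts equally; with it, the chosen scale sacrifices accuracy on low-impact weights to protect high-impact ones.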


📊 Perplexity Evaluation

(Tested against the provided calibration dataset; since the imatrix was computed on this same data, the quantized models can score at or even below the base model's PPL here.)

  • Base (F16/BF16): PPL = 16.4336 +/- 0.14178
  • Q5_K_M: PPL = 14.6771 +/- 0.12020
  • IQ4_XS: PPL = 14.9560 +/- 0.12308
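For reference, the perplexity reported above is the exponential of the mean per-token negative log-likelihood of the evaluation text, so lower means the model finds the text more predictable. A minimal illustration:

```python
import math

def perplexity(token_logprobs):
    # PPL = exp(-1/N * sum(log p(token_i)))
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# three tokens each assigned probability 1/e by the model -> PPL = e
print(perplexity([-1.0, -1.0, -1.0]))  # 2.718281828459045
```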
Format: GGUF
Model size: 9B params
Architecture: qwen35

Provided quantizations: 4-bit (IQ4_XS), 5-bit (Q5_K_M)


Repository: nuofang/Huihui-Qwen3.5-9B-Claude-4.6-Opus-abliterated-GGUF
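The quantized files can be run directly with llama.cpp. A usage sketch — the exact `.gguf` filename below is an assumption based on the repo's naming convention, so check the repository's file listing first:

```shell
# download one quant from the repo (filename assumed; verify in the Files tab)
huggingface-cli download nuofang/Huihui-Qwen3.5-9B-Claude-4.6-Opus-abliterated-GGUF \
  Huihui-Qwen3.5-9B-Claude-4.6-Opus-abliterated-Q5_K_M.gguf --local-dir .

# start an interactive chat with llama.cpp
llama-cli -m Huihui-Qwen3.5-9B-Claude-4.6-Opus-abliterated-Q5_K_M.gguf -cnv
```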