GGUF quants for Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated

I've recreated these quants after the late December 2025 llama.cpp update that speeds up Qwen3 Next, so they should perform better than the earlier quants for this model. I've uploaded three quants:

IQ3_M – should fit (tightly) on systems with 32 GB of RAM plus an 8-12 GB GPU, using partial offloading to system RAM. Probably the lowest useful quant.

MXFP4_MOE – should work on systems with 32 GB of RAM plus a GPU with 16 GB or more of VRAM.

Q6_K – works well on systems with 64 GB of RAM plus partial offloading to system RAM. Quality is reported to be nearly indistinguishable from Q8.
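As a rough sanity check on the RAM figures above, a GGUF file's size is roughly parameter count times average bits per weight divided by eight. The bits-per-weight values below are approximate llama.cpp averages and an assumption here, not measured sizes of these uploads (actual files differ because some tensors are kept at higher precision):

```python
# Rough GGUF size estimate: bytes ≈ params * bits-per-weight / 8.
# The bpw values are approximate llama.cpp averages (assumption),
# so treat the results as ballpark figures, not exact file sizes.
PARAMS = 80e9  # 80B-parameter model

def est_size_gb(bpw: float) -> float:
    """Approximate quantized file size in GB for a given bits-per-weight."""
    return PARAMS * bpw / 8 / 1e9

for name, bpw in [("IQ3_M", 3.66), ("MXFP4_MOE", 4.25), ("Q6_K", 6.56)]:
    print(f"{name}: ~{est_size_gb(bpw):.0f} GB")
```

Add a few GB on top of the file size for KV cache and runtime buffers when judging whether a quant fits your RAM plus VRAM budget.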

Enjoy!

Model size: 80B params · Architecture: qwen3next