GGUF quants for Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated
I've recreated these after the late-December 2025 llama.cpp update that speeds up Qwen3 Next, so they should perform better than the earlier quants for this model. I've uploaded three quants:
iQ3_M – should fit (tightly) on systems with 32 GB of RAM plus an 8–12 GB GPU, with some layers offloaded to system RAM. Possibly the lowest useful quant.
MXFP4_MOE – should work on systems with 32 GB of RAM plus a GPU with 16 GB or more.
Q6_K – will work well on systems with 64 GB of RAM plus RAM offloading. Quality is said to be nearly indistinguishable from Q8.
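As a rough sketch of how to fetch and run one of these quants with llama.cpp (the filename and layer count below are assumptions; check the repo's file list and tune `-ngl` to your GPU's VRAM):

```shell
# Download one quant file from this repo (exact filename is an assumption;
# see the repo's Files tab for the real .gguf names).
huggingface-cli download juanml82/Huihui-Qwen3-Next-80B-A3B-Instruct-abliterated-gguf \
  Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated-MXFP4_MOE.gguf --local-dir .

# Run it: -ngl offloads that many layers to the GPU, the rest stay in system RAM;
# lower -ngl if you run out of VRAM, raise it if you have headroom.
llama-cli -m Huihui-Qwen3-Next-80B-A3B-Thinking-abliterated-MXFP4_MOE.gguf \
  -ngl 24 -c 8192 -p "Hello"
```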
Enjoy!
Model tree for juanml82/Huihui-Qwen3-Next-80B-A3B-Instruct-abliterated-gguf
Base model: Qwen/Qwen3-Next-80B-A3B-Instruct