GGUF quantization of Qwen/Qwen3.5-122B-A10B, optimized for local inference with llama.cpp:

```shell
llama-cli \
  -m Qwen3.5-122B-A10B-Q6_K.gguf \
  -p "Your prompt here" \
  -n 200
```
Q6_K preserves more of the original model's quality at the cost of a larger file (~95 GB vs. ~69 GB). Use it if you have sufficient RAM and prioritize output quality over speed; for a smaller footprint, see the Q4_K_M version.
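The size figures above imply Q6_K is roughly a third larger than Q4_K_M; a quick sanity check with shell integer arithmetic (the 95 GB and 69 GB values are taken from the text above, not measured):

```shell
# Approximate GGUF sizes from this card (GB)
q6=95
q4=69
# Percentage by which Q6_K exceeds Q4_K_M (integer division)
extra=$(( (q6 - q4) * 100 / q4 ))
echo "Q6_K is ~${extra}% larger than Q4_K_M"   # prints: Q6_K is ~37% larger than Q4_K_M
```

As a rule of thumb, leave headroom beyond the file size itself, since the KV cache and runtime buffers consume additional memory on top of the model weights.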
- Quantization: 6-bit (Q6_K)
- Base model: Qwen/Qwen3.5-122B-A10B