Hope to be able to release nightmedia/Qwen3.5-122B-A10B-Text-qx86-hi-mlx
Thank you so much for your team's dedication. I hope you will be able to release nightmedia/Qwen3.5-122B-A10B-Text-qx86-hi-mlx; if it is similar in size to limi-air-qx86-hi-mlx, it should perform very well on a Mac with 128GB of RAM.
I am uploading a quant now, though it seems a bit too big for the 128GB Mac.
Please let me know if it actually works for you.
61G Qwen3.5-122B-A10B-Text-mxfp4-mlx
76G Qwen3.5-122B-A10B-Text-qx64-hi-mlx
105G Qwen3.5-122B-A10B-Text-qx86-hi-mlx
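A minimal way to smoke-test one of these locally once the upload finishes, assuming mlx-lm is installed (the prompt and token budget are just placeholders):

```python
# Quick sanity check of the quant with mlx-lm (pip install mlx-lm).
from mlx_lm import load, generate

model, tokenizer = load("nightmedia/Qwen3.5-122B-A10B-Text-qx86-hi-mlx")

# A short generation is enough to confirm the weights load and run.
print(generate(model, tokenizer,
               prompt="Explain mixture-of-experts routing in two sentences.",
               max_tokens=128))
```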
A Mac with 128GB of RAM is well-suited to models in the 93-99GB range. For example, limi-air-qx86-hi-mlx is 93.13GB, and both its size and performance are excellent. However, if Qwen3.5-122B-A10B-Text-qx86-hi-mlx is 105GB, it cannot be deployed locally, and I'm wondering whether there are other solutions. On the other hand, Qwen3.5-122B-A10B-Text-qx64-hi-mlx is only 76GB, which feels like a waste of the 99GB of available RAM.
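One thing worth checking before loading is the Metal working-set budget macOS actually grants the GPU; a minimal sketch, assuming a recent mlx build that exposes device_info():

```python
import mlx.core as mx

# Query the GPU memory budget macOS reports for this machine.
info = mx.metal.device_info()
budget_gb = info["max_recommended_working_set_size"] / 1024**3
print(f"recommended working set: ~{budget_gb:.0f} GB")

# Disk sizes of the quants above; the weights alone must fit under the
# budget, and KV cache plus activations still need headroom on top.
for name, size_gb in [("qx64-hi", 76), ("qx86-hi", 105)]:
    verdict = "fits" if size_gb < budget_gb else "does not fit"
    print(f"{name}: {size_gb} GB -> {verdict}")
```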
Those are good points, and the alternatives are:
- dwq5: needs a lot of work to find the best training point, and probably more RAM than I currently have
- qx86, no hi: group size 64 everywhere instead of group size 32 on the attention paths. This might fit; a sketch of the idea follows below.

I will try it and let you know.
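For context, here is roughly how a "qx86, no hi" mix could be expressed with mlx-lm's convert() and a custom quant predicate. This is only an illustration of the idea (8-bit attention paths, 6-bit elsewhere, group size 64 throughout), not the actual qx formula, and the source repo id is hypothetical:

```python
from mlx_lm import convert

def qx86_no_hi(path, module, config):
    # mlx-lm lets the predicate return per-layer quantization settings.
    if "self_attn" in path:               # attention paths at 8 bits
        return {"bits": 8, "group_size": 64}
    return {"bits": 6, "group_size": 64}  # everything else at 6 bits

convert(
    "Qwen/Qwen3.5-122B-A10B-Text",        # hypothetical source repo id
    mlx_path="Qwen3.5-122B-A10B-Text-qx86-mlx",
    quantize=True,
    quant_predicate=qx86_no_hi,
)
```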
Of all the combinations I have tried so far, qx85 shows the most promise for staying stable, in the think tag and elsewhere:
86G Qwen3.5-122B-A10B-Text-qx85-mlx
This should leave sufficient room for context.
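As a back-of-envelope check on that headroom (the architecture numbers below are placeholders, not the real Qwen3.5 config):

```python
# Rough KV-cache size estimate: 128 GB of RAM minus ~86 GB of weights
# leaves ~42 GB, and the cache grows linearly with context length.
n_layers, n_kv_heads, head_dim = 60, 8, 128   # hypothetical values
bytes_per_elem = 2                            # fp16 cache

def kv_cache_gb(context_len: int) -> float:
    # K and V each store context_len * n_kv_heads * head_dim per layer.
    return (2 * n_layers * n_kv_heads * head_dim
            * context_len * bytes_per_elem) / 1024**3

for ctx in (32_768, 131_072):
    print(f"{ctx:>7} tokens -> ~{kv_cache_gb(ctx):.1f} GB KV cache")
```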
The vibe test went fine: the model is stable and doesn't get stuck deliberating in the think tag, which seems to be an issue with some quants.
I updated the qx64-hi with the latest layer formula from Qwen, and the model size increased to the point that qx64-hi now fits a 128GB Mac perfectly.