hope to be able to release nightmedia/Qwen3.5-122B-A10B-Text-qx86-hi-mlx

#1
by mimeng1990 - opened

Thank you so much for your team's dedication. I hope you will be able to release nightmedia/Qwen3.5-122B-A10B-Text-qx86-hi-mlx, which would be similar in size to limi-air-qx86-hi-mlx and should perform very well on a Mac with 128GB of RAM.

I am uploading a quant now, though it seems a bit too big for a 128GB Mac.

Please let me know if it actually works for you.

 61G	Qwen3.5-122B-A10B-Text-mxfp4-mlx
 76G	Qwen3.5-122B-A10B-Text-qx64-hi-mlx
105G	Qwen3.5-122B-A10B-Text-qx86-hi-mlx
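For rough planning, the on-disk size of a quant can be estimated from the parameter count and the effective bits per weight. This is a sketch under stated assumptions: the bit widths below are illustrative, and the formula ignores per-group scale overhead and unquantized layers such as embeddings and norms.

```python
def quant_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a quantized model in GiB.

    Ignores per-group scale/bias overhead (beyond what is folded into
    bits_per_weight) and any layers left unquantized.
    """
    return n_params * bits_per_weight / 8 / (1024 ** 3)

# 122B parameters at an assumed 4 and 8 effective bits per weight
print(round(quant_size_gib(122e9, 4.0), 1))  # ~4-bit quant
print(round(quant_size_gib(122e9, 8.0), 1))  # ~8-bit quant
```

The real files land higher than the pure-weight estimate because mixed-precision recipes raise the average bits per weight on the layers that matter.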

A Mac with 128GB of RAM is well-suited for deploying models in the 93GB to 99GB range. For example, limi-air-qx86-hi-mlx is 93.13GB, and both its size and performance are excellent. However, if Qwen3.5-122B-A10B-Text-qx86-hi-mlx is 105GB, it cannot be deployed locally, so I am wondering whether there are other options. On the other hand, Qwen3.5-122B-A10B-Text-qx64-hi-mlx is only 76GB, which leaves much of the roughly 99GB of usable RAM idle.
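The fit-or-not question above can be sketched as a one-line budget check. The reserve figure here is an assumption, not a measured number: some unified memory must stay free for macOS, the MLX runtime, and KV cache.

```python
def fits_on_mac(model_gb: float, ram_gb: float = 128.0,
                reserve_gb: float = 24.0) -> bool:
    """True if the model leaves at least reserve_gb of unified memory free.

    The 24GB reserve is an assumed allowance for the OS, runtime, and
    context cache, not a measured figure.
    """
    return model_gb + reserve_gb <= ram_gb

for size_gb in (76.0, 93.13, 105.0):
    print(size_gb, fits_on_mac(size_gb))
```

With that assumed reserve, the 76GB and 93.13GB quants fit on a 128GB machine and the 105GB one does not, which matches the experience reported in this thread.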

Those are good points, and the alternatives are:

  • dwq5: needs a lot of work to find the best training point, and probably more RAM than I have now to do it
  • qx86 (no "hi"): group size 64 everywhere, instead of group size 32 on attention paths. This might fit
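The "hi" distinction above can be expressed as a per-layer policy. This is only my reading of the naming scheme, sketched as a plain function; the bit widths and the layer-name patterns are illustrative, not the recipe actually used for these uploads.

```python
def quant_params(layer_path: str, hi: bool) -> dict:
    """Return assumed quantization settings for one layer.

    Reading of the scheme: "hi" variants use group size 32 on attention
    projections and group size 64 elsewhere; plain variants use group
    size 64 everywhere. Bit widths here are illustrative only.
    """
    attention = any(k in layer_path
                    for k in ("q_proj", "k_proj", "v_proj", "o_proj"))
    if hi and attention:
        return {"bits": 8, "group_size": 32}
    return {"bits": 8 if attention else 6, "group_size": 64}

# Attention path: finer groups only in the "hi" variant
print(quant_params("model.layers.0.self_attn.q_proj", hi=True))
print(quant_params("model.layers.0.self_attn.q_proj", hi=False))
```

Smaller groups store more scale factors per weight, which is why the "hi" variants come out larger on disk for the same nominal bit widths.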

I will try it and let you know.

Of all the combinations I have tried so far, qx85 shows the most promise: it stays stable inside the think tag and elsewhere.

 86G	Qwen3.5-122B-A10B-Text-qx85-mlx

This should leave sufficient room for context.
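How much room context actually needs can be estimated from the KV cache formula: keys plus values for every layer, per token. All architecture numbers in the example call are placeholders, not the real Qwen3.5-122B-A10B config, and quantized KV cache would shrink this further.

```python
def kv_cache_gib(seq_len: int, n_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_elt: int = 2) -> float:
    """KV cache size in GiB: keys + values for every layer.

    bytes_per_elt=2 assumes fp16/bf16 cache entries. The architecture
    numbers passed below are hypothetical, not the real model config.
    """
    return (2 * n_layers * n_kv_heads * head_dim * seq_len
            * bytes_per_elt) / (1024 ** 3)

# 32k tokens on a hypothetical 60-layer, 8-KV-head, 128-dim model
print(round(kv_cache_gib(32768, 60, 8, 128), 2))
```

Even a few GiB of cache matters when the weights already take 86GB of a 128GB machine, which is why the gap the smaller quant leaves is worth having.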

The test vibe went fine; the model is stable and doesn't get stuck deliberating in the think tag, which seems to be an issue with some quants.

I updated qx64-hi with the latest layer formula from Qwen, and the model size increased to the point that qx64-hi now fits perfectly on a 128GB Mac.
