Hope to see Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF

#2
by jackalxyz - opened

This 27B model is great, but with all parameters active it is too slow for most consumer-grade GPUs to run. It would be really nice if there were a Qwen3.5-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF version too. 🙂

Hi, I totally get your point.

However, the current toolchain isn’t fully mature yet. From my testing, the fine-tuned performance of Qwen3.5-35B-A3B hasn’t been ideal so far.

There are still issues like unstable routing and abnormal expert utilization. It seems that SFT and MoE architectures still have some structural incompatibilities at this stage.
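As a rough illustration of what "abnormal expert utilization" looks like in practice, here is a minimal sketch that computes how often each expert is selected from one layer's router logits. The function name, tensor shapes, and `top_k` value are assumptions for illustration, not the actual fine-tuning code; a healthy router spreads load roughly uniformly, while a collapsed one concentrates tokens on a few experts.

```python
import torch

def expert_utilization(router_logits: torch.Tensor, num_experts: int, top_k: int = 2):
    """Fraction of routing slots assigned to each expert (hypothetical helper).

    router_logits: (num_tokens, num_experts) raw gate scores from one MoE layer.
    Returns a (num_experts,) tensor summing to 1. Values far from 1/num_experts
    indicate the load imbalance described above.
    """
    top = router_logits.topk(top_k, dim=-1).indices              # (num_tokens, top_k)
    counts = torch.bincount(top.flatten(), minlength=num_experts).float()
    return counts / counts.sum()

# Toy check: near-uniform logits should route tokens roughly evenly.
logits = torch.randn(1000, 8) * 1e-3
util = expert_utilization(logits, num_experts=8)
```

Logging a statistic like `util.max()` per layer during SFT is one cheap way to spot routing collapse early.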

Hopefully this will improve as the ecosystem evolves.😔

Yes, I also found the 35B to be very fast, maybe because it is a MoE model. The GGUF version I deployed runs much faster than the 27B GGUF.
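The speed difference above is consistent with the naming convention, assuming "35B-A3B" means 35B total parameters with about 3B active per token (this reading of the name is an assumption). Per-token compute then tracks active parameters, not total:

```python
# Back-of-envelope per-token compute comparison.
# Parameter counts are read off the model names and are assumptions.
dense_active = 27e9              # dense 27B: every parameter touched per token
moe_total, moe_active = 35e9, 3e9  # "35B-A3B": 35B total, ~3B active per token

ratio = dense_active / moe_active
print(f"~{ratio:.0f}x fewer active params per token for the MoE")  # ~9x
```

Memory is a different story: the full 35B must still fit in RAM/VRAM, which is why MoE GGUFs are fast to decode but not small to load.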