exllamav3 quantizations of Qwen/Qwen3.5-397B-A17B, made with commit 144d826 of the dev branch (the version was bumped to v0.2.23 in the following commit).
| Variant   | Quant      | Size       |
|-----------|------------|------------|
| Optimized | 2.08bpw_h6 | 99.272 GiB |
| Straight  | 2.00bpw_h6 | 96.614 GiB |
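As a sanity check on the sizes above, a quant's on-disk footprint can be roughly estimated from the parameter count and the bits-per-weight figure in its name. The sketch below uses the nominal 397B parameter count; it is only a lower-bound estimate, since the actual files are somewhat larger (the `_h6` suffix indicates the output head is kept at 6 bits, and some tensors are stored at higher precision).

```python
def approx_quant_size_gib(n_params: float, bpw: float) -> float:
    """Rough on-disk size of a quantized model:
    parameters * bits-per-weight / 8 bits-per-byte, in GiB."""
    return n_params * bpw / 8 / 2**30

# Qwen3.5-397B-A17B at the 2.08 bpw "optimized" quant.
# Prints ~96.1 GiB -- a lower bound on the listed 99.272 GiB,
# since the 6-bit head and other overhead are not counted here.
print(f"{approx_quant_size_gib(397e9, 2.08):.1f} GiB")
```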
The 3, 4 and 5 bpw quants have been deleted to save space. Another user has uploaded them here: https://huggingface.co/NeuroSenko/Qwen3.5-397B-A17B-exl3
[Catbench results plot]
Base model: Qwen/Qwen3.5-397B-A17B
