Request for AWQ Quant of Qwen3.5-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-i1

#4
by celikburak - opened

Hi cyankiwi,
I hope you're doing well.
Would you be able to create AWQ quantizations (especially 4-bit) for this model?
Model:
mradermacher/Qwen3.5-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-i1-GGUF
It's a strong 40B reasoning + creative model (a Claude Opus distill with Deckard-Heretic tuning), and many people are interested in running it with vLLM. Currently only GGUF versions are available, and the existing GPTQ quant is quite heavy (~40 GB) due to hybrid quantization.
AWQ versions would be greatly appreciated by the community.
Thank you in advance for your amazing work!
Best regards,
