Made catplusplus/Qwen3.5-35B-A3B-heretic-NVFP4 for Blackwell users

#3
by catplusplus - opened

Works well on my NVIDIA Thor Dev Kit and should also be good for DGX Spark and the new consumer cards. Just build the latest vLLM and its dependencies from source: Blackwell is a new arch, and the pip releases won't have support.

https://huggingface.co/catplusplus/Qwen3.5-35B-A3B-heretic-NVFP4
Does anyone have a setup to make heretic+NVFP4 for Qwen3.5-122B-A10B? Or a Dockerfile that sets up patched heretic + llmcompressor, auto-uploads the result to Hugging Face, and shuts down the RunPod pod once done to save money, so I can run it and share the resulting model with everyone? I might get to it... eventually. I am just an Android developer by day.
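
For anyone who wants to pick this up, here is a rough sketch of just the "upload, then shut the pod down" part in Python. It is not a tested pipeline. Assumptions: `huggingface_hub` is installed, an `HF_TOKEN` environment variable holds a write token, the pod has `runpodctl` on its PATH, and RunPod sets `RUNPOD_POD_ID` inside the pod. The names `upload_quantized_model`, `stop_pod_command`, and `run_then_shutdown` are made up for illustration, and the heretic + llmcompressor step is left as a placeholder callable because I don't have a working recipe for it.

```python
import os
import subprocess


def upload_quantized_model(local_dir: str, repo_id: str) -> None:
    """Push a finished model folder to the Hugging Face Hub."""
    # Imported lazily so the shutdown logic below works without the package.
    from huggingface_hub import HfApi

    # Assumption: HF_TOKEN is a write token exported in the pod's environment.
    api = HfApi(token=os.environ["HF_TOKEN"])
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(folder_path=local_dir, repo_id=repo_id, repo_type="model")


def stop_pod_command(pod_id: str) -> list[str]:
    """Build the runpodctl command that stops the given pod."""
    return ["runpodctl", "stop", "pod", pod_id]


def run_then_shutdown(job, env=os.environ, runner=subprocess.run) -> None:
    """Run job(), then stop the pod whether the job succeeded or failed."""
    try:
        job()
    finally:
        # Assumption: RUNPOD_POD_ID is set by RunPod; outside a pod this is a no-op.
        pod_id = env.get("RUNPOD_POD_ID")
        if pod_id:
            runner(stop_pod_command(pod_id), check=False)
```

Usage would look something like `run_then_shutdown(lambda: upload_quantized_model("/workspace/out", "your-name/your-model"))`, with the abliteration and quantization steps added inside the same callable before the upload. The `finally` block is there so a crash halfway through still stops the pod instead of leaving it billing overnight.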
