Made NVFP4 version
#3
by catplusplus - opened
https://huggingface.co/catplusplus/Qwen3.5-35B-A3B-heretic-v2-NVFP4 is good for new consumer GPU cards, DGX Spark, or Thor Dev Kit. I'm running it with an FP8 KV cache in vLLM built from git, since the latest stable release doesn't yet support this model.
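
For anyone who wants to try the same setup, here is a minimal sketch of a launch command. It assumes vLLM's `vllm serve` entry point and its `--kv-cache-dtype` option. The helper name `build_serve_command` is made up for illustration, and the exact flags used for the run described above are not stated in the post, so a real launch may need more options (context length, memory limits, and so on).

```python
import shlex

# Model repository named in the post above.
MODEL_ID = "catplusplus/Qwen3.5-35B-A3B-heretic-v2-NVFP4"


def build_serve_command(model_id: str = MODEL_ID,
                        kv_cache_dtype: str = "fp8") -> list[str]:
    """Return the argv for `vllm serve` with the requested KV cache dtype.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return ["vllm", "serve", model_id, "--kv-cache-dtype", kv_cache_dtype]


if __name__ == "__main__":
    # Print the command instead of executing it, so this script runs
    # even on a machine where vLLM is not installed.
    print(shlex.join(build_serve_command()))
```

Paste the printed command into a shell on a machine with a git build of vLLM, since the post notes that the latest stable release doesn't support the model yet.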