Made NVFP4 version

#3
by catplusplus - opened

I made an NVFP4 version: https://huggingface.co/catplusplus/Qwen3.5-35B-A3B-heretic-v2-NVFP4. It should be a good fit for newer consumer GPU cards, DGX Spark, or the Thor Dev Kit. I'm running it with an FP8 KV cache in vLLM built from git, since the latest stable release doesn't support this model yet.
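For anyone who wants to try it, here is a rough sketch of how the launch command could be put together. It assumes a vLLM build from git that provides the `vllm serve` entry point and the `--kv-cache-dtype` flag; the helper name `build_vllm_serve_command` is made up for illustration, and your setup may need extra flags (context length, memory limits, and so on) that are not shown here.

```python
import shlex

# Model repo from this thread.
MODEL_ID = "catplusplus/Qwen3.5-35B-A3B-heretic-v2-NVFP4"


def build_vllm_serve_command(model_id: str = MODEL_ID,
                             kv_cache_dtype: str = "fp8") -> list[str]:
    """Build the argv list for `vllm serve` with an FP8 KV cache.

    Hypothetical helper: it only assembles the command and does not
    check that the installed vLLM build supports the model.
    """
    return ["vllm", "serve", model_id, "--kv-cache-dtype", kv_cache_dtype]


if __name__ == "__main__":
    # Print the command so it can be copied into a shell.
    print(shlex.join(build_vllm_serve_command()))
```

Building the command as a list keeps it easy to pass to `subprocess.run` later without worrying about shell quoting.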
