Qwen3.5-35B-A3B-Uncensored-Aggressive-NVFP4

NVFP4-quantized version of Li101/Qwen3.5-35B-A3B-Uncensored-Aggressive-safetensors, the "Aggressive" uncensored variant of Qwen/Qwen3.5-35B-A3B. Includes the full vision encoder weights in BF16 (unquantized).

Quantization Details

Detail                       Value
Method                       NVFP4 (compressed-tensors)
Model size                   ~23 GB (vs. ~67 GB in BF16)
Language model               NVFP4 quantized
Visual encoder               BF16 (unquantized; 333 tensors, 0.89 GB)
Excluded from quantization   lm_head, MoE gates, shared expert gates, linear attention layers, visual encoder
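The ~23 GB figure is roughly what the NVFP4 format predicts: 4-bit weights with one FP8 (1-byte) scale per 16-value block, while excluded modules stay in BF16 at 2 bytes per parameter. A back-of-the-envelope sketch (the parameter split between quantized and excluded modules below is an illustrative assumption, not read from the checkpoint):

```python
# Rough NVFP4 size estimate for a ~35B-parameter model.
# The BF16/NVFP4 parameter split is an assumption for illustration only.
TOTAL_PARAMS = 35e9
BF16_PARAMS = 2.0e9  # assumed: vision encoder, lm_head, gates, other excluded modules
NVFP4_PARAMS = TOTAL_PARAMS - BF16_PARAMS

# NVFP4: 4-bit weights plus one FP8 (1-byte) scale per 16-value block.
bytes_per_nvfp4_param = 0.5 + 1 / 16
nvfp4_gb = NVFP4_PARAMS * bytes_per_nvfp4_param / 1e9
bf16_gb = BF16_PARAMS * 2 / 1e9  # BF16 = 2 bytes per parameter

print(f"NVFP4 part: ~{nvfp4_gb:.1f} GB, BF16 part: ~{bf16_gb:.1f} GB, "
      f"total: ~{nvfp4_gb + bf16_gb:.1f} GB")
```

With these assumed numbers the estimate lands near the reported ~23 GB; the same 35B parameters at 2 bytes each give ~70 GB, consistent with the ~67 GB BF16 figure once GB/GiB rounding is accounted for.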

Calibration

Usage with vLLM

Support for this model requires a recent nightly build of vLLM.

vllm serve Li101/Qwen3.5-35B-A3B-Uncensored-Aggressive-NVFP4 \
    --kv-cache-dtype fp8 \
    --reasoning-parser qwen3 \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder

Specs

Same architecture and capabilities as Qwen/Qwen3.5-35B-A3B, including vision (image/video understanding).

Credits

  • Uncensoring by HauhauCS
  • Base model by Qwen
  • NVFP4 quantization and deployment by Li101