Qwen3.5-35B-A3B Abliterated (Q4_K_M GGUF)

Multi-layer weight-orthogonalized variant of Qwen/Qwen3.5-35B-A3B with refusal behavior removed. Includes an SAE-refined security expertise control vector.

Files

| File | Size | Description |
|---|---|---|
| `abliterated_Q4_K_M.gguf` | 20 GB | Abliterated model, Q4_K_M quantization |
| `security_cv.gguf` | 298 KB | SAE-refined security expertise control vector |
| `refusal_map.json` | 10 KB | Per-layer refusal signal strengths (40 layers) |

Abliteration Method

  • Approach: Arditi et al. weight orthogonalization (refusal direction removal)
  • Layers abliterated: 12 layers selected by refusal signal strength: [19, 20, 21, 22, 23, 24, 25, 27, 30, 31, 35, 36]
  • Weights modified per layer: attention output projection + 256 expert down_proj + shared expert down_proj (3 tensors/layer, 36 total)
  • Refusal direction extraction: Last-token residual stream activations on harmful vs harmless prompt sets, per-layer mean difference (L2-normalized)
  • Verification: 99.5% reduction in refusal direction projection (1.4936 → 0.0072)
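The per-layer edit amounts to projecting the refusal direction out of each weight matrix that writes into the residual stream. A minimal numpy sketch of both steps, on toy shapes with random data rather than the real model:

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Per-layer mean difference of last-token residual activations, L2-normalized."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def orthogonalize(W, r):
    """Remove the refusal direction from a weight matrix whose output
    lands in the residual stream: W' = (I - r r^T) W."""
    return W - np.outer(r, r) @ W

# toy check (hypothetical shapes, not the model's real dimensions)
rng = np.random.default_rng(0)
harmful = rng.normal(size=(32, 64))
harmless = rng.normal(size=(32, 64))
r = refusal_direction(harmful, harmless)
W = rng.normal(size=(64, 16))
W2 = orthogonalize(W, r)
print(np.abs(r @ W2).max())  # projection onto r is ~0 after the edit
```

After the edit, no input can produce output along `r` through the modified weights, which is what the 99.5% projection-reduction check above measures across the 36 edited tensors.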

Control Vector

The security_cv.gguf is an SAE-refined steering vector trained to push the model toward security domain engagement.

Pipeline: Sparse Autoencoder training (BatchTopK, 16x expansion, k=100) on layers 19/27/35 → feature identification on contrastive security prompts → SAE-filtered steering vector construction → type-aware interpolation across all 40 layers → GGUF export.
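The SAE-filtering step can be sketched as: encode the raw contrastive vector into feature space, keep only the identified security-associated features, and decode back. The names, shapes, ReLU encoder, and random weights below are illustrative assumptions, not the actual training code:

```python
import numpy as np

def sae_filter(v, W_enc, W_dec, keep_idx):
    """Hypothetical SAE filtering: encode a raw steering vector,
    zero all but the selected feature activations, decode back."""
    f = np.maximum(W_enc @ v, 0.0)   # feature activations (ReLU stand-in)
    mask = np.zeros_like(f)
    mask[keep_idx] = 1.0
    return W_dec @ (f * mask)        # reconstruct from kept features only

rng = np.random.default_rng(1)
d, n_feat = 64, 64 * 16              # 16x expansion, as in the card
W_enc = rng.normal(size=(n_feat, d)) / np.sqrt(d)
W_dec = rng.normal(size=(d, n_feat)) / np.sqrt(n_feat)
v_raw = rng.normal(size=d)           # raw contrastive mean-difference vector
v_cv = sae_filter(v_raw, W_enc, W_dec, keep_idx=np.arange(100))
print(v_cv.shape)
```

Filtering through the SAE basis keeps the components of the raw vector that the identified security features explain, discarding unrelated directions that a plain mean-difference vector would carry along.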

Apply with --control-vector-scaled security_cv.gguf:1.0 (strength 0.5–1.5 to taste).

Observations

Spot-checked on 5 security prompts (shellcode, SQLi, ARP spoofing, AMSI bypass, Log4Shell) and 2 general reasoning prompts on an RTX 4090. Small sample; not a formal benchmark.

| Configuration | Security compliance (of 5) | Perplexity |
|---|---|---|
| Original (Q4_K_XL) | 2/5 | 1.0158 |
| Abliterated (Q4_K_M) | 4/5 | 1.0164 |
| Abliterated + CV (1.0) | 5/5 | 1.0164 |

General reasoning quality (calculus, networking) appeared unchanged across all configurations. The CV's primary observed effect is pushing past remaining refusal edge cases when stacked on abliteration. Generation speed (~140 tok/s) was unaffected by the control vector.

Usage

```shell
# Abliterated model only
llama-server -m abliterated_Q4_K_M.gguf -ngl 99

# Abliterated model + security control vector
llama-server -m abliterated_Q4_K_M.gguf \
  --control-vector-scaled security_cv.gguf:1.0 \
  -ngl 99
```
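Once llama-server is up, it exposes an OpenAI-compatible chat endpoint. A minimal stdlib client sketch; the port and path assume llama.cpp defaults:

```python
import json
import urllib.request

def build_request(prompt, temperature=0.7):
    """Build an OpenAI-style chat payload for llama-server's
    /v1/chat/completions endpoint (default port 8080)."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def query(prompt, url="http://localhost:8080/v1/chat/completions"):
    req = urllib.request.Request(
        url,
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# usage (requires a running server):
#   print(query("Explain ARP spoofing at a high level."))
```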

Architecture

  • 40 layers, 256 experts/layer (8 active), 1 shared expert
  • Full attention interval: 4 (layers 3,7,11,15,19,23,27,31,35,39)
  • Remaining layers: DeltaNet (linear attention)
  • Hidden dim: 2048, ~35B total params, ~3B active
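The full-attention schedule above follows directly from the interval:

```python
N_LAYERS, INTERVAL = 40, 4

# full attention every 4th layer (3, 7, 11, ...); the rest use DeltaNet
full_attn = [i for i in range(N_LAYERS) if i % INTERVAL == INTERVAL - 1]
delta_net = [i for i in range(N_LAYERS) if i % INTERVAL != INTERVAL - 1]

print(full_attn)       # [3, 7, 11, 15, 19, 23, 27, 31, 35, 39]
print(len(delta_net))  # 30
```

Note that all three SAE layers listed above (19, 27, 35) fall on full-attention layers.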

Quantization

  • Format: GGUF Q4_K_M (4.88 BPW)
  • Size: ~20GB
  • Target hardware: RTX 4090 (24GB), RTX 5090 (32GB)
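The ~20 GB figure is consistent with the stated bits-per-weight (the small gap is the decimal-vs-binary unit convention):

```python
PARAMS = 35e9
BPW = 4.88  # bits per weight for this Q4_K_M build, as stated above

size_bytes = PARAMS * BPW / 8
print(f"{size_bytes / 1e9:.1f} GB")    # ~21.4 decimal gigabytes
print(f"{size_bytes / 2**30:.1f} GiB") # ~19.9 binary gibibytes
```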