# Qwen3.5-35B-A3B-abliterated
This is an abliterated (uncensored) version of Qwen/Qwen3.5-35B-A3B. The model's refusal behavior has been removed using the abliteration technique.
**Warning:** This model is uncensored. Use responsibly and at your own risk.
## GGUF Version
A GGUF quantized version is available at `jiaojjjjje/Qwen3.5-35B-A3B-abliterated-GGUF`.
## Abliteration Details

### Technique

Abliteration works by identifying and removing the "refusal direction" in the model's residual stream:

1. **Phase 1 - Find refusal direction:** Run harmful and harmless prompts through the model, compute the mean difference of hidden states across layers 8-32, then extract the top principal component via SVD as the refusal direction vector.
2. **Phase 2 - Modify weights:** Project the refusal direction out of the weight matrices so the model can no longer activate the refusal behavior. Asymmetric layer tapering is applied to preserve long-text generation stability.
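Phase 1 can be sketched in a few lines of numpy. This is an illustrative reconstruction of the mean-difference + SVD step described above, not the released extraction script; the function name and the shape convention for the collected hidden states are assumptions.

```python
import numpy as np

def refusal_direction(harmful_hidden: np.ndarray, harmless_hidden: np.ndarray) -> np.ndarray:
    """Estimate a unit-norm refusal direction from paired hidden states.

    Both inputs have shape (n_samples, hidden_size), e.g. activations
    collected from layers 8-32 on harmful vs. harmless prompts.
    (Hypothetical helper; the model card does not publish the exact pipeline.)
    """
    # Mean difference between harmful and harmless activations
    mean_diff = harmful_hidden.mean(axis=0) - harmless_hidden.mean(axis=0)

    # Top principal component of the per-sample differences via SVD
    deltas = harmful_hidden - harmless_hidden
    _, _, vt = np.linalg.svd(deltas, full_matrices=False)
    direction = vt[0]

    # Orient the component to agree with the mean difference, then normalize
    if np.dot(direction, mean_diff) < 0:
        direction = -direction
    return direction / np.linalg.norm(direction)
```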
### Architecture-Specific Adaptations

Qwen3.5-35B-A3B is a Mixture-of-Experts (MoE) model:

- 40 transformer layers with mixed attention (linear attention, with full attention every 4th layer)
- 256 experts per layer, 8 active per token
- Hidden size: 2048
- VLM architecture: weight keys use the `model.language_model.layers.{i}` prefix
### Hyperparameters (v9c)

| Parameter | Value | Description |
|---|---|---|
| `alpha` | 2.5 | Write-side projection strength (`out_proj`, `down_proj`) |
| `read_alpha` | 1.5 | Read-side projection strength (`gate_proj`, `up_proj`) |
| `expert_alpha` | 0.2 | MoE expert `down_proj` projection strength |
| Layer range | 0-39 (all) | All 40 layers modified |
| Early layer taper (0-7) | 0.3 | Reduced strength to preserve text-generation stability |
| Core + late layers (8-39) | 1.0 | Full strength for effective uncensoring |
| Total weights modified | 200 | 80 write-side + 80 read-side + 40 MoE expert |
### Asymmetric Layer Tapering

Key innovation: early layers (0-7) receive only 30% of the abliteration strength, while core refusal layers (8-32) and late output layers (33-39) receive full strength. This prevents long-text repetition (tested stable at 11,000+ characters) while maintaining effective uncensoring.
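The taper schedule is simple enough to state as a function. The values come from the hyperparameter table above; the function name itself is illustrative.

```python
def taper_scale(layer_idx: int) -> float:
    """Per-layer abliteration strength multiplier (v9c schedule).

    Early layers 0-7 are tapered to 30% strength to preserve long-text
    generation stability; layers 8-39 (core refusal + late output layers)
    receive full strength.
    """
    return 0.3 if layer_idx < 8 else 1.0
```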
### Weight Modification Strategy

**Write-side** (projects the refusal direction out of the output):

- `self_attn.o_proj` / `linear_attn.out_proj` - attention output projection
- `mlp.shared_expert.down_proj` - shared expert output projection
- Formula: `W_new = W - alpha * scale * (proj @ W)`

**Read-side** (prevents the refusal direction from being read):

- `mlp.shared_expert.gate_proj` - shared expert gating
- `mlp.shared_expert.up_proj` - shared expert up projection
- Formula: `W_new = W - read_alpha * scale * (W @ proj)`

**MoE experts** (3D weight tensors covering all 256 experts):

- `mlp.experts.down_proj` - expert output projections
- Formula: `W_new = W - expert_alpha * scale * einsum('ij,bjk->bik', proj, W)`

where `proj = refusal_dir^T @ refusal_dir` and `scale` is the layer-dependent taper factor.
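The three update formulas can be sketched in numpy. This is an illustrative reimplementation under the stated formulas, not the released abliteration script; `project_out` and `project_out_experts` are hypothetical helper names.

```python
import numpy as np

def project_out(W: np.ndarray, direction: np.ndarray,
                alpha: float, scale: float, side: str) -> np.ndarray:
    """Apply the abliteration update to a 2D weight matrix.

    side="write": W_new = W - alpha * scale * (proj @ W)
    side="read":  W_new = W - alpha * scale * (W @ proj)
    where proj is the rank-1 projector onto the (unit-norm) refusal direction.
    """
    d = direction / np.linalg.norm(direction)
    proj = np.outer(d, d)  # proj = refusal_dir^T @ refusal_dir
    if side == "write":
        return W - alpha * scale * (proj @ W)
    return W - alpha * scale * (W @ proj)

def project_out_experts(W: np.ndarray, direction: np.ndarray,
                        expert_alpha: float, scale: float) -> np.ndarray:
    """Batched update for a 3D expert tensor of shape (num_experts, d_out, d_in):
    W_new = W - expert_alpha * scale * einsum('ij,bjk->bik', proj, W)."""
    d = direction / np.linalg.norm(direction)
    proj = np.outer(d, d)
    return W - expert_alpha * scale * np.einsum('ij,bjk->bik', proj, W)
```

With `alpha = scale = 1.0` the write-side update is an exact projection: the refusal direction's component of the output is removed entirely. Values of `alpha` above 1 (as in the v9c settings) over-project, pushing the weights past the orthogonal complement.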
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "jiaojjjjje/Qwen3.5-35B-A3B-abliterated",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("jiaojjjjje/Qwen3.5-35B-A3B-abliterated")

messages = [{"role": "user", "content": "Hello!"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Disclaimer
This model is provided for research and educational purposes only. The creator is not responsible for any misuse.