# Qwen3.5-35B-A3B-heretic
Abliterated version of Qwen/Qwen3.5-35B-A3B using heretic-llm.
This model has had its refusal behavior reduced via optimized activation steering on the attention output projections. It is intended for creative, research, and roleplay use cases where the base model's safety filters are overly restrictive.
## Model Details
| Property | Value |
|---|---|
| Base model | Qwen/Qwen3.5-35B-A3B |
| Architecture | Qwen3.5 MoE (Mixture of Experts) |
| Active parameters | ~3B (35B total) |
| Abliteration tool | heretic-llm v1.2.0 |
| Quantization during training | 4-bit (bitsandbytes bnb_4bit) |
| Output weights | bf16 (full precision, merged) |
| Hardware | NVIDIA A100-SXM4-80GB |
## Abliteration Details
Heretic performs optimized abliteration by searching for the best combination of parameters across 200 Optuna trials, balancing refusal reduction against coherency preservation (measured by KL divergence).
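The search described above can be pictured as a simple trial loop. This is an illustrative sketch only, not heretic's actual code: the sampled parameter, the random trial outcomes, and the scoring formula (refusal count plus a weighted KL penalty) are all assumptions made for the example.

```python
import random

def score(refusals, kl, kl_weight=100.0):
    # Lower is better: reward fewer refusals, but penalize drift
    # from the base model's output distribution (KL divergence).
    # The weighting here is a made-up illustration.
    return refusals + kl_weight * kl

def run_trial(rng):
    # Stand-in for a real trial: sample steering parameters, apply
    # them, then measure refusals/100 and KL on the eval sets.
    # These random draws simply mimic plausible outcomes.
    return {
        "max_weight": rng.uniform(0.5, 2.0),
        "refusals": rng.randint(40, 80),
        "kl": rng.uniform(0.0, 0.1),
    }

rng = random.Random(0)
trials = [run_trial(rng) for _ in range(200)]  # 200 trials, as in the card
best = min(trials, key=lambda t: score(t["refusals"], t["kl"]))
```

In the real run, the selected trial (63) is the one whose parameter combination gave the best refusal/coherency trade-off.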
Optimization results (Trial 63, selected):
| Metric | Value |
|---|---|
| Baseline refusals | 80 / 100 |
| Post-abliteration refusals | 67 / 100 |
| KL divergence | 0.0078 |
| Refusal reduction | ~16% |
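The KL divergence row measures how far the abliterated model's next-token distribution drifted from the base model's; 0.0078 indicates the two models remain nearly identical on ordinary prompts. A minimal sketch of the metric (the two distributions below are made-up illustrative values, not measurements from this model):

```python
import math

def kl_divergence(p, q):
    # KL(P || Q) = sum_i p_i * log(p_i / q_i); 0 means P and Q are identical.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

base = [0.70, 0.20, 0.10]   # base model's next-token probabilities (illustrative)
ablit = [0.68, 0.21, 0.11]  # abliterated model's probabilities (illustrative)
drift = kl_divergence(base, ablit)  # small positive value: distributions nearly match
```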
Abliteration parameters:
| Parameter | Value |
|---|---|
| Parameter | Value |
|---|---|
| direction_index | 25.61 |
| attn.o_proj.max_weight | 1.47 |
| attn.o_proj.max_weight_position | 34.92 |
| attn.o_proj.min_weight | 1.29 |
| attn.o_proj.min_weight_distance | 20.24 |
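The core operation behind these parameters is removing the component of the `o_proj` weights that lies along the learned refusal direction; the min/max weight parameters shape how strongly that removal is applied per layer. A minimal pure-Python sketch of the projection step (the matrix, direction, and uniform `scale` are illustrative; heretic's actual per-layer weighting is more involved):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ablate_matrix(W, d, scale=1.0):
    # Remove the component of each column of W along the (normalized)
    # refusal direction d in the output space:
    #   W'[:, j] = W[:, j] - scale * d * (d . W[:, j])
    norm = dot(d, d) ** 0.5
    d = [x / norm for x in d]
    out = [row[:] for row in W]
    for j in range(len(W[0])):
        col = [W[i][j] for i in range(len(W))]
        c = scale * dot(d, col)
        for i in range(len(W)):
            out[i][j] -= c * d[i]
    return out

# With scale=1.0, the ablated weights produce outputs orthogonal to d.
W_ablated = ablate_matrix([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
```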
Target modules: `attn.o_proj` (attention output projections on `full_attention` layers only; `linear_attention` layers were skipped because they use a different mechanism, `Qwen3_5MoeGatedDeltaNet`)
Evaluation datasets:
- Good prompts: `mlabonne/harmless_alpaca` (400 train, 100 eval)
- Bad prompts: `mlabonne/harmful_behaviors` (400 train, 100 eval)
## Architecture Notes
Qwen3.5-35B-A3B is a hybrid MoE model with two attention layer types:
- `full_attention`: standard attention with `self_attn.o_proj` (abliterated)
- `linear_attention`: GatedDeltaNet linear attention (not abliterated; no compatible projection)
This means abliteration was applied to a subset of the 40 transformer layers. The linear attention layers retain their original behavior.
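Selecting targets therefore means filtering the 40 layers by type before building module paths. A sketch of that filtering step; the 3:1 alternating pattern below is hypothetical (the real layout comes from the model's config), as is the module-path naming:

```python
# Hypothetical layer-type layout: three full_attention layers, then one
# linear_attention layer, repeated across 40 layers.
layer_types = [
    "linear_attention" if (i + 1) % 4 == 0 else "full_attention"
    for i in range(40)
]

# Only full_attention layers expose an o_proj we can abliterate.
target_layers = [i for i, t in enumerate(layer_types) if t == "full_attention"]
target_modules = [f"model.layers.{i}.self_attn.o_proj" for i in target_layers]
```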
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "LeadFootThrottleCock/Qwen3.5-35B-A3B-heretic",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("LeadFootThrottleCock/Qwen3.5-35B-A3B-heretic")
```
For local use, GGUF quantizations (Q4_K_M recommended) can be created with llama.cpp. Note that `convert_hf_to_gguf.py` only emits full- or half-precision (or `q8_0`) GGUFs; K-quants such as Q4_K_M are produced in a second step with `llama-quantize`:

```shell
python convert_hf_to_gguf.py ./Qwen3.5-35B-A3B-heretic --outtype bf16 --outfile heretic-bf16.gguf
./llama-quantize heretic-bf16.gguf heretic-Q4_K_M.gguf Q4_K_M
```

(Here `./Qwen3.5-35B-A3B-heretic` is a local download of the model repository.)
## Disclaimer
This model has reduced (but not eliminated) safety filtering. It is intended for adults and legitimate creative/research use cases. The author does not endorse use of this model to cause harm.