Qwen3.5-35B-A3B-heretic

Abliterated version of Qwen/Qwen3.5-35B-A3B using heretic-llm.

This model has had its refusal behavior reduced via optimized activation steering on the attention output projections. It is intended for creative, research, and roleplay use cases where the base model's safety filters are overly restrictive.


Model Details

| Property | Value |
|---|---|
| Base model | Qwen/Qwen3.5-35B-A3B |
| Architecture | Qwen3.5 MoE (Mixture of Experts) |
| Active parameters | ~3B (35B total) |
| Abliteration tool | heretic-llm v1.2.0 |
| Quantization during training | 4-bit (bitsandbytes bnb_4bit) |
| Output weights | bf16 (full precision, merged) |
| Hardware | NVIDIA A100-SXM4-80GB |

Abliteration Details

Heretic performs optimized abliteration by searching for the best combination of parameters across 200 Optuna trials, balancing refusal reduction against coherency preservation (measured by KL divergence).
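The trial selection can be sketched as a constrained search: keep trials whose KL divergence against the base model stays under a budget, then prefer the fewest refusals. This is an illustrative stand-in, not Heretic's actual Optuna objective; the `kl_budget` value and scoring rule are assumptions.

```python
# Illustrative sketch of selecting the best abliteration trial:
# minimize refusals subject to a KL-divergence budget.
# The budget and the selection rule are assumptions for illustration.

def select_best_trial(trials, kl_budget=0.01):
    """Each trial is (trial_id, refusals, kl_divergence).
    Keep trials under the KL budget, then pick the fewest refusals."""
    admissible = [t for t in trials if t[2] <= kl_budget]
    if not admissible:
        return None
    return min(admissible, key=lambda t: t[1])

trials = [
    (12, 74, 0.0031),  # barely abliterated, very coherent
    (63, 67, 0.0078),  # the selected trial from the table below
    (88, 41, 0.0520),  # strong refusal drop, but incoherent (over budget)
]
best = select_best_trial(trials)  # -> trial 63
```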

Optimization results (Trial 63, selected):

| Metric | Value |
|---|---|
| Baseline refusals | 80 / 100 |
| Post-abliteration refusals | 67 / 100 |
| KL divergence | 0.0078 |
| Refusal reduction | ~16% |

Abliteration parameters:

| Parameter | Value |
|---|---|
| direction_index | 25.61 |
| attn.o_proj.max_weight | 1.47 |
| attn.o_proj.max_weight_position | 34.92 |
| attn.o_proj.min_weight | 1.29 |
| attn.o_proj.min_weight_distance | 20.24 |

Target modules: attn.o_proj (attention output projections on full_attention layers only; linear_attention layers were skipped as they use a different mechanism, Qwen3_5MoeGatedDeltaNet)
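The core operation on each targeted o_proj can be sketched in NumPy. This is the generic directional-ablation step (projecting a refusal direction out of the projection's output), not Heretic's exact kernel; the refusal direction `d` and the per-layer weight `w` are stand-ins for what Heretic derives from the max_weight/min_weight schedule above.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_in = 64, 64

# Hypothetical refusal direction: in practice, roughly the difference of
# mean activations on harmful vs. harmless prompts, normalized.
d = rng.normal(size=d_model)
d /= np.linalg.norm(d)

W = rng.normal(size=(d_model, d_in))  # o_proj weight, output dim first

# Ablate: remove the component of every output along d, scaled by a
# per-layer weight w (heretic varies w across layers).
w = 1.0
W_abl = W - w * np.outer(d, d) @ W

# With w = 1, the steered projection can no longer write along d:
x = rng.normal(size=d_in)
assert abs(float(d @ (W_abl @ x))) < 1e-8
```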

Evaluation datasets:

  • Good prompts: mlabonne/harmless_alpaca (400 train, 100 eval)
  • Bad prompts: mlabonne/harmful_behaviors (400 train, 100 eval)
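Refusal counts like the 80/100 and 67/100 figures above are typically produced by a simple substring check against common refusal openers. The marker list and helper below are illustrative assumptions, not Heretic's actual refusal classifier:

```python
# Illustrative refusal counter; the marker list is an assumption.
REFUSAL_MARKERS = (
    "i can't", "i cannot", "i'm sorry", "i am sorry",
    "i won't", "as an ai",
)

def is_refusal(response: str) -> bool:
    head = response.strip().lower()[:80]  # refusals usually open the reply
    return any(m in head for m in REFUSAL_MARKERS)

def refusal_rate(responses):
    return sum(is_refusal(r) for r in responses) / len(responses)

sample = [
    "I'm sorry, but I can't help with that.",
    "Sure! Here is a short story about a dragon...",
    "As an AI, I cannot assist with this request.",
    "Step 1: preheat the oven to 180 C.",
]
```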

Architecture Notes

Qwen3.5-35B-A3B is a hybrid MoE model with two attention layer types:

  • full_attention: standard attention with self_attn / o_proj (abliterated)
  • linear_attention: GatedDeltaNet linear attention (not abliterated, no compatible projection)

This means abliteration was applied to a subset of the 40 transformer layers. The linear attention layers retain their original behavior.
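Selecting the layers to steer reduces to filtering on the config's layer-type map. The sketch below assumes a `layer_types` list like the one Qwen hybrid models expose; the 8-entry pattern is a made-up stand-in for the real 40-layer schedule:

```python
# Illustrative: pick abliteration targets from a hybrid layer map.
# A real map would come from model.config; this short pattern is a
# hypothetical stand-in for the 40-layer model.
layer_types = [
    "linear_attention", "linear_attention", "linear_attention", "full_attention",
    "linear_attention", "linear_attention", "linear_attention", "full_attention",
]

# Only these layers have a self_attn.o_proj to steer; the
# linear_attention (GatedDeltaNet) layers are left untouched.
target_layers = [i for i, t in enumerate(layer_types) if t == "full_attention"]
```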


Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "LeadFootThrottleCock/Qwen3.5-35B-A3B-heretic",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("LeadFootThrottleCock/Qwen3.5-35B-A3B-heretic")
```

For local use, GGUF quantizations (Q4_K_M recommended) can be created with llama.cpp. Note that convert_hf_to_gguf.py only emits full-precision (and Q8_0) outputs; K-quants such as Q4_K_M require a second pass with llama-quantize:

```shell
python convert_hf_to_gguf.py LeadFootThrottleCock/Qwen3.5-35B-A3B-heretic --outtype bf16 --outfile qwen3.5-35b-a3b-heretic-bf16.gguf
llama-quantize qwen3.5-35b-a3b-heretic-bf16.gguf qwen3.5-35b-a3b-heretic-Q4_K_M.gguf Q4_K_M
```

Disclaimer

This model has reduced (but not eliminated) safety filtering. It is intended for adults and legitimate creative/research use cases. The author does not endorse use of this model to cause harm.
