Qwen3.5-35B-A3B-heretic-Reasoning

A reasoning-enhanced, abliterated version of Qwen3.5-35B-A3B (35B total / 3B active parameters, Mixture of Experts). This model was built in two stages: first, censorship removal via directional ablation using Heretic, then supervised fine-tuning on high-quality Chain-of-Thought reasoning traces distilled from Claude 4.6 Opus.

The model produces structured reasoning within <think>...</think> tags before delivering final responses. All weights are in bf16 precision.

Model Introduction

This model is a fine-tuned derivative of Jongsim/Qwen3.5-35B-A3B-heretic, which itself is an abliterated (decensored) version of Qwen/Qwen3.5-35B-A3B.

The primary objective is to inject high-density structured reasoning capability from Claude 4.6 Opus while preserving the uncensored nature of the abliterated base model. Through SFT on curated reasoning distillation data, the model learns to decompose complex problems into sequential steps within a dedicated thinking block before generating the final answer.

Architecture Overview

| Property | Value |
| --- | --- |
| Architecture | Qwen3.5 MoE (Gated DeltaNet + Gated Attention + MoE) |
| Total Parameters | 35B |
| Active Parameters | 3B per token |
| Hidden Dimension | 2048 |
| Layers | 40 (10 repeating blocks of 3x DeltaNet-MoE + 1x Attention-MoE) |
| Experts | 256 total, 8 routed + 1 shared active |
| Expert Intermediate Dim | 512 |
| Context Length | 262,144 tokens (native) |
| Precision | bf16 |
| Vocabulary | 248,320 tokens |

Training Pipeline

```
Qwen/Qwen3.5-35B-A3B (original)
 |
 | Heretic v1.2.0 (SOMA + MPOA abliteration)
 v
Jongsim/Qwen3.5-35B-A3B-heretic (abliterated base)
 |
 | Supervised Fine-Tuning (LoRA + Unsloth)
 v
Jongsim/Qwen3.5-35B-A3B-heretic-Reasoning (this model)
```

Stage 1: Abliteration (Censorship Removal)

The base model was processed with Heretic v1.2.0, an automated censorship removal tool that applies directional ablation optimized via Bayesian hyperparameter search (Optuna TPE).

Two techniques were combined:

  • SOMA (Self-Organizing Map Abliteration): Uses a 4x4 SOM to discover multiple refusal directions in activation space, then ablates the top-k directions simultaneously.
  • MPOA (Magnitude-Preserving Orthogonal Ablation): Projects out the refusal direction while preserving the original weight magnitude via row normalization with low-rank correction (rank 4).
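The core operation behind magnitude-preserving ablation can be illustrated with a toy projection (a minimal pure-Python sketch, not Heretic's actual implementation): each weight row has its component along the refusal direction removed, then is rescaled back to its original L2 norm.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def ablate_row(row, direction):
    """Remove the component of `row` along unit vector `direction`,
    then rescale to the row's original L2 norm (magnitude-preserving).
    Assumes `row` is not parallel to `direction`."""
    original_norm = norm(row)
    coeff = dot(row, direction)
    projected = [r - coeff * d for r, d in zip(row, direction)]
    scale = original_norm / norm(projected)
    return [p * scale for p in projected]

# Toy refusal direction (unit vector) and one weight row.
direction = [1.0, 0.0, 0.0]
row = [3.0, 4.0, 0.0]
new_row = ablate_row(row, direction)
print(new_row)                  # [0.0, 5.0, 0.0]: component along `direction` is gone
print(round(norm(new_row), 6))  # 5.0: original magnitude preserved
```

The real method operates on full weight matrices with multiple SOM-derived directions and a rank-4 correction, but the invariant is the same: zero projection onto the refusal direction, unchanged row magnitudes.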

Abliteration Configuration

| Parameter | Value |
| --- | --- |
| Method | SOMA + MPOA |
| Orthogonalize Direction | true |
| Row Normalization | full |
| Full Normalization LoRA Rank | 4 |
| Winsorization Quantile | 0.95 |
| SOM Grid | 4 x 4 (16 neurons) |
| SOM Iterations | 10,000 |
| SOM Learning Rate | 0.01 |
| SOM Sigma | 0.5 |
| SOM k (directions) | 4 |
| Optimization Trials | 200 (60 startup) |
| Selected Trial | Trial 84 / 200 |
| Good Prompts | mlabonne/harmless_alpaca (train[:400]) |
| Bad Prompts | mlabonne/harmful_behaviors (train[:400]) |
| Quantization | none (bf16) |

Abliteration Results

| Metric | Original | Abliterated |
| --- | --- | --- |
| KL Divergence | 0 (reference) | 0.0638 |
| Refusals (out of 100) | 91 | 6 |

93.4% refusal reduction (91 → 6 out of 100) with minimal distribution shift (KL = 0.0638).
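The KL figure compares the abliterated model's next-token distribution against the original's on harmless prompts. A minimal sketch of discrete KL divergence over toy token probabilities (illustrative only; Heretic's exact evaluation procedure may differ):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

original    = [0.70, 0.20, 0.10]  # toy next-token probabilities
abliterated = [0.65, 0.25, 0.10]  # slightly shifted after ablation

print(kl_divergence(original, original))     # 0.0 (the reference case)
print(kl_divergence(original, abliterated))  # small positive value
```

A KL of 0.0638 against the original model indicates the ablation changed the output distribution only slightly on benign inputs.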

Stage 2: Supervised Fine-Tuning (Reasoning Distillation)

Objective

Inject structured Chain-of-Thought reasoning patterns from Claude 4.6 Opus into the abliterated model. The training enforces a strict output format where the model generates internal reasoning within <think> blocks before producing the final response.

Training Strategy

  • Framework: Unsloth 2026.3.3 + TRL SFTTrainer
  • Method: LoRA (Low-Rank Adaptation) applied to both attention and MoE expert layers
  • Loss Computation: train_on_responses_only — loss is calculated exclusively on assistant responses (both thinking trace and final answer), not on user prompts
    • Instruction boundary: `<|im_start|>user\n`
    • Response boundary: `<|im_start|>assistant\n<think>`
  • Chat Template: Qwen ChatML format (`<|im_start|>` / `<|im_end|>`)
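What train_on_responses_only does with these boundaries can be sketched in pure Python on string tokens (the real implementation in TRL/Unsloth operates on token IDs, and the actual markers include the newlines shown above):

```python
IGNORE_INDEX = -100  # standard "ignore" label for cross-entropy loss in PyTorch

def mask_labels(tokens, instruction_marker, response_marker):
    """Set labels to IGNORE_INDEX everywhere except inside assistant responses,
    so prompts contribute no loss. Toy version on string tokens."""
    labels = []
    in_response = False
    for tok in tokens:
        if tok == instruction_marker:
            in_response = False
        elif tok == response_marker:
            in_response = True
            labels.append(IGNORE_INDEX)  # the boundary marker itself carries no loss
            continue
        labels.append(tok if in_response else IGNORE_INDEX)
    return labels

tokens = ["<|im_start|>user", "What", "is", "2+2?",
          "<|im_start|>assistant<think>", "2+2", "=", "4"]
labels = mask_labels(tokens, "<|im_start|>user", "<|im_start|>assistant<think>")
print(labels)  # [-100, -100, -100, -100, -100, '2+2', '=', '4']
```

Only the assistant's thinking trace and final answer receive supervision; everything before the response boundary is masked out.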

LoRA Configuration

| Parameter | Value |
| --- | --- |
| PEFT Method | LoRA |
| Rank (r) | 16 |
| Alpha | 32 (= 2 x rank) |
| Dropout | 0.0 |
| Bias | none |
| Target Modules (Attention) | q_proj, k_proj, v_proj, o_proj |
| Target Modules (FFN) | gate_proj, up_proj, down_proj |
| Target Modules (MoE) | gate_up_proj |
| Gradient Checkpointing | unsloth mode |
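With this configuration, each targeted weight receives the low-rank update W' = W + (alpha / r) · B · A, i.e. a scaling factor of 32 / 16 = 2.0. A minimal pure-Python sketch of the update (toy rank-1 matrices standing in for r = 16):

```python
def matmul(X, Y):
    """Naive matrix multiply for small lists-of-lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_delta(B, A, rank, alpha):
    """Low-rank weight update (alpha / rank) * B @ A."""
    scale = alpha / rank
    return [[scale * x for x in row] for row in matmul(B, A)]

# B is (out_dim x rank), A is (rank x in_dim); here rank = 1 for readability,
# but alpha / rank = 2.0 matches the table's 32 / 16.
B = [[1.0], [0.0]]
A = [[0.5, 0.5]]
delta = lora_delta(B, A, rank=1, alpha=2)
print(delta)  # [[1.0, 1.0], [0.0, 0.0]]
```

Because only B and A are trained, the adapter touches a tiny fraction of the 35B parameters while still reaching all attention and MoE expert projections listed above.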

Training Hyperparameters

| Parameter | Value |
| --- | --- |
| Max Sequence Length | 2,048 |
| Per-Device Batch Size | 1 |
| Gradient Accumulation Steps | 8 |
| Effective Batch Size | 8 |
| Number of Epochs | 5 |
| Total Training Steps | 1,995 |
| Learning Rate | 2e-4 |
| LR Scheduler | Linear decay |
| Warmup Steps | 5 |
| Optimizer | AdamW 8-bit |
| Weight Decay | 0.001 |
| Precision | bf16 |
| Seed | 3407 |
| Total FLOPs | 3.56 x 10^18 |
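The step count is consistent with the dataset size and effective batch: ceil(3,191 / 8) = 399 optimizer steps per epoch, times 5 epochs, gives 1,995 total steps. As a quick arithmetic check:

```python
import math

samples = 3191           # combined dataset size after filtering (see Datasets below)
effective_batch = 1 * 8  # per-device batch 1 x gradient accumulation 8
epochs = 5

steps_per_epoch = math.ceil(samples / effective_batch)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 399 1995
```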

Datasets

Three publicly available reasoning distillation datasets were combined, shuffled (seed=42), and used for training:

| Dataset | Samples | Description |
| --- | --- | --- |
| nohurry/Opus-4.6-Reasoning-3000x-filtered | ~2,308 | Filtered reasoning trajectories from Claude 4.6 Opus. Each sample contains a problem, a detailed thinking trace, and a final solution. |
| TeichAI/claude-4.5-opus-high-reasoning-250x | ~250 | High-intensity structured reasoning instances from Claude 4.5 Opus multi-turn conversations. |
| Jackrong/Qwen3.5-reasoning-700x | ~633 | Curated reasoning samples in both conversation and instruction format, designed for step-by-step problem-solving diversity. |
| Total | ~3,191 | Combined after filtering empty/invalid rows. |
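The combine step can be sketched in pure Python (a stand-in for illustration; the actual pipeline presumably used the `datasets` library's `concatenate_datasets` and `shuffle(seed=42)`):

```python
import random

def combine_and_shuffle(*dataset_splits, seed=42):
    """Drop empty rows, concatenate all splits, and shuffle deterministically."""
    combined = [row for split in dataset_splits for row in split if row]
    random.Random(seed).shuffle(combined)
    return combined

# Toy stand-ins for the three source datasets.
opus     = [{"q": "p1", "think": "...", "a": "s1"}]
teich    = [{"q": "p2", "think": "...", "a": "s2"}]
jackrong = [{"q": "p3", "think": "...", "a": "s3"}, {}]  # empty row gets filtered

train = combine_and_shuffle(opus, teich, jackrong)
print(len(train))  # 3: the empty row was dropped
```

Fixing the shuffle seed makes the training order reproducible across runs.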

Training Loss

| Epoch | Avg Loss | Steps |
| --- | --- | --- |
| 1 | 0.4299 | 79 - 399 |
| 2 | 0.3729 | 400 - 798 |
| 3 | 0.3359 | 799 - 1197 |
| 4 | 0.3059 | 1198 - 1596 |
| 5 | 0.2958 | 1597 - 1995 |

Training loss decreased monotonically from 0.4299 to 0.2958 over five epochs, indicating stable convergence with no sign of overfitting at the loss level.

Checkpoint Selection

The best checkpoint was selected based on GSM8K accuracy (50 samples). All checkpoints were evaluated in isolated subprocesses to prevent GPU memory leaks from Unsloth's model patching.

| Checkpoint | Epoch | GSM8K Accuracy |
| --- | --- | --- |
| checkpoint-1200 | 3.0 | 8.0% (4/50) |
| checkpoint-1400 | 3.5 | 10.0% (5/50) |
| checkpoint-1596 | 4.0 | 10.0% (5/50) |
| checkpoint-1995 | 5.0 | 12.0% (6/50) |

checkpoint-1995 (epoch 5) was selected and merged into bf16 for the final release.

Note: GSM8K measures narrow arithmetic reasoning and does not fully reflect the model's broader reasoning capabilities (code generation, logical analysis, multi-step planning), which are the primary targets of the distillation training.
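The selection loop can be sketched as follows. This is a simplified stand-in: the child process here only echoes a placeholder accuracy rather than actually loading a checkpoint and scoring GSM8K, but the subprocess-isolation pattern is the same.

```python
import json
import subprocess
import sys

def eval_in_subprocess(checkpoint: str) -> float:
    """Score one checkpoint in an isolated child process so that GPU and
    library state from one evaluation cannot leak into the next. A real
    evaluator would load the checkpoint and score 50 GSM8K samples."""
    child_code = (
        "import json, sys; "
        "print(json.dumps({'checkpoint': sys.argv[1], 'accuracy': 0.0}))"
    )
    out = subprocess.run(
        [sys.executable, "-c", child_code, checkpoint],
        capture_output=True, text=True, check=True,
    )
    return json.loads(out.stdout)["accuracy"]

# Accuracies reported in the table above; argmax selects the released checkpoint.
scores = {
    "checkpoint-1200": 0.08,
    "checkpoint-1400": 0.10,
    "checkpoint-1596": 0.10,
    "checkpoint-1995": 0.12,
}
best = max(scores, key=scores.get)
print(best)  # checkpoint-1995
```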

Hardware and Environment

| Component | Value |
| --- | --- |
| Hardware | NVIDIA DGX Spark |
| GPU | NVIDIA GB10 (128GB unified memory) |
| Compute Capability | sm121 |
| Architecture | aarch64 |
| CUDA | 13.0 |
| PyTorch | 2.9.1a0 |
| Transformers | 5.2.0 |
| Unsloth | 2026.3.3 |
| TRL | 0.24.0 |
| PEFT | 0.18.1 |
| Datasets | 4.3.0 |
| Tokenizers | 0.22.2 |

DGX Spark-Specific Notes

  • Flash Attention and Memory-Efficient Attention (cutlass) are disabled due to sm121 incompatibility (supported: sm80-sm100). Only Math SDP is used.
  • flash_attn package is fully removed to prevent FATAL errors on sm121.
  • torch.compile / TorchInductor is disabled due to Triton ptxas compatibility issues.
  • The entire model (35B parameters) fits in a single GPU's 128GB unified memory without quantization.

Usage

This model uses the standard Qwen3.5 chat template. It operates in thinking mode by default.

Inference Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Jongsim/Qwen3.5-35B-A3B-heretic-Reasoning"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

messages = [
    {"role": "user", "content": "Explain the difference between TCP and UDP, and when you would choose one over the other."}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8192, temperature=0.7, top_p=0.8, top_k=20)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
```

Recommended Sampling Parameters

| Mode | temperature | top_p | top_k | presence_penalty |
| --- | --- | --- | --- | --- |
| Thinking (general) | 1.0 | 0.95 | 20 | 1.5 |
| Thinking (coding) | 0.6 | 0.95 | 20 | 0.0 |
| Non-thinking (general) | 0.7 | 0.8 | 20 | 1.5 |
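These presets can be kept as a small lookup (a hypothetical helper, not part of the model's code). Note that `transformers`' `generate` exposes `repetition_penalty` rather than `presence_penalty`; `presence_penalty` applies when serving through an OpenAI-compatible backend such as vLLM.

```python
# Sampling presets from the table above, keyed by usage mode.
SAMPLING_PRESETS = {
    "thinking_general":     {"temperature": 1.0, "top_p": 0.95, "top_k": 20, "presence_penalty": 1.5},
    "thinking_coding":      {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "presence_penalty": 0.0},
    "non_thinking_general": {"temperature": 0.7, "top_p": 0.8,  "top_k": 20, "presence_penalty": 1.5},
}

def preset_for(mode: str) -> dict:
    """Return a copy of the sampling preset for the given mode."""
    return dict(SAMPLING_PRESETS[mode])

print(preset_for("thinking_coding")["temperature"])  # 0.6
```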

Example of Learned Reasoning Format

The model produces output in the following structure:

```
<think>
Let me analyze this problem step by step.

1. First, I need to identify the core question being asked.
2. Then, I'll consider the relevant constraints and conditions.
3. Next, I'll work through the logic systematically.
4. Finally, I'll verify my reasoning for consistency.

[detailed reasoning follows...]
</think>

[final answer here]
```
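Downstream code often needs to separate the reasoning trace from the final answer. A minimal parsing sketch (assumes at most one well-formed <think> block, as the training format enforces):

```python
import re

def split_thinking(text):
    """Split a model response into (reasoning_trace, final_answer).
    Returns an empty trace if no <think> block is present."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

response = "<think>\n1. Identify the question.\n2. Verify.\n</think>\nThe answer is 4."
trace, answer = split_thinking(response)
print(answer)  # The answer is 4.
```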

This structured thinking pattern, distilled from Claude 4.6 Opus interactions, reduces redundant cognitive loops while preserving deep analytical capacity.

Limitations

  • Hallucination Risk: As an autoregressive language model, the model may generate plausible-sounding but factually incorrect statements, particularly regarding real-world events or obscure technical details.
  • GSM8K Performance: The model scores 12% on GSM8K (50 samples). This is expected because the training data emphasizes broad reasoning patterns (code, logic, planning) rather than arithmetic drill. For pure math benchmarks, consider models specifically trained on mathematical datasets.
  • Abliteration Residual: 6 out of 100 harmful prompts still trigger refusal. The abliteration is not exhaustive.
  • Context Length Trade-off: While the architecture supports 262K tokens natively, the SFT was performed with max_seq_length=2048. Very long reasoning chains beyond the training distribution may degrade in quality.
  • MoE Inference Overhead: Despite having only 3B active parameters per token, the full 35B model must be loaded into memory. Minimum ~65GB VRAM/RAM required for bf16.

Acknowledgements

  • Qwen Team for the Qwen3.5-35B-A3B architecture and pretrained weights
  • Heretic (p-e-w) for the automated directional ablation framework
  • Unsloth AI for efficient LoRA fine-tuning of large MoE models
  • nohurry, TeichAI, and Jackrong for the reasoning distillation datasets

Citation

```bibtex
@misc{jongsim_qwen35_heretic_reasoning,
  title        = {Qwen3.5-35B-A3B-heretic-Reasoning},
  author       = {Jongsim},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Jongsim/Qwen3.5-35B-A3B-heretic-Reasoning}}
}
```