Dual-System V2: Geometric Sidecar for Qwen2.5-3B

This is a 182 MB geometric sidecar that attaches to a frozen, abliterated Qwen2.5-3B backbone. The sidecar adds learned corrections through additive logit blending; no base weights are modified.

Key Results

Configuration	ARC-E	ARC-C	HellaSwag	PIQA	WinoGrande	BoolQ	Avg
Baseline Qwen2.5-3B	78.2%	48.0%	71.8%	78.5%	66.9%	73.4%	69.5%
Abliterated	78.2%	47.4%	71.2%	78.0%	66.1%	73.6%	69.1%
Dual System V2	78.0%	47.4%	71.2%	77.8%	66.5%	62.4%	67.2%

Abliteration cost: -0.4 points average accuracy (statistically zero)
Full system: -2.3 points average accuracy (the BoolQ regression is the main driver; excluding BoolQ: -0.5 points)
Refusal: 80% -> 0% (verified with both a formal 5-prompt adversarial evaluation and interactive testing)
VRAM: 3.4 GB peak on an RTX 4060 Ti (bf16)
Speed: ~10 tok/s with sampling
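
The averages and deltas above can be rechecked from the per-benchmark scores in the table. The sketch below is plain arithmetic on those numbers; the helper names (`average`, `delta`) are illustrative only. Note that the reported -2.3 matches the difference of the rounded averages (67.2 - 69.5), while the unrounded difference is -2.25 points.

```python
# Scores copied from the Key Results table (percent accuracy).
BENCHMARKS = ["ARC-E", "ARC-C", "HellaSwag", "PIQA", "WinoGrande", "BoolQ"]
SCORES = {
    "Baseline Qwen2.5-3B": [78.2, 48.0, 71.8, 78.5, 66.9, 73.4],
    "Abliterated": [78.2, 47.4, 71.2, 78.0, 66.1, 73.6],
    "Dual System V2": [78.0, 47.4, 71.2, 77.8, 66.5, 62.4],
}
BASELINE = "Baseline Qwen2.5-3B"


def average(config, exclude=()):
    """Mean accuracy for one configuration, optionally skipping some benchmarks."""
    kept = [s for name, s in zip(BENCHMARKS, SCORES[config]) if name not in exclude]
    return sum(kept) / len(kept)


def delta(config, exclude=()):
    """Difference in mean accuracy versus the baseline, in percentage points."""
    return average(config, exclude) - average(BASELINE, exclude)


if __name__ == "__main__":
    for name in SCORES:
        print(f"{name}: avg={average(name):.2f}, "
              f"delta={delta(name):+.2f}, "
              f"delta without BoolQ={delta(name, exclude=('BoolQ',)):+.2f}")
```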

Discovery: The Refusal Re-Injection Trap

Configuration	Refusal Rate
Abliterated backbone alone	0%
Censored-backbone sidecars on abliterated backbone	60%
Abliterated-backbone sidecars (correct order)	0%

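If you train a sidecar or adapter on a censored model, the adapter learns the refusal subspace along with everything else. Attaching it to an abliterated backbone then re-injects the refusals: in the table above, the refusal rate climbs from 0% back to 60%.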

Rule: Always abliterate FIRST, then train sidecars on the already-uncensored backbone.
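
The exact refusal metric is not spelled out here, so the following is only a rough sketch of how a refusal rate over a small prompt set could be computed. It assumes a simple keyword match on the generated responses; the marker list and the function names are hypothetical, not the evaluation actually used. With 5 prompts, each refusal moves the rate by 20 points, which is consistent with the 80%, 60%, and 0% figures.

```python
# Hypothetical refusal markers; a real evaluation may use a different detector.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "as an ai")


def is_refusal(response, markers=REFUSAL_MARKERS):
    """Return True if the response contains any refusal marker (case-insensitive)."""
    # Normalize curly apostrophes so "can’t" and "can't" match the same marker.
    text = response.replace("\u2019", "'").lower()
    return any(marker in text for marker in markers)


def refusal_rate(responses):
    """Fraction of responses flagged as refusals."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(is_refusal(r) for r in responses) / len(responses)
```

Running the same prompts through each configuration (backbone alone, backbone plus each sidecar) and comparing the rates is one way to detect the re-injection effect before shipping a sidecar.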

Architecture

Frozen Backbone (3B) --> GeometricProcessor (4L transformer) --> geo_logits
          |
          +---> base_logits + a * geo_logits = final_logits
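
A minimal sketch of the blend in the diagram, assuming the sidecar reads the backbone's final hidden states and that `a` is a scalar weight (called `alpha` below). `ToySidecar`, its layer sizes, and the causal mask are illustrative assumptions and do not reproduce the contents of dual_system_v2.py.

```python
import torch
import torch.nn as nn


class ToySidecar(nn.Module):
    """Hypothetical stand-in for the GeometricProcessor: 4 transformer layers plus a vocab head."""

    def __init__(self, hidden_size, vocab_size, num_layers=4, num_heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_size, nhead=num_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(hidden_size, vocab_size)

    def forward(self, hidden_states):
        # hidden_states: (batch, seq, hidden), assumed to come from the frozen backbone.
        seq_len = hidden_states.size(1)
        # Causal mask: -inf above the diagonal so a position cannot attend to later tokens.
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=hidden_states.device),
            diagonal=1,
        )
        return self.head(self.encoder(hidden_states, mask=mask))


def blend_logits(base_logits, geo_logits, alpha):
    """final_logits = base_logits + alpha * geo_logits; the backbone is never modified."""
    return base_logits + alpha * geo_logits
```

With `alpha = 0` the blend returns the backbone's logits unchanged, which makes it easy to check that the sidecar is purely additive.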

Files

sidecar_step500.pt - Trained sidecar checkpoint (182MB)
dual_system_v2.py - Core architecture
reproduce.ipynb - One-click reproduction notebook
play.ipynb - Interactive playground notebook

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

backbone = AutoModelForCausalLM.from_pretrained(
    "Bender1011001/Qwen2.5-3B-Instruct-ABLITERATED",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Bender1011001/Qwen2.5-3B-Instruct-ABLITERATED"
)

# Load sidecar_step500.pt on top of the backbone for the full Dual System;
# see reproduce.ipynb for the full walkthrough.
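
The internal layout of sidecar_step500.pt is not documented here, so the sketch below only assumes it is a (possibly nested) dict of tensors that `torch.load` can read. It lists the tensors and totals their size as a sanity check against the stated 182MB; `summarize_tensors` and `total_megabytes` are hypothetical helper names. Wiring the sidecar into generation is covered by reproduce.ipynb.

```python
import os

import torch


def summarize_tensors(obj, prefix=""):
    """Recursively collect (name, numel, nbytes) for every tensor in a nested dict."""
    rows = []
    if torch.is_tensor(obj):
        rows.append((prefix, obj.numel(), obj.numel() * obj.element_size()))
    elif isinstance(obj, dict):
        for key, value in obj.items():
            name = f"{prefix}.{key}" if prefix else str(key)
            rows.extend(summarize_tensors(value, name))
    return rows  # non-tensor leaves (step counters, strings) are skipped


def total_megabytes(rows):
    """Total tensor payload in MB (10^6 bytes)."""
    return sum(nbytes for _, _, nbytes in rows) / 1e6


path = "sidecar_step500.pt"
if os.path.exists(path):
    # weights_only=True restricts unpickling to tensors and basic containers;
    # it may need relaxing if the checkpoint stores other objects (assumption).
    checkpoint = torch.load(path, map_location="cpu", weights_only=True)
    rows = summarize_tensors(checkpoint)
    print(f"{len(rows)} tensors, {total_megabytes(rows):.1f} MB")
```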

License

Apache 2.0

Model tree for Bender1011001/Qwen2.5-3B-DualSystem-V2

Base model: Qwen/Qwen2.5-3B
Finetuned: Qwen/Qwen2.5-3B-Instruct
Finetuned: this model
