dealign.ai

Qwen 3.5 VL 4B — CRACK Abliterated (4-bit MLX)

Abliterated · No guardrails · Full speed · Vision + Language


What Is This?

This is Qwen 3.5 4B (4-bit quantized for Apple Silicon) with permanent abliteration — safety guardrails have been surgically removed at the weight level.

Qwen 3.5 4B features a unified vision-language architecture with hybrid GatedDeltaNet + full attention layers. No custom model.py. No runtime hooks. Just a standard MLX model at full native speed.
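The card doesn't document the exact abliteration procedure, but "abliteration" generally refers to ablating a model's refusal direction directly in its weight matrices, so no runtime intervention is needed. A conceptual sketch of that idea (hypothetical names; `refusal_dir` is assumed to be precomputed from contrastive harmful/harmless prompt activations — this is not this model's actual code):

```python
import numpy as np

def abliterate(weight: np.ndarray, refusal_dir: np.ndarray) -> np.ndarray:
    """Project the refusal direction out of a weight matrix's output space.

    weight: (d_out, d_in) matrix to modify in place at merge time.
    refusal_dir: vector of shape (d_out,), typically the difference of
    mean activations on refused vs. complied prompts.
    """
    r = refusal_dir / np.linalg.norm(refusal_dir)
    # W' = W - r (r^T W): every output of W' is orthogonal to r,
    # so the "refusal" component can never be written to the stream.
    return weight - np.outer(r, r @ weight)

# Demo with random data: after the edit, outputs carry no component
# along the refusal direction.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
r = rng.normal(size=8)
W_abl = abliterate(W, r)
x = rng.normal(size=16)
print(abs((r / np.linalg.norm(r)) @ (W_abl @ x)))  # ~0.0
```

Because the projection is baked into the weights, the result is an ordinary checkpoint — which is why no custom `model.py` or hooks are needed.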

| Attribute | Details |
|---|---|
| Architecture | Qwen 3.5, 4B dense, hybrid GatedDeltaNet + full attention, unified VL |
| Quantization | 4-bit, group size 64 |
| Size | 3 GB |
| Speed | ~151 tok/s on Mac Studio M3 Ultra |
| Thinking | OFF by default (the 4B model loops with thinking ON) |
| Vision | Built-in (unified early-fusion VL) |
| Abliteration | Permanent weight-level modification |
| Custom files | None needed; works with stock mlx_vlm |
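For context on "4-bit, group size 64": MLX-style quantization stores each run of 64 consecutive weights with its own scale and offset plus 4-bit integer codes, which is what brings the 4B model down to ~3 GB. A simplified numpy sketch of affine group quantization (illustrative only, not MLX's actual packing code):

```python
import numpy as np

def quantize_groups(w: np.ndarray, bits: int = 4, group: int = 64):
    """Affine per-group quantization: each group of `group` consecutive
    values gets its own scale/offset; values become `bits`-bit integers."""
    levels = 2**bits - 1
    g = w.reshape(-1, group)
    lo = g.min(axis=1, keepdims=True)
    hi = g.max(axis=1, keepdims=True)
    scale = (hi - lo) / levels
    scale[scale == 0] = 1.0  # guard constant groups against divide-by-zero
    q = np.clip(np.round((g - lo) / scale), 0, levels).astype(np.uint8)
    return q, scale, lo

def dequantize_groups(q, scale, lo, shape):
    return (q * scale + lo).reshape(shape)

w = np.random.default_rng(1).normal(size=(4, 128)).astype(np.float32)
q, scale, lo = quantize_groups(w)
w_hat = dequantize_groups(q, scale, lo, w.shape)
# Rounding error is bounded by half a quantization step per group.
print(np.abs(w - w_hat).max())
```

Smaller groups track local weight statistics more closely (lower error) at the cost of more per-group metadata; group size 64 is a common middle ground.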

Test Results

8/8 compliance on a standard eight-category evaluation:

| Category | Result |
|---|---|
| Factual knowledge | PASS |
| Code generation | PASS |
| Mathematics | PASS |
| Security | PASS |
| Social engineering | PASS |
| Malware analysis | PASS |
| Exploit development | PASS |
| Red team techniques | PASS |

Usage

```python
from mlx_vlm import load, generate

# Download (if needed) and load the quantized model plus its processor
model, processor = load("dealignai/Qwen3.5-VL-4B-4bit-MLX-CRACK")
tokenizer = processor.tokenizer

# Build the chat prompt with thinking disabled (see the note below)
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Your prompt here"}],
    add_generation_prompt=True,
    tokenize=False,
    enable_thinking=False,
)

result = generate(model, processor, prompt=prompt, max_tokens=2048, temperature=0.7)
print(result.text)
```

Note: Use enable_thinking=False — the 4B model loops when thinking is enabled.

Other Quantizations

| Quant | Size | Speed | Link |
|---|---|---|---|
| 4-bit | 3 GB | ~151 tok/s | Qwen3.5-VL-4B-4bit-MLX-CRACK |
| 8-bit | 5.1 GB | ~107 tok/s | Qwen3.5-VL-4B-8bit-MLX-CRACK |

Requirements

  • Apple Silicon Mac with ≥8 GB unified memory
  • MLX framework + mlx-vlm

Other Models by dealignai

| Model | Description |
|---|---|
| Qwen 3.5 397B REAP-CRACK | 397B MoE abliterated (gated) |
| Qwen 3.5 VL 27B CRACK | 27B dense VL abliterated |
| Qwen 3.5 122B CRACK | 122B MoE VL abliterated |
| MiniMax 172B CRACK | MiniMax M2.5 172B abliterated (gated) |

See our research: Safety Generalization in Frontier MoE Models

Disclaimer

This model has had safety guardrails permanently removed. It will comply with requests that the base model would refuse. Use responsibly and in accordance with applicable laws. The creators are not responsible for any misuse.


Support dealignai

All models are built from original research and published for free. These models are specifically crafted to be excellent coders and general-purpose assistants.

Support us on Ko-fi — check out the Ko-fi membership for early access and extras.

Have questions or need help with a specific model? DM us — we help for free most of the time.

Ko-fi | X @dealignai | dealign.ai
