Qwen3.5-4B Uncensored β€” MNN Format

This is an MNN-converted version of huihui-ai/Huihui-Qwen3.5-4B-abliterated for on-device mobile inference.

All credit for the abliteration work goes to huihui-ai. We only performed the MNN conversion and quantization for mobile deployment.

What is this?

  • Base model: Qwen/Qwen3.5-4B by Alibaba
  • Abliteration by: huihui-ai, which removes refusal behavior via orthogonal projection of the refusal direction (FailSpy's technique)
  • MNN conversion by: darkmaniac7, with 4-bit quantization (block size 128) for mobile GPU/CPU inference
  • Purpose: On-device roleplay, creative fiction, and mature content without refusal
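
The 4-bit, block-128 scheme mentioned above can be pictured as follows. This is an illustrative sketch of blockwise symmetric quantization in NumPy, not MNN's actual kernel or storage format; the function names are hypothetical.

```python
import numpy as np

def quantize_4bit(weights, block_size=128):
    """Blockwise symmetric 4-bit quantization: one scale per block of 128 weights."""
    w = weights.reshape(-1, block_size)
    # Map the largest magnitude in each block to the int4 extreme (7).
    scales = np.abs(w).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_4bit(q, scales):
    """Recover approximate float weights from int4 codes and per-block scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, s = quantize_4bit(w)
w_hat = dequantize_4bit(q, s)
err = np.abs(w - w_hat).max()  # worst-case error is bounded by half a block scale
```

Per-block scaling is why the block size matters: smaller blocks track local weight magnitudes more closely (lower error) at the cost of storing more scales.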

Model Details

| Property | Value |
|---|---|
| Architecture | Qwen3.5 (LinearAttention) |
| Parameters | 4B |
| Quantization | 4-bit (block size 128) |
| Format | MNN (Alibaba Mobile Neural Network) |
| Size on disk | ~2.5 GB |
| Backend | CPU (auto-routed; LinearAttention runs faster on CPU than on OpenCL) |

Performance (measured on-device)

| Device | SoC | Backend | Decode tok/s |
|---|---|---|---|
| RedMagic 11 Pro | SM8850 (SD 8 Elite 2) | CPU | 17.7–19.8 |
| Samsung S26 Ultra | SM8850 | CPU | ~18–20 |
| Samsung S24 Ultra | SM8650 (SD 8 Gen 3) | CPU | ~14 |
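
The decode rates above translate directly into reply latency: tokens divided by tokens-per-second gives seconds of streaming (ignoring prefill). A quick sanity calculation:

```python
def reply_seconds(num_tokens, decode_tps):
    """Seconds to stream a reply at a steady decode rate (prefill time excluded)."""
    return num_tokens / decode_tps

# A ~256-token reply at ~18 tok/s (SM8850-class CPU) takes roughly 14 seconds.
print(round(reply_seconds(256, 18.0), 1))
```
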

Usage

This model is designed for TokForge, an offline Android AI chat app. It can also be used with any MNN-compatible runtime.

TokForge (Android)

Models → Recommended → Roleplay → "Qwen3.5 4B Uncensored" → Download

Manual

Download all files into a single directory, then load the model with MNN's llm_demo tool or the MNN Transformer API.

Limitations and Intended Use

  • Intended for TokForge / MNN mobile inference and local roleplay-style use.
  • Backend behavior differs from classic Qwen3 because Qwen3.5 uses LinearAttention.
  • Device performance varies significantly across SoCs and CPU/GPU routing.
  • This repo is a mobile runtime/export artifact, not a standard Transformers release.

Files

| File | Size | Description |
|---|---|---|
| llm.mnn | 3.5 MB | Model graph |
| llm.mnn.weight | 2.3 GB | 4-bit quantized weights |
| llm_config.json | 8 KB | Model configuration |
| tokenizer.txt | 2.9 MB | Tokenizer vocabulary |
| config.json | 342 B | HuggingFace config |
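
For manual setup it helps to sanity-check a download against the table above before loading, since a truncated llm.mnn.weight is the most common failure. A minimal sketch; the size floors are rough assumptions and check_model_dir is a hypothetical helper, not part of MNN:

```python
from pathlib import Path

# Expected files from the table above; sizes are loose lower bounds (assumptions).
EXPECTED = {
    "llm.mnn": 1_000_000,             # ~3.5 MB model graph
    "llm.mnn.weight": 2_000_000_000,  # ~2.3 GB 4-bit weights
    "llm_config.json": 1_000,         # ~8 KB model configuration
    "tokenizer.txt": 1_000_000,       # ~2.9 MB vocabulary
    "config.json": 100,               # ~342 B HuggingFace config
}

def check_model_dir(model_dir):
    """Return a list of problems: missing or suspiciously small files."""
    problems = []
    for name, min_size in EXPECTED.items():
        p = Path(model_dir) / name
        if not p.exists():
            problems.append(f"missing: {name}")
        elif p.stat().st_size < min_size:
            problems.append(f"too small: {name} ({p.stat().st_size} bytes)")
    return problems
```

An empty result means all five files are present and at least plausibly sized.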

License

Apache 2.0 (inherited from Qwen3.5)
