Qwen3.5-9B Uncensored (MNN Format)

This is an MNN-converted version of huihui-ai/Huihui-Qwen3.5-9B-abliterated for on-device mobile inference.

All credit for the abliteration work goes to huihui-ai. We only performed the MNN conversion and quantization for mobile deployment.

What is this?

  • Base model: Qwen/Qwen3.5-9B by Alibaba
  • Abliteration by: huihui-ai β€” removes refusal behavior via orthogonal projection (FailSpy technique)
  • MNN conversion by: darkmaniac7 β€” 4-bit quantization (block size 128) for mobile GPU/CPU inference
  • Purpose: On-device roleplay, creative fiction, and mature content without refusal. Richer writing and deeper character interactions than the 4B variant.

Model Details

| Property | Value |
|---|---|
| Architecture | Qwen3.5 (LinearAttention) |
| Parameters | 9B |
| Quantization | 4-bit (block size 128) |
| Format | MNN (Alibaba Mobile Neural Network) |
| Size on disk | ~5.0 GB |
| Backend | CPU (auto-routed; LinearAttention runs faster on CPU than OpenCL) |
| Minimum RAM | 12 GB |
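As a back-of-the-envelope check (an illustration, not a figure from the repo), 9B parameters at 4 bits per weight land very close to the size of the quantized weight file listed below:

```python
# Rough size estimate for the 4-bit quantized weight file.
# Assumption: ~9e9 quantized parameters at 4 bits each; the per-block
# scale overhead from block size 128 is ignored (it adds only a few %).
params = 9e9
bits_per_weight = 4

weight_bytes = params * bits_per_weight / 8   # 4.5e9 bytes
weight_gib = weight_bytes / 2**30             # bytes -> GiB

print(round(weight_gib, 2))  # -> 4.19, consistent with the ~4.2 GB llm.mnn.weight file
```

The bf16 embeddings and the graph file account for the rest of the ~5.0 GB on-disk footprint.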

Performance (measured on-device)

| Device | SoC | Backend | Decode tok/s |
|---|---|---|---|
| RedMagic 11 Pro | SM8850 (Snapdragon 8 Elite 2) | CPU | 10.1 |
| Lenovo TB520FU | SM8650 (Snapdragon 8 Gen 3) | CPU | ~8.5 |
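To translate those decode rates into wall-clock feel, here is a quick estimate for a hypothetical 200-token reply (the reply length is an assumption for illustration, not a measurement):

```python
# Approximate time to generate one reply, ignoring prefill latency.
reply_tokens = 200      # hypothetical response length
decode_tok_s = 10.1     # RedMagic 11 Pro decode rate from the table above

seconds = reply_tokens / decode_tok_s
print(round(seconds, 1))  # -> 19.8 seconds for a 200-token reply
```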

Usage

This model is designed for TokForge, an offline Android AI chat app. It can also be used with any MNN-compatible runtime.

TokForge (Android)

Models → Recommended → Roleplay → "Qwen3.5 9B Uncensored" → Download

Manual

Download all files and load with MNN's llm_demo or the MNN Transformer API.
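A typical manual flow looks like the following sketch. The download command uses the real huggingface-cli tool, but the build flag and llm_demo invocation are assumptions based on the MNN project's LLM demo; check the MNN repository docs for the exact steps on your version.

```shell
# Fetch all model files from the Hub (requires huggingface_hub installed)
huggingface-cli download darkmaniac7/Qwen3.5-9B-uncensored-MNN --local-dir qwen3.5-9b-mnn

# Build MNN with LLM support enabled (flag name per MNN's build docs)
# cmake .. -DMNN_BUILD_LLM=ON && make -j4

# Point the demo binary at the model's config file
./llm_demo qwen3.5-9b-mnn/config.json
```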

Limitations and Intended Use

  • Intended for TokForge / MNN mobile inference and local roleplay-style use.
  • Qwen3.5 LinearAttention models route differently from standard Qwen3 targets and may prefer CPU on some phones.
  • Large-model mobile performance depends heavily on device memory pressure and backend routing.
  • This repo is a mobile runtime/export artifact, not a standard Transformers release.

Files

| File | Size | Description |
|---|---|---|
| llm.mnn | 3.5 MB | Model graph |
| llm.mnn.weight | 4.2 GB | 4-bit quantized weights |
| embeddings_bf16.bin | 1.9 GB | Embedding weights (untied) |
| llm_config.json | 8 KB | Model configuration |
| tokenizer.txt | 6.1 MB | Tokenizer vocabulary |
| config.json | 342 B | HuggingFace config |
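For reference, MNN's LLM runtime is driven by a small JSON configuration. The sketch below shows the kind of fields such a runtime config typically carries; the field names are assumptions based on the mnn-llm project and should be verified against your MNN version:

```json
{
  "llm_model": "llm.mnn",
  "llm_weight": "llm.mnn.weight",
  "backend_type": "cpu",
  "thread_num": 4,
  "precision": "low",
  "memory": "low"
}
```

Setting the backend to CPU matches the routing behavior noted in the Model Details section.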


License

Apache 2.0 (inherited from Qwen3.5)
