# Josiefied-Qwen3-4B-abliterated-v2-MNN
Pre-converted Josiefied-Qwen3-4B-abliterated-v2 in MNN format for on-device inference with TokForge.
Original model by Goekdeniz-Guelmez, converted to MNN Q4 for mobile deployment.
## Model Details

| Property | Value |
|---|---|
| Architecture | Qwen3 (standard multi-head attention, 36 layers) |
| Parameters | 4B (4-bit quantized) |
| Format | MNN (Alibaba Mobile Neural Network) |
| Quantization | W4A16 (4-bit weights, block size 128) |
| Vocab | 151,936 tokens |
| Source | Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2 |
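The W4A16 scheme stores weights as 4-bit integers with one scale per 128-weight block, dequantizing to 16-bit for activations at runtime. A minimal sketch of the idea, assuming a simple asymmetric min/max scheme (MNN's actual packing and kernels differ):

```python
# Illustrative block-wise 4-bit quantization (not MNN's implementation).
def quantize_block(weights, bits=4):
    """Quantize one block of float weights to `bits`-bit integers."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1              # 15 representable steps for 4-bit
    scale = (hi - lo) / levels or 1.0     # avoid div-by-zero on flat blocks
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize_block(q, scale, lo):
    """Recover approximate float weights from a quantized block."""
    return [v * scale + lo for v in q]

def quantize(weights, block_size=128, bits=4):
    """Split a weight row into blocks of `block_size` and quantize each."""
    return [quantize_block(weights[i:i + block_size], bits)
            for i in range(0, len(weights), block_size)]
```

Each block carries its own scale and offset, which is why the weight file grows slightly beyond a flat 4 bits per parameter.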
## Description

Josiefied abliterated v2 by Goekdeniz Guelmez: a refined 4B Qwen3 with safety filters abliterated. The v2 iteration improves on the original with better uncensoring and instruction following, and offers a good balance of speed and quality for everyday mobile use.
## Files

| File | Description |
|---|---|
| `llm.mnn` | Model computation graph |
| `llm.mnn.weight` | Quantized weight data (Q4, block=128) |
| `llm_config.json` | Model config with Jinja chat template |
| `tokenizer.txt` | Tokenizer vocabulary |
| `config.json` | MNN runtime config |
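The Jinja chat template in `llm_config.json` renders a list of chat messages into the single prompt string the model was trained on. Qwen3-family models use the ChatML layout; a pure-Python sketch of what the template produces (a hypothetical helper, not the bundled template engine):

```python
def apply_chat_template(messages, add_generation_prompt=True):
    """Render messages in the ChatML layout used by Qwen3-family models."""
    prompt = ""
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open an assistant turn so the model continues as the assistant.
        prompt += "<|im_start|>assistant\n"
    return prompt
```

TokForge applies the bundled template automatically; this sketch only shows why the config ships a template alongside the weights.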
## Usage with TokForge

This model is optimized for TokForge, a free Android app for private, on-device LLM inference.

- Download TokForge from the Play Store
- Open the app → Models → Download this model
- Start chatting: inference runs 100% locally, no internet required
## Recommended Settings
| Setting | Value |
|---|---|
| Backend | OpenCL (Qualcomm) / Vulkan (MediaTek) / CPU (fallback) |
| Precision | Low |
| Threads | 4 |
| Thinking | Off (or On for thinking-capable models) |
## Speculative Decoding

Pair with the TokForge Acceleration Pack for +20-38% faster generation on supported devices.
| Device | SoC | Backend | tok/s |
|---|---|---|---|
| RedMagic 11 Pro | SM8850 (Snapdragon 8 Elite 2) | OpenCL | 22.4 |
| Lenovo TB520FU | SM8650 (Snapdragon 8 Gen 3) | OpenCL | 16.9 |
| OnePlus Ace 5 Ultra | D9400+ (Dimensity 9400) | OpenCL | 15.9 |
| Xiaomi Pad 7 Pro | SM8635 (Snapdragon 7+ Gen 3) | OpenCL | 9.3 |
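Speculative decoding works by having a small draft model propose several tokens that the main model then checks in one batched pass, accepting the longest agreeing prefix. A toy greedy sketch of the accept/reject loop (illustrative only; the stand-in model functions are assumptions, not the Acceleration Pack's implementation):

```python
# Toy greedy speculative decoding step. `draft_next` and `target_next`
# are stand-in callables mapping a token context to the next token id.
def speculative_step(prefix, draft_next, target_next, k=4):
    """Return the tokens accepted this step (always at least one)."""
    # 1. Draft k tokens autoregressively with the cheap model.
    draft, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_next(ctx)
        draft.append(t)
        ctx.append(t)
    # 2. Verify with the target model; accept while predictions agree.
    accepted, ctx = [], list(prefix)
    for t in draft:
        expect = target_next(ctx)
        if expect != t:
            accepted.append(expect)   # target's own token replaces the miss
            break
        accepted.append(t)
        ctx.append(t)
    else:
        accepted.append(target_next(ctx))  # bonus token when all k match
    return accepted
```

When the draft agrees often, each target pass yields up to k+1 tokens instead of one, which is where the speedup comes from.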
## Performance

Actual speed varies by device, thermal state, and generation length. Typical ranges for this model size:

| Device | SoC | Backend | Approx. tok/s |
|---|---|---|---|
| RedMagic 11 Pro | SM8850 (Snapdragon 8 Elite 2) | OpenCL | ~17-24 |
| Lenovo TB520FU | SM8650 (Snapdragon 8 Gen 3) | OpenCL | ~15-17 |
| Xiaomi Pad 7 Pro | SM8635 (Snapdragon 7+ Gen 3) | OpenCL | ~9-12 |
| OnePlus Ace 5 Ultra | D9400+ (Dimensity 9400) | OpenCL | ~9-15 |
## Attribution
This is an MNN conversion of Josiefied-Qwen3-4B-abliterated-v2 by Goekdeniz-Guelmez. All credit for the model architecture, training, and fine-tuning goes to the original author(s). This conversion only changes the runtime format for mobile deployment.
## Limitations
- Intended for TokForge / MNN on-device inference on Android
- This is a runtime bundle, not a standard Transformers training checkpoint
- Quantization (Q4) may slightly reduce quality compared to the full-precision original
- Abliterated/uncensored models have had safety filters removed; use responsibly
## Community
- Website: tokforge.ai
- Discord: Join our Discord
- GitHub: TokForge on GitHub
## Export Details

Converted using MNN's llmexport pipeline:

```bash
python llmexport.py --path Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2 --export mnn --quant_bit 4 --quant_block 128
```
## Model Tree

darkmaniac7/Josiefied-Qwen3-4B-abliterated-v2-MNN derives from the base model Qwen/Qwen3-4B-Instruct-2507.