# Josiefied-Qwen3-8B-abliterated-v1-MNN

Pre-converted Josiefied-Qwen3-8B-abliterated-v1 in MNN format for on-device inference with TokForge.

Original model by Goekdeniz-Guelmez, converted to MNN Q4 for mobile deployment.

## Model Details

| Property | Value |
| --- | --- |
| Architecture | Qwen3 (standard multi-head attention, 36 layers) |
| Parameters | 8B (4-bit quantized) |
| Format | MNN (Alibaba Mobile Neural Network) |
| Quantization | W4A16 (4-bit weights, block size 128) |
| Vocab | 151,936 tokens |
| Source | Goekdeniz-Guelmez/Josiefied-Qwen3-8B-abliterated-v1 |

## Description

Josiefied abliterated v1 by Goekdeniz Guelmez: an 8B Qwen3 with abliterated safety filters. Excellent quality-to-speed ratio for flagship phones. Runs comfortably on 12GB+ RAM devices with OpenCL GPU acceleration.
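As a sanity check on the 12GB recommendation, the raw weight footprint of 8B parameters at 4 bits can be worked out directly. This is illustrative arithmetic only; the runtime also needs KV cache, activations, and framework overhead on top of the weights.

```python
# Rough weight-memory estimate for an 8B model quantized to 4 bits.
params = 8e9
bits_per_weight = 4

weight_bytes = params * bits_per_weight / 8   # 4.0e9 bytes
weight_gib = weight_bytes / 2**30             # convert to GiB

print(f"~{weight_gib:.1f} GiB of weights")    # roughly 3.7 GiB
```

With several more gigabytes consumed by the KV cache and the OS, a 12GB device leaves comfortable headroom, which matches the recommendation above.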

## Files

| File | Description |
| --- | --- |
| `llm.mnn` | Model computation graph |
| `llm.mnn.weight` | Quantized weight data (Q4, block=128) |
| `llm_config.json` | Model config with Jinja chat template |
| `tokenizer.txt` | Tokenizer vocabulary |
| `config.json` | MNN runtime config |

## Usage with TokForge

This model is optimized for TokForge, a free Android app for private, on-device LLM inference.

1. Download TokForge from the Play Store
2. Open the app → Models → Download this model
3. Start chatting; runs 100% locally, no internet required

### Recommended Settings

| Setting | Value |
| --- | --- |
| Backend | OpenCL (Qualcomm) / Vulkan (MediaTek) / CPU (fallback) |
| Precision | Low |
| Threads | 4 |
| Thinking | Off (or On for thinking-capable models) |
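The settings above correspond to fields in the bundled runtime `config.json`. A minimal sketch of such a config follows, assuming common MNN llm key names (`backend_type`, `thread_num`, `precision`); these names are an assumption based on MNN conventions, so check the shipped `config.json` for the exact schema TokForge expects.

```python
import json

# Illustrative runtime config mirroring the recommended settings above.
# NOTE: key names are assumed from MNN llm conventions, not guaranteed
# to match the exact schema in this bundle's config.json.
config = {
    "llm_model": "llm.mnn",          # computation graph
    "llm_weight": "llm.mnn.weight",  # Q4 weight data
    "backend_type": "opencl",        # "cpu" as fallback
    "thread_num": 4,
    "precision": "low",
}

print(json.dumps(config, indent=2))
```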

## Speculative Decoding

Pair with the TokForge Acceleration Pack for roughly 27-30% faster generation.

| Device | SoC | Backend | tok/s |
| --- | --- | --- | --- |
| RedMagic 11 Pro | SM8850 (Snapdragon 8 Elite 2) | OpenCL | 14.3 |
| Lenovo TB520FU | SM8650 (Snapdragon 8 Gen 3) | OpenCL | 9.0 |
| OnePlus Ace 5 Ultra | D9400+ (Dimensity 9400) | OpenCL | 7.9 |

## Performance

Actual speed varies by device, thermal state, and generation length. Typical ranges for this model size:

| Device | SoC | Backend | Approx. tok/s |
| --- | --- | --- | --- |
| RedMagic (SM8850) | Snapdragon 8 Elite 2 | OpenCL | ~14 |
| Lenovo (SM8650) | Snapdragon 8 Gen 3 | OpenCL | ~10 |
| OnePlus (D9400+) | Dimensity 9400 | OpenCL | ~9 |

## Attribution

This is an MNN conversion of Josiefied-Qwen3-8B-abliterated-v1 by Goekdeniz-Guelmez. All credit for the model architecture, training, and fine-tuning goes to the original author(s). This conversion only changes the runtime format for mobile deployment.

## Limitations

- Intended for TokForge / MNN on-device inference on Android
- This is a runtime bundle, not a standard Transformers training checkpoint
- Quantization (Q4) may slightly reduce quality compared to the full-precision original
- Abliterated/uncensored models have had safety filters removed; use responsibly

## Export Details

Converted using MNN's llmexport pipeline:

```shell
python llmexport.py --path Goekdeniz-Guelmez/Josiefied-Qwen3-8B-abliterated-v1 --export mnn --quant_bit 4 --quant_block 128
```
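After export, a quick check that the bundle contains every file listed in the Files section above can catch incomplete conversions. The file names below come from that table; `check_bundle` itself is a hypothetical helper for illustration, not part of MNN's tooling.

```python
from pathlib import Path

# Files the export is expected to produce (per the Files table above).
EXPECTED = [
    "llm.mnn",           # model computation graph
    "llm.mnn.weight",    # quantized weight data (Q4, block=128)
    "llm_config.json",   # model config with Jinja chat template
    "tokenizer.txt",     # tokenizer vocabulary
    "config.json",       # MNN runtime config
]

def check_bundle(model_dir: str) -> list[str]:
    """Return the expected files missing from model_dir (empty = complete)."""
    root = Path(model_dir)
    return [name for name in EXPECTED if not (root / name).is_file()]
```

Running `check_bundle(".")` in the export output directory should return an empty list for a complete bundle.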