# Huihui-Qwen3.5-0.8B-abliterated — MXFP8 MLX

MLX-converted version of huihui-ai/Huihui-Qwen3.5-0.8B-abliterated for Apple Silicon, quantized with MXFP8.

## Model Details

| Property | Value |
|---|---|
| Base model | Qwen3.5-0.8B |
| Type | Vision-Language Model (VLM) |
| Format | MLX MXFP8 (~9.2 bits/weight) |
| Size | ~0.98 GB |
| Abliterated | Yes (refusal behavior removed via abliteration) |

MXFP8 delivers near-native quality at roughly 2x compression versus fp16, and typically outperforms traditional 8-bit integer quantization (lower perplexity, higher benchmark scores).
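
As a rough sanity check on the compression claim, here is a back-of-envelope sketch. The 32-element block size and 8-bit shared scale are assumptions taken from the OCP MX format defaults, not from this repo; the card's ~9.2 bits/weight figure presumably also counts tensors kept in higher precision.

```python
# Back-of-envelope MXFP8 compression estimate.
# Assumption: 32-element blocks sharing one 8-bit scale (OCP MX default);
# the repo's ~9.2 bits/weight likely also includes unquantized tensors.
fp16_bits_per_weight = 16.0
mxfp8_bits_per_weight = 8 + 8 / 32  # 8-bit elements + amortized shared scale

print(f"MXFP8 = {mxfp8_bits_per_weight:.2f} bits/weight")
print(f"compression vs fp16 = {fp16_bits_per_weight / mxfp8_bits_per_weight:.2f}x")
```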

## Variants

| Variant | Size | Quality | Link |
|---|---|---|---|
| fp16 | ~1.75 GB | Highest | fp16 |
| MXFP8 | ~0.98 GB | Near-native | This repo |
| MXFP4 | ~0.6 GB | Good | mxfp4 |
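
To pre-fetch this MXFP8 variant into the local Hugging Face cache, so the first run does not block on the download, huggingface_hub's snapshot_download works; the fp16 and mxfp4 repos linked above can be fetched the same way by their repo IDs.

```python
from huggingface_hub import snapshot_download

# Downloads (or reuses) the repo in the local HF cache and returns its path
local_dir = snapshot_download("AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX")
print(local_dir)
```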

## Usage

```bash
pip install mlx-vlm
```

```bash
# Text generation
python -m mlx_vlm.generate \
  --model AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX \
  --prompt "Describe this image in detail" \
  --image <path-or-url>

# Chat UI
python -m mlx_vlm.chat_ui \
  --model AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX
```
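
For scripted use, the model can also be driven from Python. A minimal sketch, assuming a recent mlx-vlm release; the load/generate helpers have shifted signatures across versions, so check the mlx-vlm docs for your installed version:

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX"

# Download (if needed) and load the quantized weights and processor
model, processor = load(model_path)
config = load_config(model_path)

images = ["image.jpg"]  # placeholder: any local path or URL
prompt = "Describe this image in detail"

# Wrap the raw prompt in the model's chat template
formatted = apply_chat_template(processor, config, prompt, num_images=len(images))

output = generate(model, processor, formatted, images, verbose=False)
print(output)
```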

## Credits

- Original model: huihui-ai/Huihui-Qwen3.5-0.8B-abliterated
- MLX conversion and quantization: mlx-vlm
