# Huihui-Qwen3.5-0.8B-abliterated — MXFP8 MLX

MLX-converted version of huihui-ai/Huihui-Qwen3.5-0.8B-abliterated for Apple Silicon, quantized with MXFP8.

## Model Details

| Property | Value |
|---|---|
| Base model | Qwen3.5-0.8B |
| Type | Vision-Language Model (VLM) |
| Format | MLX MXFP8 (~9.2 bits/weight) |
| Size | ~0.98 GB |
| Abliterated | Yes (refusal behavior removed via abliteration) |

MXFP8 delivers near-native quality at roughly 2x compression versus fp16, and typically outperforms traditional 8-bit integer quantization (lower perplexity, higher benchmark scores).
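
As a rough sanity check on the compression claim, here is a back-of-envelope sketch. The 32-element block size and 8-bit shared scale are assumptions taken from the OCP MX format defaults, not from this repo; the card's ~9.2 bits/weight figure presumably also counts tensors kept in higher precision.

```python
# Back-of-envelope MXFP8 compression estimate.
# Assumption: 32-element blocks sharing one 8-bit scale (OCP MX default);
# the repo's ~9.2 bits/weight likely also includes unquantized tensors.
fp16_bits_per_weight = 16.0
mxfp8_bits_per_weight = 8 + 8 / 32  # 8-bit elements + amortized shared scale

print(f"MXFP8 = {mxfp8_bits_per_weight:.2f} bits/weight")
print(f"compression vs fp16 = {fp16_bits_per_weight / mxfp8_bits_per_weight:.2f}x")
```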

## Variants

| Variant | Size | Quality | Link |
|---|---|---|---|
| fp16 | ~1.75 GB | Highest | fp16 |
| MXFP8 | ~0.98 GB | Near-native | This repo |
| MXFP4 | ~0.6 GB | Good | mxfp4 |
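
To pre-fetch this MXFP8 variant into the local Hugging Face cache, so the first run does not block on the download, huggingface_hub's snapshot_download works; the fp16 and mxfp4 repos linked above can be fetched the same way by their repo IDs.

```python
from huggingface_hub import snapshot_download

# Downloads (or reuses) the repo in the local HF cache and returns its path
local_dir = snapshot_download("AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX")
print(local_dir)
```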

## Usage

```bash
pip install mlx-vlm
```

```bash
# Text generation
python -m mlx_vlm.generate \
  --model AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX \
  --prompt "Describe this image in detail" \
  --image <path-or-url>

# Chat UI
python -m mlx_vlm.chat_ui \
  --model AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX
```
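
For scripted use, the model can also be driven from Python. A minimal sketch, assuming a recent mlx-vlm release; the load/generate helpers have shifted signatures across versions, so check the mlx-vlm docs for your installed version:

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX"

# Download (if needed) and load the quantized weights and processor
model, processor = load(model_path)
config = load_config(model_path)

images = ["image.jpg"]  # placeholder: any local path or URL
prompt = "Describe this image in detail"

# Wrap the raw prompt in the model's chat template
formatted = apply_chat_template(processor, config, prompt, num_images=len(images))

output = generate(model, processor, formatted, images, verbose=False)
print(output)
```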

## Credits

- Original model: huihui-ai/Huihui-Qwen3.5-0.8B-abliterated
- MLX conversion and quantization: mlx-vlm
