# Huihui-Qwen3.5-0.8B-abliterated — MXFP8 MLX
MLX-converted version of huihui-ai/Huihui-Qwen3.5-0.8B-abliterated for Apple Silicon, quantized with MXFP8.
## Model Details
| Property | Value |
|---|---|
| Base model | Qwen3.5-0.8B |
| Type | Vision-Language Model (VLM) |
| Format | MLX MXFP8 (~9.2 bits/weight) |
| Size | ~0.98 GB |
| Abliterated | Yes — refusal behavior removed |
MXFP8 provides near-native quality with ~2x compression vs fp16 — better than traditional 8-bit quantization (lower perplexity, higher benchmark scores).
## Variants
| Variant | Size | Quality | Link |
|---|---|---|---|
| fp16 | ~1.75 GB | Highest | fp16 |
| MXFP8 | ~0.98 GB | Near-native | This repo |
| MXFP4 | ~0.6 GB | Good | mxfp4 |
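
As a rough sanity check, the sizes in the table are consistent with the effective bits per weight of each format. The snippet below is a back-of-the-envelope calculation, assuming the 0.8B parameter count from the model name and decimal gigabytes; the results land slightly above the nominal bit widths because of non-quantized tensors and file metadata.

```python
# Rough effective bits/weight implied by the sizes in the table above.
# Assumptions: 0.8B parameters (from the model name), decimal GB.
params = 0.8e9
for name, size_gb in [("fp16", 1.75), ("MXFP8", 0.98), ("MXFP4", 0.6)]:
    bits_per_weight = size_gb * 1e9 * 8 / params
    print(f"{name}: ~{bits_per_weight:.1f} bits/weight")
```

The MXFP8 result (~9.8) lines up with the ~9.2 bits/weight quoted above, and fp16 comes out near its nominal 16 bits plus overhead.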
## Usage
```bash
pip install mlx-vlm
```
```bash
# Text generation
python -m mlx_vlm.generate \
  --model AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX \
  --prompt "Describe this image in detail" \
  --image <path-or-url>
```
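
For scripted use, the model can also be driven from Python. This is a minimal sketch following mlx-vlm's documented load/generate API; exact signatures may vary between mlx-vlm versions.

```python
# Minimal sketch using mlx-vlm's Python API
# (signatures may differ across mlx-vlm versions).
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX"
model, processor = load(model_path)
config = load_config(model_path)

images = ["path/to/image.jpg"]  # local path or URL
prompt = "Describe this image in detail"

# Wrap the prompt in the model's chat template before generating.
formatted = apply_chat_template(processor, config, prompt, num_images=len(images))
output = generate(model, processor, formatted, images, verbose=False)
print(output)
```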
```bash
# Chat UI
python -m mlx_vlm.chat_ui \
  --model AITRADER/Huihui-Qwen3.5-0.8B-abliterated-mxfp8-MLX
```
## Credits

- huihui-ai for the original Huihui-Qwen3.5-0.8B-abliterated model
- The Qwen team for the Qwen3.5-0.8B base model
- The MLX and mlx-vlm projects for the conversion and inference tooling