Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated MLX MXFP8

An MXFP8 (Microscaling FP8) quantized MLX conversion of Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated.

Model Details

  • Architecture: Qwen 3.5 27B (hybrid linear attention + full attention)
  • Quantization: MXFP8 (E4M3 with block-level scaling), group_size=32
  • Size: ~29 GB
  • Context Length: 262,144 tokens
  • Vision: Full image and video understanding via integrated vision tower (27 ViT blocks, kept in bf16)
  • Tool Use: Native function calling support
  • Thinking: Chain-of-thought reasoning mode

Why MXFP8?

MXFP8 stores weights in floating-point E4M3 format with a scale shared per block, rather than the fixed-point integer codes used by conventional quantization. This gives:

  • Better handling of outlier weights (exponent absorbs magnitude)
  • Lower quantization error across varying tensor ranges
  • Native hardware acceleration on modern chips
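The block-scaling idea above can be sketched in plain Python. This is a simplified model of OCP-style MXFP8 (one shared power-of-two scale per block, values rounded to E4M3 with 4 exponent and 3 mantissa bits), not the actual MLX kernel; it ignores E4M3's reserved NaN encoding:

```python
import math

E4M3_MAX = 448.0  # largest finite E4M3 value

def round_to_e4m3(x):
    # Round x (|x| <= 448 assumed) to the nearest E4M3 value:
    # 1 sign bit, 4 exponent bits, 3 mantissa bits.
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    x = abs(x)
    e = max(math.floor(math.log2(x)), -6)  # min normal exponent is -6
    step = 2.0 ** (e - 3)                  # 3 mantissa bits -> 8 steps per binade
    return sign * min(round(x / step) * step, E4M3_MAX)

def quantize_block_mxfp8(block):
    # MX: one shared power-of-two scale per block (group_size=32 in this model),
    # chosen so the block's largest magnitude fits within E4M3's range.
    amax = max(abs(v) for v in block)
    scale = 2.0 ** math.ceil(math.log2(amax / E4M3_MAX)) if amax else 1.0
    return scale, [round_to_e4m3(v / scale) for v in block]

def dequantize(scale, codes):
    return [scale * c for c in codes]
```

Because the scale is a power of two, large ("outlier") values in a block are represented exactly or near-exactly, while smaller values in the same block still keep E4M3's relative precision — the property the bullets above describe.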

Capabilities

  • Image understanding and description
  • Video understanding
  • Tool use / function calling
  • Multi-step agent reasoning
  • Thinking/reasoning mode
  • Multilingual support
  • Long context (262K tokens)

Usage

Works with LM Studio, mlx-lm, and other MLX-compatible tooling on Apple Silicon.
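A minimal generation sketch using the mlx-lm Python package (requires `pip install mlx-lm` and Apple Silicon). The repo id below is a placeholder — substitute this model's actual Hugging Face path:

```python
from mlx_lm import load, generate

# Placeholder repo id — replace with this model's actual HF path.
model, tokenizer = load("Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated-MXFP8")

# Format the request with the model's chat template.
messages = [{"role": "user", "content": "Explain MXFP8 quantization in one sentence."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(text)
```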

Downloads last month: 1,319
Safetensors model size: 27B params
Tensor types: U8, U32, BF16, F32