Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2 MLX MXFP8 + Vision
An MXFP8-quantized MLX build, with the vision tower grafted from the base Qwen/Qwen3.5-9B model.
Model Details
- Architecture: Qwen 3.5 9B (hybrid linear attention + full attention, 32 layers)
- Quantization: MXFP8 (E4M3 with block-level scaling), group_size=32
- Size: ~10 GB
- Context Length: 262,144 tokens
- Vision: Full image and video understanding (27 ViT blocks, kept in bf16)
- Tool Use: Native function calling support
- Thinking: Chain-of-thought reasoning mode
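The MXFP8 scheme listed above (E4M3 elements sharing one power-of-two scale per block of 32) can be illustrated with a simplified NumPy round-trip. This is a rough sketch of the format, not MLX's actual quantizer: `quantize_e4m3` approximates E4M3 rounding by keeping 3 mantissa bits and ignores subnormal handling.

```python
import numpy as np

E4M3_MAX = 448.0  # largest finite E4M3 value

def quantize_e4m3(x):
    # Simplified E4M3 rounding: clamp to the finite range, then round to
    # the nearest value with a 3-bit mantissa at each element's exponent.
    x = np.clip(x, -E4M3_MAX, E4M3_MAX)
    out = np.zeros_like(x)
    nz = x != 0
    exp = np.floor(np.log2(np.abs(x[nz])))
    step = 2.0 ** (exp - 3)  # 3 mantissa bits -> 8 steps per octave
    out[nz] = np.round(x[nz] / step) * step
    return out

def mxfp8_roundtrip(w, block=32):
    # The "MX" part: one shared power-of-two scale per block of 32 values,
    # chosen so the block's largest magnitude fits in the E4M3 range.
    w = w.reshape(-1, block)
    amax = np.abs(w).max(axis=1, keepdims=True)
    scale = 2.0 ** np.ceil(np.log2(amax / E4M3_MAX))
    q = quantize_e4m3(w / scale)      # what would be stored as 8-bit codes
    return (q * scale).reshape(-1)    # dequantized back to float

w = np.random.randn(1024).astype(np.float32)
wq = mxfp8_roundtrip(w)
err = np.abs(w - wq).max() / np.abs(w).max()
```

Because each block reuses a single scale, storage is roughly 8 bits per weight plus one shared exponent per 32 weights, which is what brings a 9B bf16 model down to the ~10 GB quoted above.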
Vision
The vision tower is grafted from the base Qwen/Qwen3.5-9B model (its out_hidden_size=4096 matches the 9B text model's hidden_size) and is kept in bf16 for maximum quality.
Usage
Works with LM Studio, MLX, and other MLX-compatible frameworks.
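A minimal command-line sketch, assuming current mlx-lm (text) and mlx-vlm (vision) releases; flag names follow those projects' CLIs and the prompt/image paths are placeholders:

```shell
pip install mlx-lm mlx-vlm

# Text generation with the quantized model
mlx_lm.generate \
  --model AITRADER/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2-mlx-mxfp8 \
  --prompt "Explain MXFP8 quantization in one paragraph." \
  --max-tokens 256

# Image understanding via the grafted vision tower
python -m mlx_vlm.generate \
  --model AITRADER/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2-mlx-mxfp8 \
  --image ./example.jpg \
  --prompt "Describe this image."
```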
Model tree for AITRADER/Qwen3.5-9B-Claude-4.6-Opus-Reasoning-Distilled-v2-mlx-mxfp8
- Base model: Qwen/Qwen3.5-9B-Base
- Finetuned from: Qwen/Qwen3.5-9B