This model was converted to MLX format and quantized from Qwen3.6-35B-A3B using oMLX.

What is "oQ"?

See "oQ: oMLX Universal Dynamic Quantization" for details.

Quantizations

See "Evaluation of various MLX quantizations" for details:

(Chart: Qwen3.6-35B-A3B KLD vs. RAM across quantizations)
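The evaluation above ranks quantizations by KL divergence (KLD) between the quantized model's next-token distribution and the full-precision model's. As a hedged illustration only (this is not oMLX's actual evaluation code), the metric can be sketched with stdlib Python:

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-10):
    """KL(p || q) between two discrete distributions, e.g. the
    reference model's and the quantized model's token probabilities.
    Lower is better; 0 means the quantized model matches exactly."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Toy example: hypothetical logits from a reference vs. quantized model
ref = softmax([2.0, 1.0, 0.1])
quant = softmax([1.9, 1.1, 0.2])
print(f"KLD: {kl_divergence(ref, quant):.6f}")
```

In practice the divergence is averaged over many tokens of held-out text, which is what lets a single number summarize how faithful each quantization is per unit of RAM.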

What is "VL"?

"VL" stands for Vision-Language: the quantization preserves the original model's multimodal capabilities.

Quantizations without "VL" are text-only.

What is "FP16"?

"FP16" is a tweak for M1/M2 Apple Silicon that delivers a very noticeable prompt-processing boost, since those older M-series chips lack native BF16 hardware support. See jundot/omlx/issues/604 for details.

Quantizations without "FP16" are better suited for M3 and newer Apple Silicon.

Model: deepsweet/Qwen3.6-35B-A3B-MLX-oQ4 (4-bit)