Qwen3.6-35B-A3B-RotorQuant-MLX-MXFP4

Summary

RotorQuant + MLX-MXFP4 (4-bit) variant of Qwen/Qwen3.6-35B-A3B.

Why this variant

Apple Silicon (M1/M2/M3/M4) with RotorQuant structural pre-conditioning and MLX-native MXFP4 layout (E2M1 weights, per-32-element E8M0 (OCP microscaling)). 4.253 bits/weight, ~17 GB on disk, sub-2-s load on M4 Max. Pick this over the affine MLX variants when you want MXFP4 format parity with hardware pipelines while running locally.

Hardware compatibility

Device	VRAM	Recommendation
Apple M4 Max 128 GB	~21 GB	recommended — headroom for long context
Apple M3 Max 64 GB	~21 GB	fits comfortably
Apple M2 Max 32 GB	~21 GB	tight — short context only

Reproduce

# dequantize from the rotor/turbo MLX-8bit source, then re-quantize
python -c "from mlx_lm import convert; convert(hf_path=\"majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-8bit\", mlx_path=\"bf16\", dequantize=True, trust_remote_code=True)"
python -c "from mlx_lm import convert; convert(hf_path=\"bf16\", mlx_path=\"out-mxfp4\", quantize=True, q_bits=4, q_group_size=32, q_mode=\"mxfp4\", trust_remote_code=True)"

Reproduced at commit 919836a.

Evaluation

benchmarks pending — populated after the eval-harness workstream lands.

Family

bf16 — Qwen/Qwen3.6-35B-A3B
FP8 card — majentik/Qwen3.6-35B-A3B-FP8
RotorQuant MLX-4bit (affine) — majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-4bit
RotorQuant MLX-8bit (source for this) — majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-8bit
plain MLX-MXFP4 (no rotor/turbo) — majentik/Qwen3.6-35B-A3B-MLX-MXFP4

Provenance

Source SHA: majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-8bit
Calibration hash: none (mxfp4 is calibration-free; rotor/turbo conditioning inherited from source)
Uploaded: 2026-04-21T06:14:03.928693+00:00

Toolchain:

huggingface_hub: 1.11.0
mlx: 0.31.1
mlx-lm: 0.31.2

License

Released under apache-2.0. Upstream license of the base model applies.

Downloads last month: 552

Safetensors

Model size

35B params

Tensor type

U32

BF16

MLX

Hardware compatibility

4-bit

Model tree for majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-MXFP4

Base model

Qwen/Qwen3.6-35B-A3B

Quantized

(364)

this model