---
license: apache-2.0
base_model: Qwen/Qwen3.6-35B-A3B
pipeline_tag: image-text-to-text
library_name: mlx
tags:
- qwen
- qwen-3.6
- moe
- rotor
- mlx
- nvfp4
- apple-silicon
---

# Qwen3.6-35B-A3B-RotorQuant-MLX-NVFP4

## Summary

RotorQuant + MLX-NVFP4 (4-bit) variant of [Qwen/Qwen3.6-35B-A3B](https://huggingface.co/Qwen/Qwen3.6-35B-A3B).

## Why this variant

Built for Apple Silicon (M1/M2/M3/M4): RotorQuant structural pre-conditioning combined with the MLX-native NVFP4 layout, which stores E2M1 weights with one FP8 scale per 16-element group (the NVIDIA Blackwell layout). 4.503 bits/weight, ~18 GB on disk, sub-2-s load on M4 Max. Pick this over the affine MLX variants when you want NVFP4 format parity with hardware pipelines while running locally.

## Hardware compatibility

| Device | Unified memory used | Recommendation |
| --- | --- | --- |
| Apple M4 Max 128 GB | ~21 GB | recommended; headroom for long context |
| Apple M3 Max 64 GB | ~21 GB | fits comfortably |
| Apple M2 Max 32 GB | ~21 GB | tight; short context only |

## Reproduce

```bash
# Step 1: dequantize the rotor/turbo MLX-8bit source to a local "bf16" directory
python -c "from mlx_lm import convert; convert(hf_path=\"majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-8bit\", mlx_path=\"bf16\", dequantize=True, trust_remote_code=True)"

# Step 2: re-quantize that directory to NVFP4 (4-bit weights, group size 16)
python -c "from mlx_lm import convert; convert(hf_path=\"bf16\", mlx_path=\"out-nvfp4\", quantize=True, q_bits=4, q_group_size=16, q_mode=\"nvfp4\", trust_remote_code=True)"
```

Reproduced at commit `919836a`.
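To make the NVFP4 layout described under "Why this variant" concrete, here is a minimal pure-Python sketch of how one 16-element group decodes and why the format costs roughly 4.5 bits per weight. It is an illustration only, not the MLX implementation: the function names are hypothetical, the scale is passed as an ordinary float rather than a packed FP8 value, and the 35e9 parameter count is a nominal figure assumed from the model name.

```python
# Illustrative sketch of the NVFP4 block layout (not the MLX kernels).
# All function names here are hypothetical.

# The eight non-negative magnitudes representable in E2M1
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

GROUP_SIZE = 16   # weights sharing one scale (q_group_size=16 in the recipe)
WEIGHT_BITS = 4   # width of one E2M1 code
SCALE_BITS = 8    # one FP8 scale per group


def decode_e2m1(code: int) -> float:
    """Decode a 4-bit E2M1 code: bit 3 is the sign, bits 0-2 pick the magnitude."""
    if not 0 <= code <= 0xF:
        raise ValueError("E2M1 codes are 4 bits wide")
    magnitude = E2M1_MAGNITUDES[code & 0b0111]
    return -magnitude if code & 0b1000 else magnitude


def dequantize_group(codes: list[int], scale: float) -> list[float]:
    """Dequantize one group: each decoded E2M1 value times the shared scale."""
    if len(codes) != GROUP_SIZE:
        raise ValueError(f"expected {GROUP_SIZE} codes per group")
    return [decode_e2m1(code) * scale for code in codes]


def bits_per_weight(weight_bits: int = WEIGHT_BITS,
                    scale_bits: int = SCALE_BITS,
                    group_size: int = GROUP_SIZE) -> float:
    """Storage cost per weight: the code itself plus its share of the group scale."""
    return weight_bits + scale_bits / group_size


def size_gib(num_params: float, bpw: float) -> float:
    """Approximate size in GiB for num_params weights at bpw bits each."""
    return num_params * bpw / 8 / 2**30


if __name__ == "__main__":
    print("ideal bits/weight:", bits_per_weight())
    # 35e9 is an assumed nominal parameter count; 4.503 is the card's figure.
    print("approx. size (GiB):", round(size_gib(35e9, 4.503), 1))
```

The ideal figure of 4.5 bits/weight sits just under the 4.503 reported above; the small remainder presumably comes from tensors kept at higher precision, though the card does not break this down.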

## Evaluation

_Benchmarks pending; populated after the eval-harness workstream lands._

## Family

- **bf16**: [Qwen/Qwen3.6-35B-A3B](https://huggingface.co/Qwen/Qwen3.6-35B-A3B)
- **FP8 card**: [majentik/Qwen3.6-35B-A3B-FP8](https://huggingface.co/majentik/Qwen3.6-35B-A3B-FP8)
- **RotorQuant MLX-4bit (affine)**: [majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-4bit](https://huggingface.co/majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-4bit)
- **RotorQuant MLX-8bit (source for this)**: [majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-8bit](https://huggingface.co/majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-8bit)
- **plain MLX-NVFP4 (no rotor/turbo)**: [majentik/Qwen3.6-35B-A3B-MLX-NVFP4](https://huggingface.co/majentik/Qwen3.6-35B-A3B-MLX-NVFP4)

## Provenance

- Source SHA: `majentik/Qwen3.6-35B-A3B-RotorQuant-MLX-8bit`
- Calibration hash: `none (nvfp4 is calibration-free; rotor/turbo conditioning inherited from source)`
- Uploaded: `2026-04-21T06:17:30.021158+00:00`

Toolchain:

- `huggingface_hub`: 1.11.0
- `mlx`: 0.31.1
- `mlx-lm`: 0.31.2

## License

Released under `apache-2.0`. Upstream license of the base model applies.