Coding with MoEs
The video part was removed; no other changes were made.
For improved versions with vision support from NightmediaAI, check out our models
Brainwaves

| Quant   | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|---------|-------|-------|-------|-------|-------|-------|-------|
| qx86-hi | 0.420 | 0.457 | 0.379 | 0.671 | 0.354 | 0.777 | 0.702 |
| qx64-hi | 0.413 | 0.459 | 0.378 | 0.670 | 0.366 | 0.772 | 0.687 |
| mxfp4   | 0.413 | 0.464 | 0.378 | 0.675 | 0.364 | 0.771 | 0.687 |
nightmedia/Qwen3.5-35B-A3B-Instruct

| Quant   | arc   | arc/e | boolq | … |
|---------|-------|-------|-------|---|
| qx86-hi | 0.554 | 0.670 | 0.891 | … |
| Quant   | Perplexity    | Max Memory |
|---------|---------------|------------|
| qx86-hi | 4.042 ± 0.026 | 41.52 GB   |
| qx64-hi | 4.073 ± 0.026 | 32.86 GB   |
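To put the perplexity/memory numbers in perspective, here is a quick back-of-the-envelope comparison in plain Python, using only the figures from the table above:

```python
# Figures from the perplexity table above: (perplexity, max memory in GB)
quants = {
    "qx86-hi": (4.042, 41.52),
    "qx64-hi": (4.073, 32.86),
}

ppl_hi, mem_hi = quants["qx86-hi"]
ppl_lo, mem_lo = quants["qx64-hi"]

# Relative cost of dropping from qx86-hi to qx64-hi
memory_saved = (mem_hi - mem_lo) / mem_hi   # fraction of memory saved
ppl_increase = (ppl_lo - ppl_hi) / ppl_hi   # fraction of perplexity given up

print(f"memory saved:    {memory_saved:.1%}")   # ≈ 20.9%
print(f"perplexity cost: {ppl_increase:.2%}")   # ≈ 0.77%
```

In other words, qx64-hi trades well under 1% in perplexity for roughly a fifth of the memory footprint.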
Models from NightmediaAI

nightmedia/Qwen3.5-35B-A3B-Holodeck

| Quant   | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|---------|-------|-------|-------|-------|-------|-------|-------|
| qx86-hi | 0.540 | 0.647 | 0.890 | 0.690 | 0.412 | 0.792 | 0.679 |

nightmedia/Qwen3.5-35B-A3B-Engineer

| Quant   | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|---------|-------|-------|-------|-------|-------|-------|-------|
| qx64-hi | 0.543 | 0.666 | 0.886 | 0.692 | 0.412 | 0.785 | 0.684 |
| mxfp4   | 0.438 | 0.546 | 0.894 | 0.645 | 0.384 | 0.719 | 0.569 |

nightmedia/Qwen3-30B-A3B-Element11b

| Quant   | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|---------|-------|-------|-------|-------|-------|-------|-------|
| mxfp8   | 0.575 | 0.712 | 0.880 | 0.745 | 0.470 | 0.796 | 0.706 |
| qx86-hi | 0.586 | 0.757 | 0.880 | 0.753 | 0.458 | 0.805 | 0.705 |
| qx64-hi | 0.576 | 0.759 | 0.876 | 0.752 | 0.470 | 0.803 | 0.698 |
| mxfp4   | 0.550 | 0.714 | 0.877 | 0.747 | 0.432 | 0.798 | 0.695 |
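As a crude single-number summary of the Element11b rows above, the unweighted average across the seven benchmarks can be computed in plain Python. This is just a sketch for ranking quants; it assumes the column order matches the Brainwaves header (arc, arc/e, boolq, hswag, obkqa, piqa, wino) and weights all tasks equally:

```python
# Element11b scores from the table above, in header order:
# arc, arc/e, boolq, hswag, obkqa, piqa, wino
scores = {
    "mxfp8":   [0.575, 0.712, 0.880, 0.745, 0.470, 0.796, 0.706],
    "qx86-hi": [0.586, 0.757, 0.880, 0.753, 0.458, 0.805, 0.705],
    "qx64-hi": [0.576, 0.759, 0.876, 0.752, 0.470, 0.803, 0.698],
    "mxfp4":   [0.550, 0.714, 0.877, 0.747, 0.432, 0.798, 0.695],
}

# Unweighted mean score per quant, highest first
averages = {quant: sum(vals) / len(vals) for quant, vals in scores.items()}
for quant, avg in sorted(averages.items(), key=lambda kv: -kv[1]):
    print(f"{quant:8s} {avg:.3f}")
```

On this crude average, qx86-hi and qx64-hi land within a point of a percent of each other, with mxfp4 trailing.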
Older models with VL

Qwen3-VL-30B-A3B-Instruct

| Quant   | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|---------|-------|-------|-------|-------|-------|-------|-------|
| qx86-hi | 0.439 | 0.541 | 0.894 | 0.619 | 0.430 | 0.764 | 0.592 |
| qx64-hi | 0.454 | 0.544 | 0.893 | 0.618 | 0.428 | 0.749 | 0.590 |

Qwen3-VL-30B-A3B-Thinking

| Quant   | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|---------|-------|-------|-------|-------|-------|-------|-------|
| qx86-hi | 0.393 | 0.466 | 0.751 | 0.648 | 0.366 | 0.776 | 0.667 |
For a thinking model, it performs above the previous VL MoE.
This quant has been updated with the new Deckard(qx) formula for Qwen3.5, and includes some attention layers new to the architecture.
While the changes seem minimal for the added weight, the model is more confident, with better attention to detail, and the vibe quality has improved.
Old formula metrics

| Quant   | arc   | … |
|---------|-------|---|
| qx86-hi | 0.418 | … |
| qx64-hi | 0.419 | … |
Perplexity

| Quant   | Perplexity    | Max Memory |
|---------|---------------|------------|
| qx86-hi | 4.037 ± 0.026 | 37.46 GB   |
| qx64-hi | 4.079 ± 0.026 | 28.79 GB   |
More metrics will be available soon.
-G
This model Qwen3.5-35B-A3B-Text-qx64-hi-mlx was converted to MLX format from Qwen/Qwen3.5-35B-A3B using mlx-lm version 0.30.8.
```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("Qwen3.5-35B-A3B-Text-qx64-hi-mlx")

prompt = "hello"

# Apply the chat template when the tokenizer provides one
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```