Coding without MoEs
A collection of 90 models, some slower than others.
Brainwaves

| Model | Quant | Benchmark scores |
|---|---|---|
| Jan-v1-4B | qx86-hi | 0.433, 0.538, 0.711, 0.580, 0.390, 0.728, 0.632 |
| Jan-v1-2509 | qx86-hi | 0.435, 0.540, 0.729, 0.588, 0.388, 0.730, 0.633 |
| Jan-v3-4B-base-instruct | qx86-hi | 0.452, 0.600, 0.846, 0.457, 0.392, 0.699, 0.563 |
| Jan-v3-4B-base-instruct | qx64-hi | 0.440, 0.593, 0.845, 0.440, 0.398, 0.694, 0.549 |
| Jan-v3-4B-base-instruct | mxfp4 | 0.409, 0.537, 0.830, 0.446, 0.384, 0.700, 0.551 |
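One quick way to read the Jan-v3-4B-base-instruct rows is to average the seven scores per quantization. A minimal sketch, assuming the seven columns can be treated as equally weighted, unnamed metrics (the table does not give benchmark names):

```python
# Compare Jan-v3-4B-base-instruct quantizations by mean benchmark score.
# The seven benchmark names are not given in the card, so the values are
# treated as unnamed, equally weighted metrics.
scores = {
    "qx86-hi": [0.452, 0.600, 0.846, 0.457, 0.392, 0.699, 0.563],
    "qx64-hi": [0.440, 0.593, 0.845, 0.440, 0.398, 0.694, 0.549],
    "mxfp4":   [0.409, 0.537, 0.830, 0.446, 0.384, 0.700, 0.551],
}

means = {quant: sum(vals) / len(vals) for quant, vals in scores.items()}

# Print quantizations from strongest to weakest mean score.
for quant, mean in sorted(means.items(), key=lambda kv: -kv[1]):
    print(f"{quant}: {mean:.3f}")
```

On these numbers qx86-hi leads, qx64-hi trails it slightly, and mxfp4 is last, which matches the perplexity ordering below the table.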
Perplexity (Jan-v3-4B-base-instruct)

| Quant | Perplexity |
|---|---|
| qx86-hi | 5.593 ± 0.046 |
| qx64-hi | 5.732 ± 0.047 |
| mxfp4 | 6.191 ± 0.052 |
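Lower perplexity is better: it is the exponential of the mean per-token negative log-likelihood, so qx86-hi's 5.593 means it is the least surprised by the evaluation text. A minimal sketch of the formula (the token losses here are illustration values, not from this model):

```python
import math

def perplexity(nlls):
    """Perplexity = exp(mean per-token negative log-likelihood)."""
    return math.exp(sum(nlls) / len(nlls))

# Illustration only: a model that assigns every token probability 1/2
# has per-token NLL of ln(2), giving a perplexity of 2.
print(perplexity([math.log(2)] * 100))  # ≈ 2.0
```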
-G
This model, `Jan-v3-4B-base-instruct-qx86-hi-mlx`, was converted to MLX format from `janhq/Jan-v3-4B-base-instruct` using mlx-lm version 0.30.4.

Use with mlx-lm:

```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("Jan-v3-4B-base-instruct-qx86-hi-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template when one is defined.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
Quantization: 8-bit

Base model: Qwen/Qwen3-4B-Instruct-2507