Be Like Claude
This model is a 1.4/0.6 nuslerp merge of:
The Nemotron-Cascade base is prone to looping, mainly due to a lack of "social skills": adding Claude thinking traces without a supporting body of evidence made the Element merge very smart but unstable, even with Janus's help.
I used Blossom to add some "words" and stabilize inference. You can see how the arc numbers improved, while arc/easy dropped a bit (less OCD). Logic is back up, close to Janus levels.
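For reference, a 1.4/0.6 nuslerp merge of this kind could be expressed as a mergekit config along these lines. This is only a sketch: the model paths below are placeholders (the actual source repos aren't listed in this card), and the dtype is an assumption.

```yaml
# Hypothetical mergekit config sketch for a 1.4/0.6 nuslerp merge.
# Model paths are placeholders, NOT the actual source repos of this merge.
merge_method: nuslerp
models:
  - model: path/to/element-model   # placeholder for the primary (weight 1.4) model
    parameters:
      weight: 1.4
  - model: path/to/blossom-model   # placeholder for the stabilizing (weight 0.6) model
    parameters:
      weight: 0.6
dtype: bfloat16                    # assumption; pick the dtype of the source models
```

With nuslerp, the weights are normalized relative to each other, so 1.4/0.6 biases the interpolation toward the first model while keeping some of the second's behavior.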
Brainwaves

| Model    | arc   | arc/e | boolq | hswag | obkqa | piqa  | wino  |
|----------|-------|-------|-------|-------|-------|-------|-------|
| qx86-hi  | 0.538 | 0.732 | 0.860 | 0.720 | 0.414 | 0.783 | 0.646 |
| qx64-hi  | 0.526 | 0.720 | 0.862 | 0.709 | 0.426 | 0.782 | 0.659 |
| mxfp4    | 0.515 | 0.702 | 0.857 | 0.708 | 0.424 | 0.785 | 0.655 |
| Janus    | 0.537 | 0.731 | 0.862 | 0.697 | 0.446 | 0.782 | 0.667 |
| Blossom  | 0.516 | 0.706 | 0.857 | 0.662 | 0.424 | 0.781 | 0.644 |
| Element  | 0.532 | 0.746 | 0.846 | 0.738 | 0.456 | 0.794 | 0.709 |
| Element2 | 0.538 | 0.732 | 0.860 | 0.720 | 0.414 | 0.783 | 0.646 |
Perplexity

| Model   | Perplexity    |
|---------|---------------|
| qx86-hi | 4.487 ± 0.032 |
| qx64-hi | 4.583 ± 0.034 |
| mxfp4   | 4.919 ± 0.036 |
-G
```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

# Load the quantized model and its tokenizer
model, tokenizer = load("JanusCoder-8B-Blossom-Nemotron-Claude-Opus-qx86-hi-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat template, if one is defined
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```