The notes I'd posted (which I guess can't be accessed now) were benchmark scores from testing this model with the same vllm command I used for the base Qwen3.6. I've been publishing benchmarks for models that fit on a DGX Spark (or, I guess, 128GB Strix Halo) at https://github.com/DanTup/spark-evals, but unfortunately this model didn't work well (I think because something is wrong with its tool calls).
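For context, this is roughly the shape of the serving setup involved; a hypothetical sketch, not the exact command used for the runs (the model name and flag values here are placeholders). Tool calling in vLLM's OpenAI-compatible server has to be enabled explicitly, and the parser has to match the model's chat template, which is often where fine-tunes go wrong:

```shell
# Sketch only: serve a fine-tune with vLLM's OpenAI-compatible server.
# --tool-call-parser must match the chat template the model was trained
# with (Qwen-family models typically use the hermes-style format); a
# fine-tune that drifted from that template will emit malformed tool calls.
vllm serve Jackrong/Qwen3.5-9B-DeepSeek-V4-Flash \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --max-model-len 32768
```

If a fine-tune fails only on tool-call benchmarks, trying the base model's parser setting (or inspecting the raw completions for broken tool-call markup) is usually the quickest way to confirm a template mismatch.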
If whatever the issue is can be fixed, I'll re-run them and update (and if there are any other fine-tuned versions of models that fit on a Spark, I'd also be interested in trying them; feel free to open issues in the repo if anyone has suggestions).
Same here: all the Jackrong models I've tested (Qwen3.5-9B-DeepSeek-V4-Flash, Qwen3.5-9B-GLM5.1-Distill-v1, Qwopus3.5-9B-v3.5, and this one) improve on Stevibe's BenchLocal benchmark suite, except on the bench packs that involve tool calling (CLI-40, Hermes-20, and StructuredOutput-15). No idea why.
Qwen3.5-9B-DeepSeek-V4-Flash and Qwopus3.5-9B-v3.5 are both improvements over the base model. The Qwen3.6 models have much more post-training done to them, so even small amounts of additional training can break them.