Qwen3.5-9B R-Series Distillation
Qwen3.5-9B LoRA SFT distillation: R7 (86.8% eval) + R8 calibration. Includes datasets, FP16 checkpoints, and pipeline docs (6 items).
⚠️ CRITICAL: Ollama Inference Flag Required
If you serve this model via Ollama with the qwen3.5 renderer, you MUST pass `"think": false` in the `/api/chat` request body for chat, instruction following, and tool use. Without this flag, 25-46% of requests will return empty answers due to renderer-injected `<think>` tags. See `cudabenchmarktest/r9-research-framework/_OLLAMA_INFERENCE_WARNING.md` for full details.
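The flag above can be set per request. A minimal sketch of building such a request body in Python, assuming a standard local Ollama install at the default `localhost:11434` endpoint:

```python
import json

# Default local Ollama endpoint (assumption: a standard local install).
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an /api/chat request body with renderer thinking disabled."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        # The critical flag: without it, the qwen3.5 renderer injects
        # <think> tags and many responses come back empty.
        "think": False,
    }

body = build_chat_request("robit/qwen3.5-9b-r5-research:q4km", "Hello")
print(json.dumps(body, indent=2))
```

POST this body to the endpoint with any HTTP client; the flag applies only to the request it is sent with, so it must be included on every call.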
Fine-tuned Qwen3.5-9B with distilled reasoning. R5 was the first round to use production-quality data sources and achieved 84.2% on the diverse stochastic eval. It has been superseded by R7 (86.8%).
For Ollama: `ollama run robit/qwen3.5-9b-r5-research:q4km`
Supports `<think>` blocks and `tool_calls` via the Ollama API.

| Benchmark | Score |
|---|---|
| Diverse stochastic eval (38 tests, 9 categories) | 84.2% |
| Base qwen3.5:9b on same eval | 79.0% |
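For tool use via the Ollama API, the same `think` caveat applies. A minimal sketch of a tool-calling request body, assuming Ollama's standard function-tool schema; the `get_weather` tool here is purely illustrative, not part of this model:

```python
def build_tool_request(model: str, prompt: str) -> dict:
    """Build an /api/chat body advertising one function tool (illustrative)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "think": False,  # required flag for this model under the qwen3.5 renderer
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, for illustration only
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

body = build_tool_request("robit/qwen3.5-9b-r5-research:q4km", "Weather in Paris?")
```

If the model decides to call the tool, the response message carries a `tool_calls` array; execute the call yourself and send the result back as a `tool` role message.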
| Format | Link |
|---|---|
| Ollama (Q4_K_M) | robit/qwen3.5-9b-r5-research |
| Ollama Vision (Q4_K_M) | robit/qwen3.5-9b-r5-vision |
| Successor: R7 (FP16) | cudabenchmarktest/qwen3.5-9b-r7-research |
| Successor: R7 Vision (FP16) | cudabenchmarktest/qwen3.5-9b-r7-research-vision |
Apache 2.0 (inherited from Qwen3.5-9B). Training data licenses vary by source.