⚠️ CRITICAL: Ollama Inference Flag Required

If you serve this model via Ollama with the qwen3.5 renderer (the standard recommended setup), you MUST pass "think": false in the /api/chat request body for chat, instruction-following, and tool-use workloads.

curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{"model": "...", "think": false, "messages": [...], "stream": false}'

Without this flag, the renderer auto-injects <think> tags into every chat completion. On longer prompts the model can stay inside the <think> block past the response budget, never emit </think>, and produce zero answer tokens on 25-46% of requests.
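A quick client-side check for this failure mode: a completion that opened a <think> block but never closed it got stuck reasoning. A minimal sketch (the helper name is ours, not part of Ollama's API):

```python
def stuck_in_think(response_text: str) -> bool:
    """Return True when a completion opened a <think> block but never
    closed it, i.e. the reasoning ran past the response budget."""
    return "<think>" in response_text and "</think>" not in response_text

# A healthy thinking response closes its block before answering:
stuck_in_think("<think>plan...</think>The answer is 4.")  # False
# A runaway one never does:
stuck_in_think("<think>step 1... step 2... step 3")       # True
```

Retrying flagged requests with "think": false is a pragmatic mitigation when you cannot change the default server config.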

Set "think": true (or omit the field) only when you DO want chain-of-thought reasoning (math, planning, complex multi-step tasks). This is Qwen3's dual-mode operation; see https://qwenlm.github.io/blog/qwen3/.
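The same request can be issued from Python with only the standard library. A minimal sketch, assuming the default Ollama endpoint; the model tag in the usage comment is a placeholder, not this model's actual tag:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # default Ollama endpoint

def build_chat_body(model, messages, think=False, stream=False):
    """Build the /api/chat JSON body; think=False disables <think> injection."""
    return json.dumps({
        "model": model,
        "think": think,        # False = plain chat; True = chain-of-thought
        "messages": messages,
        "stream": stream,
    }).encode("utf-8")

def chat(model, messages, think=False):
    """POST a non-streaming chat request and return the assistant's reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_chat_body(model, messages, think=think),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Usage (requires a running Ollama server; model tag is a placeholder):
# print(chat("qwen3.5-9b-reasoning-distilled",
#            [{"role": "user", "content": "List three prime numbers."}]))
```

Setting think explicitly per request, rather than relying on the server default, keeps behavior predictable if the renderer's default ever changes.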

See _OLLAMA_INFERENCE_WARNING.md in the cudabenchmarktest/r9-research-framework dataset for the full explanation.


Qwen3.5-9B Reasoning Distilled GGUF (R3 Crown)

GGUF-quantized version of a fine-tuned Qwen3.5-9B with distilled Opus 4.6 reasoning traces. Early iteration (R3), superseded by R7 (86.8% on the diverse eval).

Training

Note

This early iteration had regressions in instruction following due to monoculture training data. See the training suite for the improved R5/R7 approach.

Successors

License

Apache 2.0 (inherited from Qwen3.5-9B).

GGUF
Model size: 10B params
Architecture: qwen35


Model tree for cudabenchmarktest/qwen3.5-9b-qwen3.6-reasoning-distilled-GGUF

Finetuned from base model Qwen/Qwen3.5-9B.

Dataset used to train cudabenchmarktest/qwen3.5-9b-qwen3.6-reasoning-distilled-GGUF