Qwen3.5-9B R-Series Distillation
Qwen3.5-9B LoRA SFT distillation: R7 (86.8% eval) + R8 calibration. Includes datasets, FP16 checkpoints, and pipeline docs (6 items).
⚠️ CRITICAL: Ollama Inference Flag Required
If you serve this model via Ollama with the qwen3.5 renderer, you MUST pass `"think": false` in the `/api/chat` request body for chat, instruction following, and tool use. Without this flag, 25-46% of requests will return empty answers due to renderer-injected `<think>` tags. See `cudabenchmarktest/r9-research-framework/_OLLAMA_INFERENCE_WARNING.md` for full details.
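The flag above can be set per request. A minimal sketch of building such a request body in Python, assuming a standard local Ollama install at the default `localhost:11434` endpoint:

```python
import json

# Default local Ollama endpoint (assumption: a standard local install).
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an /api/chat request body with renderer thinking disabled."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        # The critical flag: without it, the qwen3.5 renderer injects
        # <think> tags and many responses come back empty.
        "think": False,
    }

body = build_chat_request("robit/qwen3.5-9b-r5-research:q4km", "Hello")
print(json.dumps(body, indent=2))
```

POST this body to the endpoint with any HTTP client; the flag applies only to the request it is sent with, so it must be included on every call.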
Fine-tuned Qwen3.5-9B with distilled reasoning. R5 was the first round to use production-quality data sources and achieved 84.2% on the diverse stochastic eval. It has been superseded by R7 (86.8%).
For Ollama: `ollama run robit/qwen3.5-9b-r5-research:q4km`
Supports `<think>` blocks and `tool_calls` via the Ollama API.

| Benchmark | Score |
|---|---|
| Diverse stochastic eval (38 tests, 9 categories) | 84.2% |
| Base qwen3.5:9b on same eval | 79.0% |
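For tool use via the Ollama API, the same `think` caveat applies. A minimal sketch of a tool-calling request body, assuming Ollama's standard function-tool schema; the `get_weather` tool here is purely illustrative, not part of this model:

```python
def build_tool_request(model: str, prompt: str) -> dict:
    """Build an /api/chat body advertising one function tool (illustrative)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "think": False,  # required flag for this model under the qwen3.5 renderer
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool, for illustration only
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

body = build_tool_request("robit/qwen3.5-9b-r5-research:q4km", "Weather in Paris?")
```

If the model decides to call the tool, the response message carries a `tool_calls` array; execute the call yourself and send the result back as a `tool` role message.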
| Format | Link |
|---|---|
| Ollama (Q4_K_M) | robit/qwen3.5-9b-r5-research |
| Ollama Vision (Q4_K_M) | robit/qwen3.5-9b-r5-vision |
| Successor: R7 (FP16) | cudabenchmarktest/qwen3.5-9b-r7-research |
| Successor: R7 Vision (FP16) | cudabenchmarktest/qwen3.5-9b-r7-research-vision |
Apache 2.0 (inherited from Qwen3.5-9B). Training data licenses vary by source.