Qwen3.5-9B R-Series Distillation - a cudabenchmarktest Collection

updated 2 days ago

Qwen3.5-9B LoRA SFT distillation: R7 (86.8% eval) + R8 calibration. Datasets, FP16 checkpoints, and pipeline docs.

cudabenchmarktest/r7-additive-sft

Viewer • Updated 8 days ago • 4.04k • 44

Note: R7 SFT training data (4,043 samples) drawn from Stratos + Tulu3 + SlimOrca + PrimeIntellect. The R7 model trained on it scored 86.8% on the diverse eval.

cudabenchmarktest/r8-calibration-sft

Viewer • Updated 8 days ago • 7.64k • 53

Note: R8 SFT training data (7,635 samples), built from the R7 base plus 7 calibration layers. Teaches the model to answer 'I don't know' via R-Tuning.

cudabenchmarktest/qwen3.5-9b-r7-research

Text Generation • 9B • Updated 8 days ago • 474

Note: R7 text model, FP16 safetensors. 86.8% on the diverse eval. Covers thinking, tool use, and instruction following.

cudabenchmarktest/qwen3.5-9b-r7-research-vision

Image-Text-to-Text • 9B • Updated 8 days ago • 34

Note: R7 vision model, FP16 safetensors. Same reasoning as the R7 text model, with the vision tower preserved byte-for-byte from the base model.

cudabenchmarktest/qwen3.5-9b-r5-research-GGUF

Text Generation • 9B • Updated 8 days ago • 1.2k

Note: R5 GGUF (superseded by R7). First round with production-quality data. 84.2% on the diverse eval.

cudabenchmarktest/r9-research-framework

Viewer • Updated 8 days ago • 1.55k • 143

Note: R9 research framework, covering policy, harness, criticality map, HERETIC integration, and refusal policy + eval bucket 06. Pre-emptive release; the trained R9 model is pending the R8b eval.
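
As a rough illustration of how the two SFT datasets listed above relate, the sketch below records their repository IDs and stated sample counts and shows one way they might be loaded with the `datasets` library. The helper names (`added_by_calibration`, `load_round`) are hypothetical, the `train` split name is an assumption not stated in the collection, and the subtraction assumes R8 contains every R7 sample, which the notes suggest but do not confirm.

```python
# Repository IDs copied from the collection listing.
SFT_DATASETS = {
    "r7": "cudabenchmarktest/r7-additive-sft",
    "r8": "cudabenchmarktest/r8-calibration-sft",
}

# Sample counts as stated in the collection notes.
STATED_SAMPLES = {"r7": 4043, "r8": 7635}


def added_by_calibration(counts=STATED_SAMPLES):
    """Samples R8 adds on top of R7.

    Assumption: R8 is a strict superset of R7 ("R7 base + 7 calibration layers").
    """
    return counts["r8"] - counts["r7"]


def load_round(round_name, split="train"):
    """Load one round's SFT data from the Hub (requires network access).

    Assumption: the split is called "train"; check the dataset viewer first.
    """
    # Imported lazily so the rest of the sketch runs without the dependency.
    from datasets import load_dataset

    return load_dataset(SFT_DATASETS[round_name], split=split)


if __name__ == "__main__":
    # 7635 - 4043 = 3592 samples attributable to the calibration layers
    print(added_by_calibration())
```

With `datasets` installed and network access available, `load_round("r8")` would return the R8 data if the split assumption holds; comparing its length against the stated 7,635 is a quick sanity check.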