Conlang Morphology — QwQ Distillation

LoRA adapter for Qwen/Qwen2.5-1.5B-Instruct fine-tuned on conlang morphology via QwQ-32B Distillation.

Part of the Algorithmic SFT vs Distillation experiment studying whether deterministic algorithmic templates teach procedural reasoning more effectively than distillation from large reasoning models.

Training

Parameter	Value
Base model	Qwen/Qwen2.5-1.5B-Instruct
Method	QwQ-32B Distillation
Framework	LLaMA-Factory (SFT stage)
LoRA rank	64
LoRA target	all linear layers
Learning rate	1e-4
Epochs	3
Batch size	1 (grad accum 16)
Cutoff length	32,768 tokens
Training data	5,000 QwQ-32B reasoning traces (d7+d5, 3 samples/question, filtered). Teacher solve rate: 15.8-44.3%

Evaluation (v3, MAX_TOKENS=32768)

Split	Accuracy
Test (in-distribution)	40.4%
Harder variant	11.0%
Structural OOD	38.4%

Notes

Large gap vs algo SFT (40.4% vs 98.6%). Distillation traces don't transfer the rule composition procedure.

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
model = PeftModel.from_pretrained(base, "reasoning-degeneration-dev/algo-sft-conlang-morphology-distill-qwq")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")