KARAGE-KUN/Qwen3.5-0.8b-distill-4b


Qwen3.5-0.8B Coding Distilled

Item	Value
Student model	unsloth/Qwen3.5-0.8B
Teacher model	unsloth/Qwen3.5-4B
Data	CodeAlpaca-20k (12,000 samples)
Distillation method	KL divergence + cross-entropy (α=0.4, T=3.0)
Training time	2.54 h
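
The distillation method in the table combines a KL-divergence term against the teacher with a cross-entropy (CE) term against the ground-truth token. The card does not publish the training code, so the following is only a minimal pure-Python sketch of one common formulation for a single token position. Two points are assumptions rather than facts from the card: that α weights the KL term (with 1 − α on CE), and that the KL term is scaled by T². The function names `log_softmax` and `distillation_loss` are hypothetical.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]

def distillation_loss(student_logits, teacher_logits, target_index,
                      alpha=0.4, temperature=3.0):
    """Weighted sum of a softened KL term and a hard-label CE term.

    ASSUMPTION: alpha weights the KL term and (1 - alpha) weights CE;
    the card only states alpha=0.4 and T=3.0, not which term each applies to.
    ASSUMPTION: the KL term is multiplied by T^2, a common convention that
    keeps gradient magnitudes comparable across temperatures.
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL(teacher || student) over the temperature-softened distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Standard cross-entropy of the student (T = 1) against the true token.
    ce = -log_softmax(student_logits)[target_index]

    return alpha * (temperature ** 2) * kl + (1 - alpha) * ce

if __name__ == "__main__":
    # Toy logits over a 3-token vocabulary; the true token is index 0.
    student = [1.0, 0.5, -0.5]
    teacher = [2.0, 0.0, -1.0]
    print(distillation_loss(student, teacher, target_index=0))
```

When the student and teacher logits are identical, the KL term vanishes and the loss reduces to (1 − α) times the cross-entropy. In an actual training loop the same quantity would be averaged over every token position in the batch.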

Downloads last month: 14

Safetensors
Model size: 0.9B params
Tensor type: F32 · F16


Model tree for KARAGE-KUN/Qwen3.5-0.8b-distill-4b

Base model: Qwen/Qwen3.5-0.8B-Base
Finetuned: Qwen/Qwen3.5-0.8B
Finetuned (159): this model