Qwen2.5-7B-Instruct: SDFT on Tool Use (Step 1000, Best)

This is the best checkpoint, selected by greedy accuracy, from a reproduction of SDFT (Self-Distillation Fine-Tuning), the method described in "Self-Distillation Enables Continual Learning".

Results

| Metric | Base | This Model | Paper |
|---|---|---|---|
| Greedy Accuracy | 54.4% | 64.7% | 70.6% |
| Pass@1 | 52.6% | 56.2% | - |
| Pass@5 | 61.5% | 70.1% | - |
| Pass@10 | 64.4% | 74.4% | - |
| Pass@50 | 70.6% | 79.4% | - |
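
The card does not say how the Pass@k numbers were computed. A common choice is the unbiased estimator introduced with the HumanEval benchmark, which draws n samples per problem, counts the c correct ones, and averages 1 - C(n-c, k) / C(n, k) over problems. The sketch below assumes that estimator; the function names are hypothetical and are not taken from the original evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: 1 - C(n - c, k) / C(n, k).

    n: samples drawn for the problem, c: samples judged correct, k: budget.
    Assumption: this is the estimator used; the card does not confirm it.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate becomes 1.0
    # whenever every size-k subset must contain a correct sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(correct_counts: list[int], n: int, k: int) -> float:
    """Average the per-problem estimate over a test set."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)
```

With this estimator, Pass@1 is the mean per-sample success rate under sampling, which is why it can sit below greedy accuracy, as it does in the table above.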

Training Details

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Method | On-policy Self-Distillation (SDFT) |
| Dataset | ToolAlpaca (4046 train, 68 test) |
| Learning rate | 1e-5 |
| Batch size | 32 |
| Epochs | 2 |
| EMA alpha | 0.01 |
| Step | 1000 (best of 1011) |
| Hardware | L40S 48GB |
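
The "EMA alpha" setting suggests that an exponential moving average is maintained during training, most plausibly over the weights of the teacher used for self-distillation. The card does not give the update rule, so the following is only a sketch under that assumption, written on plain Python lists; `ema_update` is a hypothetical helper, and the convention that alpha weights the new (student) value is also an assumption.

```python
def ema_update(teacher: list[float], student: list[float],
               alpha: float = 0.01) -> list[float]:
    """Return teacher weights moved a fraction alpha toward the student.

    Assumed rule: teacher <- (1 - alpha) * teacher + alpha * student.
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same length")
    return [(1.0 - alpha) * t + alpha * s for t, s in zip(teacher, student)]


# Toy example: the teacher drifts slowly toward a fixed student.
teacher = [1.0, -2.0]
student = [0.0, 0.0]
for _ in range(3):
    teacher = ema_update(teacher, student, alpha=0.01)
```

Under this rule, a small alpha such as 0.01 keeps the teacher close to its earlier state, so it changes much more slowly than the model being trained.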

All Checkpoints

| Step | Greedy Acc |
|---|---|
| 100 | 55.9% |
| 200 | 48.5% |
| 300 | 44.1% |
| 400 | 47.1% |
| 500 | 57.4% |
| 600 | 47.1% |
| 700 | 54.4% |
| 800 | 52.9% |
| 900 | 57.4% |
| 1000 | 64.7% |
| 1011 | 57.4% |
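
To make the checkpoint comparison concrete, the short sketch below picks the best step from the table and converts each accuracy into a count of correct test examples. It assumes greedy accuracy was measured on the 68-example ToolAlpaca test split listed under Training Details; the variable and function names are hypothetical.

```python
TEST_SET_SIZE = 68  # ToolAlpaca test split size from Training Details

# Greedy accuracy (%) per checkpoint step, copied from the table.
greedy_acc = {
    100: 55.9, 200: 48.5, 300: 44.1, 400: 47.1, 500: 57.4, 600: 47.1,
    700: 54.4, 800: 52.9, 900: 57.4, 1000: 64.7, 1011: 57.4,
}


def correct_count(acc_percent: float, n: int = TEST_SET_SIZE) -> int:
    """Recover the number of correct examples from a rounded percentage."""
    return round(acc_percent / 100 * n)


# Step with the highest greedy accuracy.
best_step = max(greedy_acc, key=greedy_acc.get)
counts = {step: correct_count(acc) for step, acc in greedy_acc.items()}

print(best_step, counts[best_step])
```

With 68 test examples, one example is worth roughly 1.5 percentage points, so a gap of a few points between neighboring checkpoints corresponds to only a handful of examples.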
