Qwen2.5-7B-Instruct — SFT on Tool Use (lr=5e-5, bs=32, 1 epoch)

SFT baseline for reproducing "Self-Distillation Enables Continual Learning".

Training Details

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Method | Supervised Fine-Tuning (SFT) |
| Dataset | ToolAlpaca (4,046 train / 68 test) |
| Learning rate | 5e-5 |
| Batch size | 32 (via gradient accumulation) |
| Epochs | 1 |
| Seed | 42 |
| DeepSpeed | ZeRO-2 + CPU offload |
| Hardware | NVIDIA L40S (48 GB) |
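The hyperparameters above can be sketched as a plain config. This is a minimal illustration, not the actual training script; the card only states an effective batch size of 32, so the per-device batch / gradient-accumulation split below is an assumption:

```python
# Hypothetical hyperparameter sketch for the SFT run described above.
# Only the effective batch size of 32 is stated in the card; the
# per-device / accumulation split is assumed for illustration.
TRAIN_CONFIG = {
    "model_name": "Qwen/Qwen2.5-7B-Instruct",
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 4,   # assumed
    "gradient_accumulation_steps": 8,   # assumed; 4 * 8 = 32 effective
    "num_train_epochs": 1,
    "seed": 42,
}

def effective_batch_size(cfg: dict) -> int:
    """Effective batch size = per-device batch * gradient accumulation steps."""
    return cfg["per_device_train_batch_size"] * cfg["gradient_accumulation_steps"]

def steps_per_epoch(num_examples: int, eff_bs: int) -> int:
    """Optimizer steps per epoch, rounding the last partial batch up."""
    return -(-num_examples // eff_bs)  # ceil division

# With 4,046 training examples and an effective batch of 32,
# one epoch is ceil(4046 / 32) = 127 optimizer steps.
```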

Evaluation

Not yet evaluated; greedy-accuracy and pass@k results are pending.

For reference, the paper's SFT baseline reaches 63.2% greedy accuracy on tool use.
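When the pending evaluation is run, pass@k is conventionally computed with the unbiased estimator 1 - C(n-c, k)/C(n, k) over n sampled completions of which c are correct (the estimator is standard; its use for this card's evaluation is my assumption). A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    where n = samples drawn per task and c = correct samples."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

def greedy_accuracy(num_correct: int, num_tasks: int) -> float:
    """Greedy accuracy is the single-sample case: fraction of tasks solved."""
    return num_correct / num_tasks
```

For example, `pass_at_k(n=4, c=2, k=2)` gives 1 - C(2,2)/C(4,2) = 5/6, and `greedy_accuracy` over the 68-example ToolAlpaca test split yields the kind of percentage the paper reports.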

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Ayushnangia/qwen2.5-7b-instruct-sft-tooluse-lr5e-5-bs32-ep1")
tokenizer = AutoTokenizer.from_pretrained("Ayushnangia/qwen2.5-7b-instruct-sft-tooluse-lr5e-5-bs32-ep1")
```
