# Qwen2.5-1.5B-GSM8K-SFT

A fine-tune of Qwen/Qwen2.5-1.5B on GSM8K for step-by-step math reasoning, trained to close every response with a structured final-answer line.

## Training Details

| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-1.5B |
| Method | SFT + LoRA |
| LoRA r / alpha | 32 / 16 |
| LoRA targets | all-linear |
| Epochs | 1 |
| Learning rate | 2e-4 |
| Batch size | 8 × 4 (grad accum) |
| Training examples | 1024 |
| Precision | bf16 |
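
The training script is not published; below is a minimal sketch of an SFT + LoRA setup matching the table above, using `trl` and `peft`. The GSM8K formatting, output directory, and all API choices here are assumptions, not the released recipe.

```python
# Sketch of the SFT + LoRA setup described above (assumed, not the released
# training script). Chat templating with the system prompt below is omitted.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("openai/gsm8k", "main", split="train[:1024]")  # 1024 examples

def to_text(example):
    # GSM8K answers end with "#### <number>"; rewrite that into the
    # "The answer is: {number}." format this model is trained to emit.
    reasoning, _, number = example["answer"].rpartition("####")
    return {"text": f"{example['question']}\n{reasoning.strip()}\nThe answer is: {number.strip()}."}

dataset = dataset.map(to_text)

peft_config = LoraConfig(
    r=32,
    lora_alpha=16,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

args = SFTConfig(
    output_dir="qwen2.5-1.5b-gsm8k-sft",
    num_train_epochs=1,
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,  # effective batch size 8 × 4 = 32
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B",
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```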

## Answer Format

The model is trained to end responses with:

`The answer is: {number}.`
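
This fixed suffix makes completions easy to score programmatically. For example (an illustrative helper, not part of the release):

```python
import re

def extract_answer(completion: str) -> str | None:
    """Extract the number from a trailing 'The answer is: {number}.' line."""
    match = re.search(r"The answer is:\s*(-?[\d,]*\.?\d+)\.?\s*$", completion.strip())
    return match.group(1).replace(",", "") if match else None

print(extract_answer("48 + 24 = 72 clips.\nThe answer is: 72."))  # -> 72
```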

## System Prompt

> You are a helpful math assistant. Solve the problem step by step, then give your final answer as a single number on the last line in exact format
> The answer is: {number}.
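
A minimal inference sketch with `transformers` and `peft`, loading the LoRA adapter on top of the base model. The generation settings and tokenizer source are assumptions, not published defaults.

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "tripathysagar/Qwen2.5-1.5B-GSM8K-SFT"
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Tokenizer taken from the base model in case the adapter repo ships none.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B")

system_prompt = (
    "You are a helpful math assistant. Solve the problem step by step, "
    "then give your final answer as a single number on the last line in exact format\n"
    "The answer is: {number}."
)
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Natalia sold clips to 48 of her friends in April, "
                                "and then she sold half as many clips in May. "
                                "How many clips did Natalia sell altogether?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```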