# Qwen2.5-1.5B-GSM8K-SFT

A supervised fine-tune of Qwen/Qwen2.5-1.5B on GSM8K for grade-school math reasoning, trained to close every response with a structured final-answer line.

## Training Details
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-1.5B |
| Method | SFT + LoRA |
| LoRA r / alpha | 32 / 16 |
| LoRA targets | all-linear |
| Epochs | 1 |
| Learning rate | 0.0002 |
| Batch size | 8 per device × 4 grad-accum steps (effective 32) |
| Training examples | 1024 |
| Precision | bf16 |
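The batch-size and dataset figures in the table imply the schedule below; a minimal sketch to check the arithmetic (variable names are illustrative, not taken from the training script, and a single GPU is assumed):

```python
# Hyperparameters from the table above.
per_device_batch = 8
grad_accum_steps = 4
train_examples = 1024
epochs = 1

# Effective batch size seen by each optimizer step.
effective_batch = per_device_batch * grad_accum_steps  # 8 * 4 = 32

# Optimizer steps per epoch, assuming no dropped remainder.
steps_per_epoch = train_examples // effective_batch    # 1024 // 32 = 32
total_steps = steps_per_epoch * epochs

print(effective_batch, total_steps)  # 32 32
```

So one epoch over the 1024 examples amounts to only 32 optimizer updates.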
## Answer Format

The model is trained to end responses with:

```
The answer is: {number}.
```
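Because every completion ends with this fixed sentence, the numeric answer can be recovered with a simple regex. A minimal sketch, assuming the model follows the trained format (the helper name and sample text are illustrative):

```python
import re

def extract_answer(text: str):
    """Pull the number out of a trailing 'The answer is: {number}.' line."""
    match = re.search(r"The answer is:\s*(-?\d+(?:\.\d+)?)", text)
    return match.group(1) if match else None

# Illustrative completion; returns the string "14".
print(extract_answer("so 18 - 4 = 14.\nThe answer is: 14."))
```

Returning `None` when the pattern is absent makes it easy to count format violations during evaluation.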
## System Prompt

```
You are a helpful math assistant. Solve the problem step by step, then give your final answer as a single number on the last line in exact format
The answer is: {number}.
```
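At inference time, this system prompt should be paired with the user's problem in the chat structure that Qwen-style chat templates expect (e.g. via `tokenizer.apply_chat_template`). A minimal sketch; the helper name and sample question are illustrative:

```python
SYSTEM_PROMPT = (
    "You are a helpful math assistant. Solve the problem step by step, "
    "then give your final answer as a single number on the last line "
    "in exact format\nThe answer is: {number}."
)

def build_messages(question: str):
    """Assemble the chat turns to pass to tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

messages = build_messages("A farmer has 12 hens that each lay 3 eggs a day. "
                          "How many eggs does he collect in a week?")
print([m["role"] for m in messages])  # ['system', 'user']
```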