Llama-3.2-3B-TUTOR-gsm8k

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct, trained on the openai/gsm8k dataset to strengthen mathematical reasoning and step-by-step problem solving.

The model was fine-tuned using Parameter-Efficient Fine-Tuning (PEFT) with LoRA in a 4-bit quantized environment (QLoRA) and later merged into the base model weights for easy deployment.

Model Details

  • Model developer: rajtembe13
  • Base model: meta-llama/Llama-3.2-3B-Instruct
  • Task: Text Generation (Mathematical Word Problems / Reasoning)
  • Language: English

Training Details

Dataset

The model was fine-tuned on the main subset of the GSM8K (Grade School Math 8K) dataset. The data was formatted as prompt-completion pairs to teach the model to reason through math problems before providing an answer.
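As a minimal sketch, a GSM8K record could be turned into a prompt-completion string like this. The template here is an assumption inferred from the inference prompt shown later in this card, and the sample answer text is illustrative:

```python
def format_example(example):
    # GSM8K records have "question" and "answer" fields; the answer
    # contains step-by-step reasoning ending in "#### <final number>".
    return f"Question :{example['question']}\nAnswer :{example['answer']}"

sample = {
    "question": "Natalia sold clips to 48 of her friends in April, and then "
                "she sold half as many clips in May. How many clips did "
                "Natalia sell altogether in April and May?",
    "answer": "In May she sold 48 / 2 = 24 clips. "
              "Altogether she sold 48 + 24 = 72 clips. #### 72",
}
print(format_example(sample))
```

Training on pairs in this shape teaches the model to emit the reasoning chain before the final answer marker.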

Training Procedure

The training was conducted using the SFTTrainer from the trl library, utilizing QLoRA. The base model was loaded in 4-bit (nf4) precision using bitsandbytes, and LoRA adapters were applied before training. After training, the LoRA adapters were merged back into the base model.
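The merge step described above can be sketched with PEFT's `merge_and_unload`; the adapter path and output directory below are placeholders, not the actual paths used:

```python
import torch
from peft import AutoPeftModelForCausalLM

# Load the base model with the trained LoRA adapters attached
# ("adapter_dir" is a placeholder for the saved adapter checkpoint).
model = AutoPeftModelForCausalLM.from_pretrained(
    "adapter_dir",
    torch_dtype=torch.bfloat16,
)

# Fold the adapter weights into the base weights and save a plain
# transformers checkpoint that needs no PEFT at inference time.
merged = model.merge_and_unload()
merged.save_pretrained("merged_model")
```

Merging trades a small amount of disk space for simpler deployment: the published checkpoint loads with `AutoModelForCausalLM` alone.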

LoRA Configuration

  • Rank (r): 16
  • Target Modules: ["q_proj", "v_proj"]
  • Task Type: CAUSAL_LM
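The QLoRA setup above can be sketched as follows. Only the rank, target modules, and task type are stated in this card; other values (such as `lora_alpha` and `lora_dropout`) are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit nf4 quantization via bitsandbytes, as described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters on the attention query and value projections.
lora_config = LoraConfig(
    r=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```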

Training Hyperparameters

  • Max steps: 4000
  • Learning rate: 3e-5
  • Warmup steps: 250
  • Train batch size (per device): 1
  • Gradient accumulation steps: 2
  • Optimizer: paged_adamw_8bit
  • Mixed precision: bf16
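The hyperparameters above map onto a `trl` `SFTConfig` roughly like this; `output_dir` is a placeholder, and the effective batch size is per-device batch size × gradient accumulation steps = 1 × 2 = 2:

```python
from trl import SFTConfig

sft_config = SFTConfig(
    output_dir="llama-3.2-3b-tutor-gsm8k",  # placeholder
    max_steps=4000,
    learning_rate=3e-5,
    warmup_steps=250,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,  # effective batch size = 1 * 2 = 2
    optim="paged_adamw_8bit",
    bf16=True,
)
```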

Frameworks

  • PEFT
  • TRL
  • Transformers
  • Datasets
  • BitsAndBytes
  • PyTorch
  • Accelerate

How to use

You can load and use this merged model directly via the transformers library:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "rajtembe13/Llama-3.2-3B-TUTOR-gsm8k"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example math problem
question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
prompt = f"Question :{question}\nAnswer :"

# Use model.device rather than hard-coding "cuda", so the snippet also
# works when device_map places the model elsewhere
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate output
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)