Llama-3.2-3B-TUTOR-gsm8k

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct, trained on the openai/gsm8k dataset to strengthen mathematical reasoning and step-by-step problem solving.

The model was fine-tuned using Parameter-Efficient Fine-Tuning (PEFT) with LoRA in a 4-bit quantized environment (QLoRA) and later merged into the base model weights for easy deployment.

Model Details

  • Model developer: rajtembe13
  • Base model: meta-llama/Llama-3.2-3B-Instruct
  • Task: Text Generation (Mathematical Word Problems / Reasoning)
  • Language: English

Training Details

Dataset

The model was fine-tuned on the main subset of the GSM8K (Grade School Math 8K) dataset. The data was formatted as prompt-completion pairs to teach the model to reason through math problems before providing an answer.
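As a minimal sketch, a GSM8K record could be turned into a prompt-completion string like this. The template here is an assumption inferred from the inference prompt shown later in this card, and the sample answer text is illustrative:

```python
def format_example(example):
    # GSM8K records have "question" and "answer" fields; the answer
    # contains step-by-step reasoning ending in "#### <final number>".
    return f"Question :{example['question']}\nAnswer :{example['answer']}"

sample = {
    "question": "Natalia sold clips to 48 of her friends in April, and then "
                "she sold half as many clips in May. How many clips did "
                "Natalia sell altogether in April and May?",
    "answer": "In May she sold 48 / 2 = 24 clips. "
              "Altogether she sold 48 + 24 = 72 clips. #### 72",
}
print(format_example(sample))
```

Training on pairs in this shape teaches the model to emit the reasoning chain before the final answer marker.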

Training Procedure

The training was conducted using the SFTTrainer from the trl library, utilizing QLoRA. The base model was loaded in 4-bit (nf4) precision using bitsandbytes, and LoRA adapters were applied before training. After training, the LoRA adapters were merged back into the base model.
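The merge step described above can be sketched with PEFT's `merge_and_unload`; the adapter path and output directory below are placeholders, not the actual paths used:

```python
import torch
from peft import AutoPeftModelForCausalLM

# Load the base model with the trained LoRA adapters attached
# ("adapter_dir" is a placeholder for the saved adapter checkpoint).
model = AutoPeftModelForCausalLM.from_pretrained(
    "adapter_dir",
    torch_dtype=torch.bfloat16,
)

# Fold the adapter weights into the base weights and save a plain
# transformers checkpoint that needs no PEFT at inference time.
merged = model.merge_and_unload()
merged.save_pretrained("merged_model")
```

Merging trades a small amount of disk space for simpler deployment: the published checkpoint loads with `AutoModelForCausalLM` alone.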

LoRA Configuration

  • Rank (r): 16
  • Target Modules: ["q_proj", "v_proj"]
  • Task Type: CAUSAL_LM
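The QLoRA setup above can be sketched as follows. Only the rank, target modules, and task type are stated in this card; other values (such as `lora_alpha` and `lora_dropout`) are illustrative assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit nf4 quantization via bitsandbytes, as described above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters on the attention query and value projections.
lora_config = LoraConfig(
    r=16,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```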

Training Hyperparameters

  • Max steps: 4000
  • Learning rate: 3e-5
  • Warmup steps: 250
  • Train batch size (per device): 1
  • Gradient accumulation steps: 2
  • Optimizer: paged_adamw_8bit
  • Mixed precision: bf16
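The hyperparameters above map onto a `trl` `SFTConfig` roughly like this; `output_dir` is a placeholder, and the effective batch size is per-device batch size × gradient accumulation steps = 1 × 2 = 2:

```python
from trl import SFTConfig

sft_config = SFTConfig(
    output_dir="llama-3.2-3b-tutor-gsm8k",  # placeholder
    max_steps=4000,
    learning_rate=3e-5,
    warmup_steps=250,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,  # effective batch size = 1 * 2 = 2
    optim="paged_adamw_8bit",
    bf16=True,
)
```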

Frameworks

  • PEFT
  • TRL
  • Transformers
  • Datasets
  • BitsAndBytes
  • PyTorch
  • Accelerate

How to use

You can load and use this merged model directly via the transformers library:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "rajtembe13/Llama-3.2-3B-TUTOR-gsm8k"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Example math problem
question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
prompt = f"Question :{question}\nAnswer :"

# Use model.device rather than hard-coding "cuda", so the snippet also
# works when device_map places the model elsewhere
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate output
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)