Llama-3.2-3B-TUTOR-gsm8k
This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct, trained on the openai/gsm8k dataset to improve mathematical reasoning and step-by-step problem solving.
The model was fine-tuned with Parameter-Efficient Fine-Tuning (PEFT) using LoRA in a 4-bit quantized setup (QLoRA); the adapters were later merged into the base model weights for easy deployment.
Model Details
- Model developer: rajtembe13
- Base model: meta-llama/Llama-3.2-3B-Instruct
- Task: Text Generation (Mathematical Word Problems / Reasoning)
- Language: English
Training Details
Dataset
The model was fine-tuned on the main subset of the GSM8K (Grade School Math 8K) dataset. The data was formatted as prompt-completion pairs to teach the model to reason through math problems before providing an answer.
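The prompt-completion formatting might look like the following sketch; the exact template string is an assumption, chosen to match the `Question :`/`Answer :` format used in the usage example further down:

```python
def format_example(question: str, answer: str) -> str:
    """Turn one GSM8K row into a single training string.

    The template is assumed to mirror the inference-time prompt,
    so the model learns to continue after "Answer :".
    """
    return f"Question :{question}\nAnswer :{answer}"

# With the datasets library this would typically be applied via map, e.g.:
# dataset = dataset.map(
#     lambda row: {"text": format_example(row["question"], row["answer"])}
# )

example = format_example("What is 2 + 3?", "2 + 3 = 5\n#### 5")
print(example)
```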
Training Procedure
The training was conducted using the SFTTrainer from the trl library, utilizing QLoRA. The base model was loaded in 4-bit (nf4) precision using bitsandbytes, and LoRA adapters were applied before training. After training, the LoRA adapters were merged back into the base model.
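The load-then-merge flow described above can be sketched roughly as follows. This is not the author's actual script: the adapter path, output directory, and the elided training call are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel, get_peft_model

base_id = "meta-llama/Llama-3.2-3B-Instruct"

# 1) Load the base model in 4-bit nf4 precision via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)

# 2) Attach LoRA adapters before training (settings from the card)
lora_config = LoraConfig(
    r=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)
model = get_peft_model(base, lora_config)

# ... SFTTrainer training happens here, then the adapter is saved ...

# 3) Reload the base in full precision and merge the trained adapters
full = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(full, "path/to/adapter").merge_and_unload()
merged.save_pretrained("path/to/merged-model")  # placeholder output directory
```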
LoRA Configuration
- Rank (r): 16
- Target modules: ["q_proj", "v_proj"]
- Task type: CAUSAL_LM
Training Hyperparameters
- Max steps: 4000
- Learning rate: 3e-5
- Warmup steps: 250
- Train batch size (per device): 1
- Gradient accumulation steps: 2
- Optimizer: paged_adamw_8bit
- Mixed precision: bf16
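The hyperparameters above can be expressed as a trl SFTConfig, sketched below. The output directory is a placeholder, any field not listed above is left at its default, and on older trl versions the same fields would go into transformers.TrainingArguments instead.

```python
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="llama-3.2-3b-tutor-gsm8k",  # placeholder
    max_steps=4000,
    learning_rate=3e-5,
    warmup_steps=250,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,  # effective batch size of 2 per device
    optim="paged_adamw_8bit",
    bf16=True,
)
```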
Framework Versions
- PEFT
- TRL
- Transformers
- Datasets
- BitsAndBytes
- PyTorch
- Accelerate
How to use
You can load and use this merged model directly via the transformers library:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "rajtembe13/Llama-3.2-3B-TUTOR-gsm8k"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example math problem
question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
prompt = f"Question :{question}\nAnswer :"

# Move inputs to the device the model was placed on (works on CPU or GPU)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate output
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```