ReframeBot-SFT-Llama3.1-8B

LoRA adapter for meta-llama/Meta-Llama-3.1-8B-Instruct, fine-tuned with Supervised Fine-Tuning (SFT) to support CBT-style Socratic questioning for university students under academic stress.

This is stage 1 of the ReframeBot training pipeline. The DPO adapter (ReframeBot-DPO-Llama3.1-8B) was initialised from this checkpoint.

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "Nhatminh1234/ReframeBot-SFT-Llama3.1-8B"

# Load the base model in 4-bit so it fits in limited VRAM
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")
# Attach the SFT LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(model, adapter)
model.eval()
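Prompts should be formatted with the tokenizer's chat template before generation. As a rough sketch of the Llama 3.1 Instruct chat format that `tokenizer.apply_chat_template` produces (the system prompt below is an illustrative assumption, not the one used in training):

```python
def build_llama3_prompt(messages):
    """Approximate the Llama 3.1 Instruct chat template as a plain string."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the reply
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

messages = [
    # Assumed system prompt, for illustration only
    {"role": "system", "content": "You are a supportive CBT-style coach for students."},
    {"role": "user", "content": "I have three finals next week and I can't focus."},
]
prompt = build_llama3_prompt(messages)
```

In practice, prefer `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")` followed by `model.generate(...)`; the manual string builder above only illustrates the underlying format.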

Training Details

| Hyperparameter | Value |
|---|---|
| Base model | meta-llama/Meta-Llama-3.1-8B-Instruct |
| LoRA rank (r) | 6 |
| LoRA alpha | 12 |
| LoRA dropout | 0.05 |
| Learning rate | 2e-4 |
| Optimizer | paged_adamw_8bit |
| Effective batch size | 8 (1 × grad_accum 8) |
| Epochs | 3 |
| Max sequence length | 384 |
| Quantization | 4-bit NF4, bfloat16 compute |
| Hardware | NVIDIA RTX 5070 (laptop, 8 GB VRAM) |
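For reproduction, the hyperparameters above map onto the usual `peft.LoraConfig` / trainer arguments. A minimal sketch as plain dicts (the `target_modules` list is an assumption; the card does not state which projections were adapted):

```python
# LoRA settings from the table above
lora_config = {
    "r": 6,
    "lora_alpha": 12,  # scaling factor = alpha / r = 2.0
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not stated on the card
}

# Training settings from the table above
train_config = {
    "learning_rate": 2e-4,
    "optim": "paged_adamw_8bit",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "num_train_epochs": 3,
    "max_seq_length": 384,
}

# Effective batch size = per-device batch × gradient accumulation steps
effective_batch = (train_config["per_device_train_batch_size"]
                   * train_config["gradient_accumulation_steps"])  # 1 × 8 = 8
```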

Dataset: 4,500 synthetic multi-turn dialogues generated with GPT-4, covering academic stress scenarios (exam anxiety, GPA pressure, deadline overwhelm, imposter syndrome, burnout). All conversations follow CBT Socratic questioning patterns.
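The exact dataset schema is not published. As a hypothetical illustration of a multi-turn record in the chat-message format commonly used for SFT, showing the Socratic questioning pattern (content invented for illustration, not taken from the dataset):

```python
# Hypothetical training record (invented for illustration; the real schema is not published)
example = {
    "messages": [
        {"role": "user",
         "content": "If I fail this exam my GPA is ruined and so is my career."},
        {"role": "assistant",
         "content": "That sounds heavy. What evidence do you have that one exam "
                    "would decide your whole career?"},
        {"role": "user",
         "content": "Well... none, really. But it feels that way."},
        {"role": "assistant",
         "content": "What would you say to a friend who told you the same thing?"},
    ]
}

roles = [m["role"] for m in example["messages"]]
```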

Intended Use

Designed as a component in the ReframeBot system — not a standalone mental-health tool. Must not be used for clinical intervention or crisis support without human oversight.

Project

GitHub: ReframeBot
