ReframeBot-SFT-Llama3.1-8B

LoRA adapter for meta-llama/Meta-Llama-3.1-8B-Instruct, fine-tuned with Supervised Fine-Tuning (SFT) to support CBT-style Socratic questioning for university students under academic stress.

This is stage 1 of the ReframeBot training pipeline. The DPO adapter (ReframeBot-DPO-Llama3.1-8B) was initialised from this checkpoint.

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "Nhatminh1234/ReframeBot-SFT-Llama3.1-8B"

# Load the base model in 4-bit so it fits in limited VRAM
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")
# Attach the SFT LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(model, adapter)
model.eval()
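Prompts should be formatted with the tokenizer's chat template before generation. As a rough sketch of the Llama 3.1 Instruct chat format that `tokenizer.apply_chat_template` produces (the system prompt below is an illustrative assumption, not the one used in training):

```python
def build_llama3_prompt(messages):
    """Approximate the Llama 3.1 Instruct chat template as a plain string."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the reply
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

messages = [
    # Assumed system prompt, for illustration only
    {"role": "system", "content": "You are a supportive CBT-style coach for students."},
    {"role": "user", "content": "I have three finals next week and I can't focus."},
]
prompt = build_llama3_prompt(messages)
```

In practice, prefer `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")` followed by `model.generate(...)`; the manual string builder above only illustrates the underlying format.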

Training Details

| Hyperparameter | Value |
|---|---|
| Base model | meta-llama/Meta-Llama-3.1-8B-Instruct |
| LoRA rank (r) | 6 |
| LoRA alpha | 12 |
| LoRA dropout | 0.05 |
| Learning rate | 2e-4 |
| Optimizer | paged_adamw_8bit |
| Effective batch size | 8 (1 × grad_accum 8) |
| Epochs | 3 |
| Max sequence length | 384 |
| Quantization | 4-bit NF4, bfloat16 compute |
| Hardware | NVIDIA RTX 5070 (laptop, 8 GB VRAM) |
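For reproduction, the hyperparameters above map onto the usual `peft.LoraConfig` / trainer arguments. A minimal sketch as plain dicts (the `target_modules` list is an assumption; the card does not state which projections were adapted):

```python
# LoRA settings from the table above
lora_config = {
    "r": 6,
    "lora_alpha": 12,  # scaling factor = alpha / r = 2.0
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not stated on the card
}

# Training settings from the table above
train_config = {
    "learning_rate": 2e-4,
    "optim": "paged_adamw_8bit",
    "per_device_train_batch_size": 1,
    "gradient_accumulation_steps": 8,
    "num_train_epochs": 3,
    "max_seq_length": 384,
}

# Effective batch size = per-device batch × gradient accumulation steps
effective_batch = (train_config["per_device_train_batch_size"]
                   * train_config["gradient_accumulation_steps"])  # 1 × 8 = 8
```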

Dataset: 4,500 synthetic multi-turn dialogues generated with GPT-4, covering academic stress scenarios (exam anxiety, GPA pressure, deadline overwhelm, imposter syndrome, burnout). All conversations follow CBT Socratic questioning patterns.
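The exact dataset schema is not published. As a hypothetical illustration of a multi-turn record in the chat-message format commonly used for SFT, showing the Socratic questioning pattern (content invented for illustration, not taken from the dataset):

```python
# Hypothetical training record (invented for illustration; the real schema is not published)
example = {
    "messages": [
        {"role": "user",
         "content": "If I fail this exam my GPA is ruined and so is my career."},
        {"role": "assistant",
         "content": "That sounds heavy. What evidence do you have that one exam "
                    "would decide your whole career?"},
        {"role": "user",
         "content": "Well... none, really. But it feels that way."},
        {"role": "assistant",
         "content": "What would you say to a friend who told you the same thing?"},
    ]
}

roles = [m["role"] for m in example["messages"]]
```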

Intended Use

Designed as a component in the ReframeBot system — not a standalone mental-health tool. Must not be used for clinical intervention or crisis support without human oversight.

Project

GitHub: ReframeBot
