# ReframeBot-SFT-Llama3.1-8B
A LoRA adapter for `meta-llama/Meta-Llama-3.1-8B-Instruct`, fine-tuned with
supervised fine-tuning (SFT) to support CBT-style Socratic questioning for
university students under academic stress.
This is stage 1 of the ReframeBot training pipeline. The DPO adapter (ReframeBot-DPO-Llama3.1-8B) was initialised from this checkpoint.
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

base = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter = "Nhatminh1234/ReframeBot-SFT-Llama3.1-8B"

# Load the base model in 4-bit NF4 to match the training configuration
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")

# Attach the SFT LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(model, adapter)
```
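After the adapter is attached, prompts should follow the Llama 3.1 chat format. A minimal sketch of that format, built by hand so it runs without downloading the gated tokenizer (the system-prompt wording is an illustrative assumption, not taken from this card):

```python
def build_llama31_prompt(system: str, user: str) -> str:
    """Assemble a single-turn Llama 3.1 chat prompt using the special tokens
    defined by Meta's prompt format (begin_of_text, header ids, eot_id)."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Hypothetical system prompt; the one used during SFT is not published here.
prompt = build_llama31_prompt(
    "You are ReframeBot, a CBT-style assistant that asks Socratic questions.",
    "I'm terrified I'll fail my final exam next week.",
)
```

In practice, `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")` produces the same structure from a list of role/content messages and is the preferred entry point for `model.generate`.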
## Training Details
| Hyperparameter | Value |
|---|---|
| Base model | meta-llama/Meta-Llama-3.1-8B-Instruct |
| LoRA rank (r) | 6 |
| LoRA alpha | 12 |
| LoRA dropout | 0.05 |
| Learning rate | 2e-4 |
| Optimizer | paged_adamw_8bit |
| Effective batch size | 8 (per-device batch 1 × gradient accumulation 8) |
| Epochs | 3 |
| Max sequence length | 384 |
| Quantization | 4-bit NF4, bfloat16 compute |
| Hardware | NVIDIA RTX 5070 (laptop, 8 GB VRAM) |
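With rank r = 6, the adapter is tiny, which is what makes training feasible on 8 GB of VRAM. A rough parameter count, assuming the LoRA targets are the four attention projections (the card does not list the target modules) and using Llama 3.1 8B's published dimensions:

```python
# Estimate LoRA trainable parameters: each targeted weight of shape
# (fan_in, fan_out) contributes r * (fan_in + fan_out) adapter parameters.
r = 6
hidden, layers = 4096, 32
kv_dim = 8 * 128  # 8 KV heads of dim 128 (grouped-query attention) -> 1024

# Assumed target modules; q/o are square, k/v are narrower due to GQA.
shapes = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
}
per_layer = sum(r * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
total = per_layer * layers
print(f"~{total:,} trainable parameters")  # ~5,111,808 (~0.06% of 8B)
```

Under these assumptions the adapter holds roughly 5 M trainable parameters; adding MLP projections to the target modules would raise this, but it stays far below the 8 B frozen base weights either way.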
**Dataset:** 4,500 synthetic multi-turn dialogues generated with GPT-4, covering academic stress scenarios (exam anxiety, GPA pressure, deadline overwhelm, imposter syndrome, burnout). All conversations follow CBT Socratic questioning patterns.
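The card does not publish a sample record, but a typical multi-turn SFT record for this kind of data might look like the hypothetical example below (the schema and dialogue text are illustrative assumptions, not taken from the actual dataset):

```python
import json

# Hypothetical single training record: one multi-turn dialogue in the
# common role/content messages schema, serialized as one JSONL line.
record = {
    "scenario": "exam anxiety",
    "messages": [
        {"role": "user",
         "content": "If I fail this exam, my whole degree is ruined."},
        {"role": "assistant",
         "content": "That sounds overwhelming. What evidence do you have "
                    "that one exam would ruin the entire degree?"},
        {"role": "user",
         "content": "Well... I could retake it, but that would look bad."},
        {"role": "assistant",
         "content": "So a retake is possible. What would you say to a friend "
                    "who told you a single retake 'looks bad'?"},
    ],
}
line = json.dumps(record)  # one dialogue per line in a JSONL file
```

Note the assistant turns answer with guided questions rather than direct advice, which is the CBT Socratic-questioning pattern the SFT stage is meant to instill.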
## Intended Use
Designed as a component in the ReframeBot system — not a standalone mental-health tool. Must not be used for clinical intervention or crisis support without human oversight.
## Project
GitHub: ReframeBot