LLaMA 3.2-3B English Therapy - GGUF Q4_K_M

A Q4_K_M-quantized GGUF build of Llama-3.2-3B-Instruct, fine-tuned for empathetic, therapy-style conversations.

Model Details

  • Base Model: meta-llama/Llama-3.2-3B-Instruct
  • Fine-tuned on: 1,390 English therapy conversations
  • Quantization: Q4_K_M (4-bit)
  • File Size: ~1.8 GB (down from ~6 GB before quantization)
  • Format: GGUF (compatible with llama.cpp)
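
The size figures above imply an average storage cost per weight, which is a quick sanity check on the quantization level. A minimal sketch, assuming roughly 3 billion parameters (taken from the model name) and decimal gigabytes; the function name is illustrative and not part of any library:

```python
def bits_per_weight(file_size_bytes: float, n_params: float) -> float:
    """Average number of bits stored per parameter."""
    return file_size_bytes * 8 / n_params

N_PARAMS = 3e9  # assumption: "3B" taken at face value

quantized_bpw = bits_per_weight(1.8e9, N_PARAMS)  # ~1.8 GB GGUF file
original_bpw = bits_per_weight(6e9, N_PARAMS)     # ~6 GB before quantization

print(f"quantized: {quantized_bpw:.1f} bits/weight")
print(f"original:  {original_bpw:.1f} bits/weight")
print(f"compression ratio: {original_bpw / quantized_bpw:.1f}x")
```

With these rounded inputs the quantized file works out to a little under 5 bits per weight. That is above a flat 4 bits, which is expected: Q4_K_M stores per-block scaling data alongside the 4-bit values and keeps some tensors at higher precision.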

Performance Metrics

  • BLEU Score: 1.85
  • ROUGE-1: 0.29
  • ROUGE-L: 0.16
  • Perplexity: 15.97
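
The card does not say how these metrics were computed. For orientation, perplexity is conventionally the exponential of the mean negative log-likelihood per token, so a perplexity of 15.97 corresponds to an average loss of about 2.77 nats per token. A minimal sketch of that relationship, using made-up log-probabilities purely for illustration:

```python
import math

def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)

# Illustrative values only, not taken from this model's evaluation:
# every token is predicted with probability 0.5.
example_logprobs = [math.log(0.5)] * 4
print(perplexity(example_logprobs))

# Going the other way: the average loss implied by the reported perplexity.
implied_loss = math.log(15.97)
print(round(implied_loss, 2))
```

A model that assigns probability 0.5 to every token has a perplexity of about 2, so 15.97 can be read as the model being, on average, about as uncertain as a uniform choice among roughly 16 tokens.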

Usage

With llama.cpp

# Download the model
wget https://huggingface.co/pavishanth-sujeevan/llama-3.2-3b-english-therapy-GGUF/resolve/main/model.gguf

# Run inference
./llama-cli -m model.gguf -p "I'm feeling anxious about my future" -n 200

With llama-cpp-python

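from llama_cpp import Llama

# Load the quantized model from the local GGUF file
llm = Llama(
    model_path="model.gguf",
    n_ctx=2048,        # context window, in tokens
    n_threads=4,       # CPU threads used for inference
    n_gpu_layers=35    # layers to offload to the GPU; set to 0 for CPU-only
)

# Plain-text completion: the model continues the text after "Therapist:"
output = llm(
    "User: I'm feeling stressed about work.\nTherapist:",
    max_tokens=200,
    temperature=0.7,
    top_p=0.9
)

# The generated text is in the first (and only) choice
print(output["choices"][0]["text"])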
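
The example above sends a single turn. For a multi-turn conversation, the same `User:` / `Therapist:` layout can be extended by concatenating earlier turns. A minimal sketch, assuming the model was trained on this plain-text layout (the card does not state the training prompt format); `build_prompt` is a hypothetical helper, not part of llama-cpp-python:

```python
def build_prompt(history, user_message):
    """Format earlier (user, therapist) turns plus the new user message.

    history: list of (user_text, therapist_text) tuples, oldest first.
    """
    lines = []
    for user_text, therapist_text in history:
        lines.append(f"User: {user_text}")
        lines.append(f"Therapist: {therapist_text}")
    lines.append(f"User: {user_message}")
    lines.append("Therapist:")  # left open for the model to complete
    return "\n".join(lines)

# Hypothetical earlier exchange, for illustration only
history = [
    ("I'm feeling stressed about work.",
     "That sounds difficult. What has been weighing on you most?"),
]
prompt = build_prompt(history, "My deadlines keep piling up.")
print(prompt)
```

The resulting string can be passed as the first argument to `llm(...)`. Adding `stop=["User:"]` to that call asks llama-cpp-python to end generation before the model starts writing the next user turn itself.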

With Streamlit

import streamlit as st
from llama_cpp import Llama

# Cache the model so it is loaded once, not on every Streamlit rerun
@st.cache_resource
def load_model():
    return Llama(model_path="model.gguf", n_ctx=2048, n_gpu_layers=35)

llm = load_model()
user_input = st.text_input("How are you feeling?")

# Generate a reply only after the user has entered some text
if user_input:
    response = llm(f"User: {user_input}\nTherapist:", max_tokens=200)
    st.write(response["choices"][0]["text"])

Intended Use

This model is designed for:

  • Mental health chatbots
  • Therapy conversation simulation
  • Empathetic response generation
  • Research in AI-assisted counseling

Limitations

  • Not a replacement for professional therapy
  • May produce incorrect or inappropriate responses
  • Requires human oversight in production use
  • Limited to English language

Training Details

  • Training Method: LoRA fine-tuning
  • Epochs: 3
  • Learning Rate: 2e-4
  • Batch Size: 1 (gradient accumulation: 8)
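
These settings determine how many optimizer updates the run performs. A rough calculation, assuming all 1,390 conversations were used for training on a single device with no held-out split (the card does not say), and that each conversation counts as one training example:

```python
import math

num_examples = 1390          # from "Fine-tuned on" in Model Details
per_device_batch_size = 1
grad_accum_steps = 8
epochs = 3

# Gradients from 8 single-example batches are summed before each update
effective_batch_size = per_device_batch_size * grad_accum_steps
steps_per_epoch = math.ceil(num_examples / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(effective_batch_size, steps_per_epoch, total_steps)
```

Under those assumptions the effective batch size is 8, giving on the order of 520 optimizer steps in total. Training frameworks differ in how they handle the final partial accumulation window, so the exact count may be off by a few steps.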

License

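Declared as Apache 2.0. The base model, Llama 3.2, is released by Meta under the Llama 3.2 Community License rather than Apache 2.0, so that license's terms also apply to this derivative.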
