LLaMA 3.2-3B English Therapy - GGUF Q4_K_M

Quantized GGUF version of LLaMA 3.2-3B fine-tuned for empathetic therapy conversations.

Model Details

Base Model: meta-llama/Llama-3.2-3B-Instruct
Fine-tuned on: 1,390 English therapy conversations
Quantization: Q4_K_M (4-bit)
File Size: ~1.8 GB (down from ~6 GB before quantization)
Format: GGUF (compatible with llama.cpp)
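
The two size figures above imply an average storage cost per weight, which can be checked with simple arithmetic. The sketch below assumes a round 3e9 parameters (taken from the "3B" in the model name) and decimal gigabytes. Both are approximations, not values stated by the authors.

```python
def bits_per_weight(file_size_gb: float, n_params: float) -> float:
    """Average storage cost per parameter, in bits."""
    return file_size_gb * 1e9 * 8 / n_params


N_PARAMS = 3e9  # assumption: rounded from "3B"; the exact count is not given in the card

original_bpw = bits_per_weight(6.0, N_PARAMS)   # ~6 GB before quantization
quantized_bpw = bits_per_weight(1.8, N_PARAMS)  # ~1.8 GB Q4_K_M file
compression = 6.0 / 1.8

print(f"original:    {original_bpw:.1f} bits/weight")
print(f"quantized:   {quantized_bpw:.1f} bits/weight")
print(f"compression: {compression:.2f}x")
```

Under these assumptions the original works out to about 16 bits per weight and the quantized file to a little under 5. A figure somewhat above 4 is plausible for Q4_K_M, which stores per-block scales and keeps some tensors at higher precision.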

Performance Metrics

BLEU Score: 1.85
ROUGE-1: 0.29
ROUGE-L: 0.16
Perplexity: 15.97
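
The card does not say how these metrics were computed. As a rough guide to what the numbers mean, the sketch below uses the standard definitions of perplexity and ROUGE-1 F1. The whitespace tokenization, the sample log-probabilities, and the sample sentences are illustrative assumptions, not the authors' evaluation setup.

```python
import math
from collections import Counter


def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 using lowercase whitespace tokens (a simplification)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# A perplexity of 15.97 corresponds to this mean loss in nats per token.
mean_nll = math.log(15.97)
print(f"mean NLL for perplexity 15.97: {mean_nll:.2f}")

# Made-up inputs, only to show the call shapes.
print(perplexity([-2.0, -3.0, -2.5]))
print(rouge1_f1("that sounds really hard", "that sounds hard to carry"))
```

Real evaluations usually rely on an established scoring package and the model's own tokenizer, so scores from this sketch are not directly comparable to the figures above.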

Usage

With llama.cpp

# Download the model
wget https://huggingface.co/pavishanth-sujeevan/llama-3.2-3b-english-therapy-GGUF/resolve/main/model.gguf

# Run inference
./llama-cli -m model.gguf -p "I'm feeling anxious about my future" -n 200

With llama-cpp-python

from llama_cpp import Llama

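llm = Llama(
    model_path="model.gguf",  # path to the downloaded GGUF file
    n_ctx=2048,               # context window, in tokens
    n_threads=4,              # CPU threads used for inference
    n_gpu_layers=35           # layers to offload to the GPU; use 0 to run on CPU only
)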

output = llm(
    "User: I'm feeling stressed about work.\nTherapist:",
    max_tokens=200,
    temperature=0.7,
    top_p=0.9
)

print(output["choices"][0]["text"])
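
For multi-turn conversations, the same "User:" / "Therapist:" layout can be extended with earlier turns. The helper below is a minimal sketch. `build_prompt` and the sample history are hypothetical, and the layout is inferred from the example prompt above rather than from any documented training format.

```python
def build_prompt(history, user_message):
    """Format prior turns plus the new message in the User:/Therapist: style.

    history is a list of (user_text, therapist_text) pairs, oldest first.
    """
    lines = []
    for user_text, therapist_text in history:
        lines.append(f"User: {user_text.strip()}")
        lines.append(f"Therapist: {therapist_text.strip()}")
    lines.append(f"User: {user_message.strip()}")
    lines.append("Therapist:")  # left open for the model to complete
    return "\n".join(lines)


# Illustrative history; the therapist reply here is invented for the example.
history = [
    ("I'm feeling stressed about work.",
     "That sounds exhausting. What has been weighing on you most?"),
]
prompt = build_prompt(history, "My manager keeps adding deadlines.")
print(prompt)
```

The result can be passed to `llm(prompt, max_tokens=200, stop=["User:"])`. The `stop` argument asks llama-cpp-python to end generation when the model starts writing the next user turn. Keep the history short enough to fit within `n_ctx`.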

With Streamlit

import streamlit as st
from llama_cpp import Llama

# Cache the loaded model so Streamlit reuses it across reruns instead of reloading it
@st.cache_resource
def load_model():
    return Llama(model_path="model.gguf", n_ctx=2048, n_gpu_layers=35)

llm = load_model()
user_input = st.text_input("How are you feeling?")

if user_input:
    response = llm(f"User: {user_input}\nTherapist:", max_tokens=200)
    st.write(response["choices"][0]["text"])
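
Because the prompt is a plain completion, the model may keep writing past the therapist's reply and invent further turns. One way to handle this in the app is to trim the raw text before displaying it. `clean_reply` below is a hypothetical helper, sketched on the assumption that extra turns start on a new line with "User:" or "Therapist:".

```python
def clean_reply(raw_text, stop_markers=("\nUser:", "\nTherapist:")):
    """Keep only the first therapist turn from a raw completion."""
    cut = len(raw_text)
    for marker in stop_markers:
        idx = raw_text.find(marker)
        if idx != -1:
            cut = min(cut, idx)  # cut at the earliest marker found
    return raw_text[:cut].strip()


# Invented completion text, used only to demonstrate the trimming.
raw = " It makes sense that you feel that way.\nUser: Thanks.\nTherapist: Of course."
print(clean_reply(raw))
```

In the Streamlit example, this would wrap the displayed text: `st.write(clean_reply(response["choices"][0]["text"]))`.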

Intended Use

This model is designed for:

Mental health chatbots
Therapy conversation simulation
Empathetic response generation
Research in AI-assisted counseling

Limitations

Not a replacement for professional therapy
May produce incorrect or inappropriate responses
Requires human oversight in production use
Limited to English language

Training Details

Training Method: LoRA fine-tuning
Epochs: 3
Learning Rate: 2e-4
Batch Size: 1 (gradient accumulation: 8)
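
These settings determine the effective batch size and the approximate number of optimizer updates. The arithmetic below is a sketch that assumes a single training device and that all 1,390 conversations were used for training. Neither is stated in the card.

```python
import math

NUM_EXAMPLES = 1390      # conversations, from Model Details
PER_DEVICE_BATCH = 1     # Batch Size
GRAD_ACCUM_STEPS = 8     # gradient accumulation
EPOCHS = 3
NUM_DEVICES = 1          # assumption: single-GPU training

# Examples contributing to each optimizer update.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

# Optimizer updates per pass over the data, counting a final partial batch.
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
```

Under these assumptions the run performs roughly 520 updates with an effective batch of 8. The exact count depends on how the training framework treats the last partial accumulation window.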

License

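Declared as Apache 2.0. Note that the base model, meta-llama/Llama-3.2-3B-Instruct, is released under the Llama 3.2 Community License, not Apache 2.0, so this license is not inherited from it and the base model's terms should also be reviewed.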
