Qwen 2.5-3B Tamil Therapy - GGUF Q4_K_M

Quantized GGUF version of Qwen 2.5-3B fine-tuned for empathetic therapy conversations in Tamil.

Model Details

  • Base Model: Qwen/Qwen2.5-3B-Instruct
  • Fine-tuned on: Tamil therapy conversations
  • Quantization: Q4_K_M (4-bit)
  • File Size: ~1.8 GB (from ~6 GB original)
  • Format: GGUF (compatible with llama.cpp)
  • Language: Tamil (தமிழ்)
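The ~1.8 GB file size is consistent with Q4_K_M's mixed 4-/6-bit blocks, which average roughly 4.8 bits per weight. A quick sanity check (the parameter count and bits-per-weight figures below are approximations, not values from this card):

```python
# Rough size estimate for a Q4_K_M quantization of a ~3B-parameter model.
# Q4_K_M mixes 4-bit and 6-bit blocks; ~4.8 bits/weight is a common average.
params = 3.1e9          # approximate parameter count (assumption)
bits_per_weight = 4.8   # approximate Q4_K_M average (assumption)

size_gb = params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB
print(f"~{size_gb:.1f} GB")  # → ~1.9 GB, in line with the ~1.8 GB file
```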

Usage

With llama-cpp-python

from llama_cpp import Llama

# Load the quantized model; n_gpu_layers offloads layers to the GPU
# (set it to 0 for CPU-only inference).
llm = Llama(
    model_path="model.gguf",
    n_ctx=2048,       # context window in tokens
    n_threads=4,      # CPU threads for inference
    n_gpu_layers=35
)

# Prompt follows the User:/Therapist: format used during fine-tuning.
# The Tamil input means "I am feeling stressed."
output = llm(
    "User: எனக்கு மன அழுத்தம் உள்ளது\nTherapist:",
    max_tokens=200,
    temperature=0.7,
    top_p=0.9
)

print(output["choices"][0]["text"])
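For multi-turn conversations, the same User:/Therapist: format can be extended one turn at a time. A minimal sketch (the `build_prompt` helper and its turn structure are illustrative, not part of this model's API):

```python
# Build a multi-turn prompt in the User:/Therapist: format shown above.
# Sketch only: adjust the speaker labels if your fine-tune used different ones.
def build_prompt(turns):
    """turns: list of (speaker, text) pairs, speaker in {"User", "Therapist"}."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append("Therapist:")  # trailing cue so the model replies as the therapist
    return "\n".join(lines)

prompt = build_prompt([
    ("User", "எனக்கு மன அழுத்தம் உள்ளது"),
    ("Therapist", "அதைப் பற்றி மேலும் சொல்லுங்கள்"),
    ("User", "வேலை காரணமாக தூக்கம் வருவதில்லை"),
])
# Pass `prompt` to llm(...) exactly as in the single-turn example above.
```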

Intended Use

This model is designed for:

  • Tamil-language mental health chatbots
  • Therapy conversation simulation
  • Empathetic response generation in Tamil
  • Research in multilingual AI-assisted counseling

Limitations

  • Not a replacement for professional therapy
  • May produce incorrect or inappropriate responses
  • Requires human oversight in production use
  • Limited to Tamil language

License

Apache 2.0 (inherits from Qwen 2.5 base model)

Model Tree

  • Base model: Qwen/Qwen2.5-3B
  • This model: E-motionAssistant/qwen-tamil-therapy-GGUF (quantized)