# Fine-Tuned Llama-3.2-3B: Psychology
This is a Llama model fine-tuned on a psychology textbook dataset. It is designed to provide accurate, contextually relevant responses to psychology-related queries.
## Model Details

### Model Description
- Base Model: meta-llama/Llama-3.2-3B-Instruct
- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Config: Rank=32, Alpha=32, Dropout=0.1
- Dataset: burgerbee/psychology_textbook
- Task: Causal Language Modeling / Instruction Following
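LoRA keeps the base weights frozen and learns a low-rank update `ΔW = (alpha / r) · B · A`; with Rank=32 and Alpha=32 as above, the scaling factor `alpha / r` is 1. A minimal plain-Python sketch of the arithmetic (toy 2×2 weight and rank 1, purely for illustration — the real adapter uses rank 32 inside each attention layer):

```python
# Minimal illustration of a LoRA update: W_eff = W + (alpha / r) * (B @ A).
# Shapes: W is (d_out, d_in); A is (r, d_in); B is (d_out, r).

def matmul(X, Y):
    """Plain-Python matrix multiply for small illustrative matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, rank, alpha):
    scale = alpha / rank
    delta = matmul(B, A)  # low-rank update, shape (d_out, d_in)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy example: d_out = d_in = 2, rank = 1 (this model uses rank 32).
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight
A = [[0.5, 0.5]]               # (r, d_in)
B = [[2.0], [0.0]]             # (d_out, r)

W_eff = lora_effective_weight(W, A, B, rank=1, alpha=1)
print(W_eff)  # [[2.0, 1.0], [0.0, 1.0]]
```

At inference time `peft` either applies this update on the fly or merges it into the base weights, which is why the adapter download is small relative to the 3B base model.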
## Direct Use
This model is intended for direct use in generating psychology-related content, answering questions, and providing educational insights.
## Downstream Use
The model can be fine-tuned further for specific psychology subdomains or integrated into educational tools and mental health applications.
## Out-of-Scope Use
This model is not intended for clinical diagnosis or treatment recommendations. It should not be used as a substitute for professional mental health advice.
## Bias, Risks, and Limitations
The model may inherit biases present in the training data. It is recommended to evaluate its outputs critically, especially in sensitive contexts.
### Recommendations
Users should be aware of the model's limitations and verify its outputs with authoritative sources when necessary.
## How to Get Started with the Model
Use the following code to load and use the model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer
from peft import PeftModel
import torch
import warnings
import logging
from threading import Thread

# Suppress warnings and progress-bar noise from transformers, peft, and bitsandbytes
warnings.filterwarnings("ignore")
from transformers.utils import logging as hf_logging
hf_logging.set_verbosity_error()
logging.getLogger("peft").setLevel(logging.ERROR)
logging.getLogger("bitsandbytes").setLevel(logging.ERROR)

# Load the base model (float16 is intended for GPU; prefer torch.float32 on CPU)
device = "cpu"
model_name = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map=device,
    torch_dtype=torch.float16,
)

# Load the fine-tuned LoRA weights on top of the base model
lora_model_name = "samuelchazy/fine-tuned-llama-3.2-3B-mlx-psychology"
model = PeftModel.from_pretrained(model, lora_model_name)

# Load the tokenizer and ensure padding and eos tokens are set
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = "left"  # decoder-only models are padded on the left
print("Model loaded successfully!")

# Define and tokenize the prompt
prompt = "What is the role of serotonin in mood regulation?"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Initialize the streamer so tokens can be printed as they are generated
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# Launch generation in a separate thread; the main thread consumes the stream
thread = Thread(target=model.generate, kwargs={
    "input_ids": inputs.input_ids,
    "attention_mask": inputs.attention_mask,
    "max_length": 768,
    "streamer": streamer,
    "do_sample": True,
    "temperature": 0.5,
})
thread.start()

# Read and stream the output token by token
print("Generating response...\n")
for token in streamer:
    print(token, end="", flush=True)
thread.join()
```
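The streaming pattern above — generation running in a background thread while the main thread consumes tokens — is a producer/consumer queue underneath. A stdlib-only sketch of the same idea, with a hypothetical `fake_generate` standing in for `model.generate` and a minimal queue-backed streamer standing in for `TextIteratorStreamer`:

```python
import queue
import threading

class SimpleStreamer:
    """Minimal analogue of TextIteratorStreamer: a blocking queue you can iterate."""
    def __init__(self):
        self._queue = queue.Queue()

    def put(self, item):
        self._queue.put(item)

    def __iter__(self):
        while True:
            item = self._queue.get()  # blocks until the producer pushes a token
            if item is None:          # sentinel: end of stream
                break
            yield item

def fake_generate(streamer, tokens):
    """Stands in for model.generate: pushes tokens, then signals completion."""
    for t in tokens:
        streamer.put(t)
    streamer.put(None)

streamer = SimpleStreamer()
thread = threading.Thread(
    target=fake_generate,
    args=(streamer, ["Serotonin ", "modulates ", "mood."]),
)
thread.start()

pieces = [tok for tok in streamer]  # main thread blocks until each token arrives
thread.join()
print("".join(pieces))  # Serotonin modulates mood.
```

This is why `model.generate` must run in its own thread: iterating the streamer blocks the main thread, and the two sides hand tokens across the queue as they become available.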