distilbert-base-uncased-zeynepc-5dim
This model is a fine-tuned version of distilbert-base-uncased for personality classification based on the Five-Factor Model (FFM), also known as the Big Five Personality Traits. It predicts probabilities for five personality dimensions: Extraversion, Agreeableness, Neuroticism, Openness, and Conscientiousness.
Model description
The model uses the distilbert-base-uncased architecture as a base and is fine-tuned to classify text into five personality dimensions. Probabilities for each dimension are calculated using a sigmoid activation function, making it suitable for multi-label classification.
Key Features:
- Base Model: DistilBERT, a lightweight transformer model that balances performance and efficiency.
- Personality Dimensions: Outputs probabilities for:
- Extraversion
- Agreeableness
- Neuroticism
- Openness
- Conscientiousness
- Intended Task: Personality analysis and text classification.
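Because the five dimensions are scored independently, the model applies a sigmoid (not a softmax) to each logit. A minimal sketch with hypothetical logit values illustrates the difference: each probability lies in (0, 1) on its own, and the five probabilities need not sum to 1.

```python
import math

# Hypothetical logits for one text across the five dimensions
# (illustrative values only, not real model output)
logits = [1.2, -0.3, 0.0, 2.1, -1.5]

# Sigmoid is applied per dimension, so each trait gets an
# independent probability; unlike softmax, they do not sum to 1.
probs = [1 / (1 + math.exp(-z)) for z in logits]

print(probs)
print(sum(probs))  # noticeably different from 1.0
```

This is what makes the head suitable for multi-label classification: a text can score high on several traits at once, or low on all of them.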
Intended uses & limitations
Use Cases:
- Analyzing personality traits in text-based content.
- Understanding user characteristics for personalized recommendations.
- Research in psychology and linguistics.
Limitations:
- The model's predictions are based on the dataset used for training. Biases in the training data may affect its predictions.
- Fine-tuned for English text; performance on other languages may be suboptimal.
- Not suitable for high-stakes decision-making without additional verification.
Training and evaluation data
The model was fine-tuned on a custom dataset for personality analysis, containing labeled examples corresponding to the Five-Factor Model traits. Evaluation results depend on the size and quality of this dataset.
Dataset details:
- Type: Text-based personality annotations.
- Preprocessing: Text was tokenized using the distilbert-base-uncased tokenizer with a maximum length of 512 tokens.
- Evaluation Metrics: Accuracy and other metrics relevant to multi-label classification.
How to use
The model can be used with the Hugging Face transformers library as follows:
```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
import tensorflow as tf

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("zeynepcetin/distilbert-base-uncased-zeynepc-5dim")
model = TFAutoModelForSequenceClassification.from_pretrained("zeynepcetin/distilbert-base-uncased-zeynepc-5dim")

# Example text
text = "I enjoy working with others and find joy in social interactions."

# Tokenize and predict
inputs = tokenizer(text, truncation=True, padding=True, max_length=512, return_tensors="tf")
outputs = model(inputs)
probs = tf.nn.sigmoid(outputs.logits).numpy()

# Print the probability for each personality trait
traits = ["Extraversion", "Agreeableness", "Neuroticism", "Openness", "Conscientiousness"]
for trait, prob in zip(traits, probs[0]):
    print(f"{trait}: {prob:.2f}")
```
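To turn the probabilities into discrete trait predictions, each dimension can be thresholded independently. The sketch below uses hypothetical probability values and a 0.5 cutoff; in practice the threshold is a tunable choice best calibrated on a validation set.

```python
import numpy as np

# Hypothetical sigmoid probabilities for one text (one row per input),
# standing in for the model's actual output
probs = np.array([[0.81, 0.67, 0.22, 0.55, 0.49]])

traits = ["Extraversion", "Agreeableness", "Neuroticism", "Openness", "Conscientiousness"]

# Multi-label decision: each trait is judged independently
# against a threshold (0.5 here; tune on validation data)
threshold = 0.5
predicted = [t for t, p in zip(traits, probs[0]) if p >= threshold]

print(predicted)  # ['Extraversion', 'Agreeableness', 'Openness']
```

Because the traits are independent, zero, one, or all five labels may fire for a single text.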
Model tree for zeynepcetin/distilbert-base-uncased-zeynepc-5dim
Base model: distilbert/distilbert-base-uncased