distilbert-base-uncased-zeynepc-5dim
This model is a fine-tuned version of distilbert-base-uncased for personality classification based on the Five-Factor Model (FFM), also known as the Big Five Personality Traits. It predicts probabilities for five personality dimensions: Extraversion, Agreeableness, Neuroticism, Openness, and Conscientiousness.
Model description
The model uses the distilbert-base-uncased architecture as a base and is fine-tuned to classify text into five personality dimensions. Probabilities for each dimension are calculated using a sigmoid activation function, making it suitable for multi-label classification.
Key Features:
- Base Model: DistilBERT, a lightweight transformer model that balances performance and efficiency.
- Personality Dimensions: Outputs probabilities for:
- Extraversion
- Agreeableness
- Neuroticism
- Openness
- Conscientiousness
- Intended Task: Personality analysis and text classification.
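Because the five dimensions are scored independently, the model applies a sigmoid (not a softmax) to each logit. A minimal sketch with hypothetical logit values illustrates the difference: each probability lies in (0, 1) on its own, and the five probabilities need not sum to 1.

```python
import math

# Hypothetical logits for one text across the five dimensions
# (illustrative values only, not real model output)
logits = [1.2, -0.3, 0.0, 2.1, -1.5]

# Sigmoid is applied per dimension, so each trait gets an
# independent probability; unlike softmax, they do not sum to 1.
probs = [1 / (1 + math.exp(-z)) for z in logits]

print(probs)
print(sum(probs))  # noticeably different from 1.0
```

This is what makes the head suitable for multi-label classification: a text can score high on several traits at once, or low on all of them.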
Intended uses & limitations
Use Cases:
- Analyzing personality traits in text-based content.
- Understanding user characteristics for personalized recommendations.
- Research in psychology and linguistics.
Limitations:
- The model's predictions are based on the dataset used for training. Biases in the training data may affect its predictions.
- Fine-tuned for English text; performance on other languages may be suboptimal.
- Not suitable for high-stakes decision-making without additional verification.
Training and evaluation data
The model was fine-tuned on a custom dataset for personality analysis, containing labeled examples corresponding to the Five-Factor Model traits. Evaluation results depend on the size and quality of this dataset.
Dataset details:
- Type: Text-based personality annotations.
- Preprocessing: Text was tokenized using the distilbert-base-uncased tokenizer with a maximum length of 512 tokens.
- Evaluation Metrics: Accuracy and other metrics relevant to multi-label classification.
How to use
The model can be used with the Hugging Face transformers library as follows:
```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
import tensorflow as tf

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("zeynepcetin/distilbert-base-uncased-zeynepc-5dim")
model = TFAutoModelForSequenceClassification.from_pretrained("zeynepcetin/distilbert-base-uncased-zeynepc-5dim")

# Example text
text = "I enjoy working with others and find joy in social interactions."

# Tokenize and predict
inputs = tokenizer(text, truncation=True, padding=True, max_length=512, return_tensors="tf")
outputs = model(inputs)
probs = tf.nn.sigmoid(outputs.logits).numpy()

# Print the probability for each personality trait
traits = ["Extraversion", "Agreeableness", "Neuroticism", "Openness", "Conscientiousness"]
for trait, prob in zip(traits, probs[0]):
    print(f"{trait}: {prob:.2f}")
```
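To turn the probabilities into discrete trait predictions, each dimension can be thresholded independently. The sketch below uses hypothetical probability values and a 0.5 cutoff; in practice the threshold is a tunable choice best calibrated on a validation set.

```python
import numpy as np

# Hypothetical sigmoid probabilities for one text (one row per input),
# standing in for the model's actual output
probs = np.array([[0.81, 0.67, 0.22, 0.55, 0.49]])

traits = ["Extraversion", "Agreeableness", "Neuroticism", "Openness", "Conscientiousness"]

# Multi-label decision: each trait is judged independently
# against a threshold (0.5 here; tune on validation data)
threshold = 0.5
predicted = [t for t, p in zip(traits, probs[0]) if p >= threshold]

print(predicted)  # ['Extraversion', 'Agreeableness', 'Openness']
```

Because the traits are independent, zero, one, or all five labels may fire for a single text.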
Model tree for zeynepcetin/distilbert-base-uncased-zeynepc-5dim
Base model: distilbert/distilbert-base-uncased