---
library_name: transformers
license: mit
base_model: xlm-roberta-base
tags:
- generated_from_trainer
- emotion-classification
- midwest-emo
- multilingual
metrics:
- accuracy
datasets:
- anggars/mbti-emotion
model-index:
- name: xlm-emotion
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: anggars/mbti-emotion
      type: anggars/mbti-emotion
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8876
---
# xlm-emotion
This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) for emotion classification with 28 labels. It is trained specifically to recognize emotional nuance in Midwest Emo and Math Rock lyrics.

To keep the model from being biased toward majority classes (such as Grief), training was conducted on a balanced version of the anggars/mbti-emotion dataset, built with undersampling.
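The balancing step can be sketched in plain Python. This is a minimal illustration of undersampling to a per-class cap, not the exact script used to build the dataset; the function name and record layout are assumptions.

```python
import random
from collections import defaultdict

def undersample(examples, max_per_class=500, seed=42):
    """Cap each label at max_per_class examples.

    Illustrative only: the actual balancing procedure for
    anggars/mbti-emotion may differ in details.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for ex in examples:
        by_label[ex["label"]].append(ex)
    balanced = []
    for items in by_label.values():
        rng.shuffle(items)                 # drop a random subset, not a prefix
        balanced.extend(items[:max_per_class])
    return balanced
```

With a skewed input (say 10,000 "grief" rows and 300 "joy" rows), the result keeps at most 500 of each, so no single emotion dominates the training signal.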
## Model description
- Model type: XLM-RoBERTa Base
- Labels: 28 emotion categories
- Dataset: balanced `anggars/mbti-emotion` (max 500 samples per class combination)
- Languages: English & Indonesian
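For inference, the standard `transformers` text-classification pipeline applies. The repository id below is a hypothetical placeholder (the card does not state the full Hub path); substitute the model's actual repository.

```python
from transformers import pipeline

if __name__ == "__main__":
    # Downloads the model weights on first run.
    # "anggars/xlm-emotion" is a hypothetical repo id; replace it with
    # the real Hub path of this model.
    classifier = pipeline("text-classification", model="anggars/xlm-emotion")
    print(classifier("The porch light stays on, but nobody is coming home."))
```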
## Training results
The following results were achieved on the evaluation set during the 3-epoch training process:
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 0.5297 | 1.0 | 5618 | 0.4876 | 0.8480 |
| 0.3340 | 2.0 | 11236 | 0.4059 | 0.8732 |
| 0.2116 | 3.0 | 16854 | 0.3954 | 0.8876 |
## Intended uses & limitations
This model is designed for sentiment and emotion analysis of creative writing, specifically lyrics.

**Limitations:** Because it is trained on a specific subgenre context, it may perform differently on general conversational text or technical documents.
## Training procedure
### Training hyperparameters
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
- num_epochs: 3
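The optimizer settings above map directly onto PyTorch's `AdamW`. A minimal sketch, using a placeholder linear layer in place of the fine-tuned XLM-RoBERTa weights:

```python
import torch

# Stand-in module: 768-dim encoder output -> 28 emotion labels.
# In the real run, optimizer is built over the full fine-tuned model.
model = torch.nn.Linear(768, 28)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2e-5,            # learning_rate
    betas=(0.9, 0.999), # AdamW betas
    eps=1e-8,           # AdamW epsilon
)
```

When training with the `transformers` `Trainer` (as the `generated_from_trainer` tag indicates), these values are supplied via `TrainingArguments` rather than constructed by hand.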
### Framework versions
- Transformers 4.44.2
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.20.3