---
language:
- hu
license: mit
tags:
- sentiment-analysis
- xlm-roberta
- hungarian
- text-classification
datasets:
- custom
metrics:
- accuracy
- f1
pipeline_tag: text-classification
---

# Sentiment

Fine-tuned [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) for **Hungarian sentiment classification**.

## Model Details

- **Base model**: `xlm-roberta-base`
- **Task**: 3-class sentiment classification (negative / neutral / positive)
- **Language**: Hungarian
- **Training data**: ~37K sentences (stratified split from ~46K total)
- **Class weighting**: Balanced weights applied during training to handle class imbalance

## Labels

| Label ID | Label | Description |
|----------|-------|-------------|
| 0 | negative | Negative sentiment |
| 1 | neutral | Neutral sentiment |
| 2 | positive | Positive sentiment |

## Overall Results

| Metric | Value |
|--------|-------|
| Accuracy | 0.8442320225939605 |
| F1 (macro) | 0.8387464047460437 |
| F1 (weighted) | 0.8435908941071462 |

## Per-Language Results

| Language | Samples | Accuracy | F1 (macro) | F1 (weighted) |
|----------|---------|----------|------------|---------------|
| hun | 4603 | 0.8442 | 0.8387 | 0.8436 |

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="ringorsolya/Sentiment")

classifier("Ez egy fantasztikus nap!")
# [{'label': 'positive', 'score': 0.95}]

classifier("Szörnyű volt a kiszolgálás.")
# [{'label': 'negative', 'score': 0.92}]
```

## Training Details

- **Epochs**: 5
- **Batch size**: 32
- **Learning rate**: 2e-05
- **Weight decay**: 0.01
- **Warmup ratio**: 0.1
- **Max sequence length**: 128
- **FP16**: True
- **Class weights**: [0.8114, 1.1219, 1.1413]
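
The card does not say how the class weights were derived. The sketch below assumes the common "balanced" heuristic, the same formula scikit-learn uses for `class_weight="balanced"`: `weight_c = n_samples / (n_classes * count_c)`. The helper names `balanced_class_weights` and `implied_class_shares` and the toy label list are hypothetical and are not taken from the training code.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes):
    """Return weight_c = N / (num_classes * count_c) for class ids 0..num_classes-1."""
    counts = Counter(labels)
    total = len(labels)
    weights = []
    for class_id in range(num_classes):
        if counts[class_id] == 0:
            raise ValueError(f"class {class_id} has no samples")
        weights.append(total / (num_classes * counts[class_id]))
    return weights


def implied_class_shares(weights):
    """Invert the balanced formula: share_c = 1 / (num_classes * weight_c)."""
    num_classes = len(weights)
    return [1.0 / (num_classes * w) for w in weights]


# Toy labels for illustration only (assumption: NOT the real training set).
toy_labels = [0] * 4 + [1] * 3 + [2] * 3
toy_weights = balanced_class_weights(toy_labels, num_classes=3)

# Class weights reported on this card, in label-id order
# (negative, neutral, positive).
reported_weights = [0.8114, 1.1219, 1.1413]
shares = implied_class_shares(reported_weights)

print("toy weights:", [round(w, 4) for w in toy_weights])
print("implied class shares:", [round(s, 4) for s in shares])
```

If the balanced formula was in fact used, the reported weights imply training-set shares of roughly 41% negative, 30% neutral and 29% positive, which sum to about 1 as expected. Weights like these are typically passed to a weighted cross-entropy loss, for example through the `weight` argument of `torch.nn.CrossEntropyLoss`. Whether this model was trained that way is not stated here.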