XLM-RoBERTa Burmese Pragmatics – Stage 3 (Full Pragmatic Input)

Fine-tuned version of xlm-roberta-base for Burmese politeness classification.
This is Stage 3 of a 3-stage ablation study on pragmatic classification in Burmese.

Task

Given a Burmese utterance with full pragmatic metadata, classify its politeness level into one of 6 classes:
neutral, polite, informal, professional, blunt, rude

Dataset

freococo/burmese-contextual-pragmatics
2,200 Burmese utterances covering 22 root meanings with pragmatic annotations.
Split: 70% train / 15% val / 15% test (seed=42)
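
The exact splitting routine is not given in the card; as a minimal sketch, a seeded 70/15/15 split of the 2,200 examples can be reproduced like this (the original pipeline may use a library helper such as `datasets.train_test_split`, so only the proportions and seed are taken from the card):

```python
import random

def split_indices(n, seed=42, train=0.70, val=0.15):
    """Shuffle indices with a fixed seed, then cut 70% / 15% / 15%."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train)
    n_val = int(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_indices(2200)
# 1,540 train / 330 val / 330 test examples
```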

Stage Description

Stage 3 – Full pragmatic input.
Input format: [register: X] [power: Y] [tone: Z] [utterance] </s> [context] </s> [instruction]
Adds explicit pragmatic metadata (register, power relation, tone) as a text prefix.
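
The input format above can be assembled with a small helper. This is an illustrative sketch (the function name and argument names are hypothetical; `</s>` is XLM-RoBERTa's sequence separator token, matching the format string in the card):

```python
def build_stage3_input(utterance, context, instruction, register, power, tone):
    """Assemble the Stage 3 input: pragmatic-metadata prefix, then the
    utterance, context, and instruction joined by the </s> separator."""
    prefix = f"[register: {register}] [power: {power}] [tone: {tone}]"
    return f"{prefix} {utterance} </s> {context} </s> {instruction}"
```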

Results (Test Set)

| Metric      | Score  |
|-------------|--------|
| Accuracy    | 0.8394 |
| Macro F1    | 0.8252 |
| Weighted F1 | 0.8421 |
| Loss        | 0.5390 |

Per-class F1

| Class        | Precision | Recall | F1   |
|--------------|-----------|--------|------|
| blunt        | 0.83      | 0.71   | 0.77 |
| informal     | 0.84      | 0.90   | 0.87 |
| neutral      | 0.93      | 0.82   | 0.87 |
| polite       | 0.66      | 0.86   | 0.74 |
| professional | 0.79      | 1.00   | 0.88 |
| rude         | 0.90      | 0.75   | 0.82 |
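
Macro F1 is the unweighted mean of the per-class F1 scores, so averaging the rounded values in the table recovers the reported 0.8252 to two decimal places:

```python
# Per-class F1 scores from the table above (rounded to two decimals)
f1 = {"blunt": 0.77, "informal": 0.87, "neutral": 0.87,
      "polite": 0.74, "professional": 0.88, "rude": 0.82}

# Macro F1 = unweighted mean over classes, regardless of class frequency
macro_f1 = sum(f1.values()) / len(f1)  # 0.825
```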

Training

  • Base model: xlm-roberta-base
  • Epochs: 8
  • Learning rate: 2e-5
  • Batch size: 8
  • Weighted cross-entropy loss (handles class imbalance)
  • Best checkpoint selected by macro F1

Ablation Study

| Stage                | Input                      | Macro F1 |
|----------------------|----------------------------|----------|
| Stage 1              | Utterance only             | see annasus10/xlmr-burmese-pragmatics-stage1 |
| Stage 2              | + context + instruction    | see annasus10/xlmr-burmese-pragmatics-stage2 |
| Stage 3 (this model) | + register + power + tone  | 0.825 |

Citation

Group: AttentionIsAllUNeed – NLP Final Project
