XLM-RoBERTa Burmese Pragmatics — Stage 2 (Utterance + Context)

Fine-tuned version of xlm-roberta-base for Burmese politeness classification.
This is Stage 2 of a 3-stage ablation study on pragmatic classification in Burmese.

Task

Given a Burmese utterance with social context, classify its politeness level into one of 6 classes:
neutral, polite, informal, professional, blunt, rude

Dataset

freococo/burmese-contextual-pragmatics
2,200 Burmese utterances covering 22 root meanings with pragmatic annotations.
Split: 70% train / 15% val / 15% test (seed=42)
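The 70/15/15 split with seed 42 can be reproduced with a short sketch. The card gives only the ratios and the seed, so the shuffle-then-slice scheme below is an assumption, not the project's actual splitting code:

```python
import random

def split_dataset(examples, seed=42):
    """70% train / 15% val / 15% test split with a fixed seed (sketch)."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.70 * n)   # 1,540 of 2,200
    n_val = int(0.15 * n)     # 330
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remaining 330
    return train, val, test
```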

Stage Description

Stage 2 — Utterance + context + instruction.
Input format: [utterance] </s> [context] </s> [instruction]
Adds social situational context to the raw utterance.
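Assembling the input string can be sketched as below. The single spaces around the separator are an assumption (the card shows only the bracketed template); XLM-R's tokenizer treats `</s>` as a special token either way:

```python
SEP = "</s>"  # XLM-R's sequence separator token

def build_stage2_input(utterance: str, context: str, instruction: str) -> str:
    """Assemble the Stage 2 input: [utterance] </s> [context] </s> [instruction]."""
    return f"{utterance} {SEP} {context} {SEP} {instruction}"
```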

Results (Test Set)

Metric       Score
Accuracy     0.7485
Macro F1     0.7056
Weighted F1  0.7529
Loss         0.7066

Per-class Precision, Recall, and F1

Class         Precision  Recall  F1
blunt         0.60       0.64    0.62
informal      0.69       0.76    0.73
neutral       0.89       0.73    0.80
polite        0.58       0.80    0.68
professional  0.83       1.00    0.91
rude          0.50       0.50    0.50
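As a sanity check, macro F1 is the unweighted mean of the per-class F1 scores, so the reported 0.7056 can be approximately recovered from the rounded two-decimal values above:

```python
# Per-class F1 scores as reported on the card (rounded to two decimals)
per_class_f1 = {
    "blunt": 0.62, "informal": 0.73, "neutral": 0.80,
    "polite": 0.68, "professional": 0.91, "rude": 0.50,
}

# Macro F1 = unweighted mean over the six classes
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
```

The small residual gap from 0.7056 comes from the two-decimal rounding of the per-class values.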

Training

  • Base model: xlm-roberta-base
  • Epochs: 8
  • Learning rate: 2e-5
  • Batch size: 16
  • Weighted cross-entropy loss (handles class imbalance)
  • Best checkpoint selected by macro F1
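One common way to set the class weights for the weighted cross-entropy loss is inverse-frequency weighting, mirroring scikit-learn's "balanced" heuristic w_c = n / (k * n_c). The exact weighting scheme used for this model is not stated on the card, so this is a sketch under that assumption:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights for weighted cross-entropy: rarer classes get
    larger weights so they contribute proportionally to the loss.

    Assumption: inverse-frequency ("balanced") weighting, w_c = n / (k * n_c).
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}
```

The resulting weights would typically be passed, e.g., via the `weight` argument of PyTorch's `nn.CrossEntropyLoss`.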

Ablation Study

Stage                 Input                      Macro F1
Stage 1               Utterance only             see annasus10/xlmr-burmese-pragmatics-stage1
Stage 2 (this model)  + context + instruction    0.706
Stage 3               + register + power + tone  see annasus10/xlmr-burmese-pragmatics-stage3

Citation

Group: AttentionIsAllUNeed — NLP Final Project
