XLM-RoBERTa Burmese Pragmatics — Stage 2 (Utterance + Context)

Fine-tuned version of xlm-roberta-base for Burmese politeness classification.
This is Stage 2 of a 3-stage ablation study on pragmatic classification in Burmese.

Task

Given a Burmese utterance with social context, classify its politeness level into one of 6 classes:
neutral, polite, informal, professional, blunt, rude

Dataset

freococo/burmese-contextual-pragmatics
2,200 Burmese utterances covering 22 root meanings with pragmatic annotations.
Split: 70% train / 15% val / 15% test (seed=42)
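The 70/15/15 split with seed 42 can be reproduced with a short sketch. The card gives only the ratios and the seed, so the shuffle-then-slice scheme below is an assumption, not the project's actual splitting code:

```python
import random

def split_dataset(examples, seed=42):
    """70% train / 15% val / 15% test split with a fixed seed (sketch)."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.70 * n)   # 1,540 of 2,200
    n_val = int(0.15 * n)     # 330
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remaining 330
    return train, val, test
```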

Stage Description

Stage 2 — Utterance + context + instruction.
Input format: [utterance] </s> [context] </s> [instruction]
Adds social situational context to the raw utterance.
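Assembling the input string can be sketched as below. The single spaces around the separator are an assumption (the card shows only the bracketed template); XLM-R's tokenizer treats `</s>` as a special token either way:

```python
SEP = "</s>"  # XLM-R's sequence separator token

def build_stage2_input(utterance: str, context: str, instruction: str) -> str:
    """Assemble the Stage 2 input: [utterance] </s> [context] </s> [instruction]."""
    return f"{utterance} {SEP} {context} {SEP} {instruction}"
```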

Results (Test Set)

Metric       Score
Accuracy     0.7485
Macro F1     0.7056
Weighted F1  0.7529
Loss         0.7066

Per-class Precision, Recall, and F1

Class         Precision  Recall  F1
blunt         0.60       0.64    0.62
informal      0.69       0.76    0.73
neutral       0.89       0.73    0.80
polite        0.58       0.80    0.68
professional  0.83       1.00    0.91
rude          0.50       0.50    0.50
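As a sanity check, macro F1 is the unweighted mean of the per-class F1 scores, so the reported 0.7056 can be approximately recovered from the rounded two-decimal values above:

```python
# Per-class F1 scores as reported on the card (rounded to two decimals)
per_class_f1 = {
    "blunt": 0.62, "informal": 0.73, "neutral": 0.80,
    "polite": 0.68, "professional": 0.91, "rude": 0.50,
}

# Macro F1 = unweighted mean over the six classes
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
```

The small residual gap from 0.7056 comes from the two-decimal rounding of the per-class values.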

Training

  • Base model: xlm-roberta-base
  • Epochs: 8
  • Learning rate: 2e-5
  • Batch size: 16
  • Weighted cross-entropy loss (handles class imbalance)
  • Best checkpoint selected by macro F1
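One common way to set the class weights for the weighted cross-entropy loss is inverse-frequency weighting, mirroring scikit-learn's "balanced" heuristic w_c = n / (k * n_c). The exact weighting scheme used for this model is not stated on the card, so this is a sketch under that assumption:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights for weighted cross-entropy: rarer classes get
    larger weights so they contribute proportionally to the loss.

    Assumption: inverse-frequency ("balanced") weighting, w_c = n / (k * n_c).
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}
```

The resulting weights would typically be passed, e.g., via the `weight` argument of PyTorch's `nn.CrossEntropyLoss`.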

Ablation Study

Stage                 Input                      Macro F1
Stage 1               Utterance only             see annasus10/xlmr-burmese-pragmatics-stage1
Stage 2 (this model)  + context + instruction    0.706
Stage 3               + register + power + tone  see annasus10/xlmr-burmese-pragmatics-stage3

Citation

Group: AttentionIsAllUNeed — NLP Final Project
