NBA Press Conference Sentiment - Fine-tuned RoBERTa
A RoBERTa model fine-tuned for 3-class sentiment analysis on NBA playoff press conference transcripts.
Model Description
Base model: cardiffnlp/twitter-roberta-base-sentiment
Fine-tuned on 2,050 NBA press conference speaker turns: 50 hand-labeled seed turns plus 2,000 turns weakly labeled by GPT-4o-mini. The turns are drawn from Conference Finals and NBA Finals transcripts from 2013-2022 (2,790 transcripts, 23,166 speaker turns in total).
Labels: NEGATIVE (0), NEUTRAL (1), POSITIVE (2)
Performance
Evaluated on a 50-turn hand-labeled seed set:
| Model | Accuracy | Macro F1 |
|---|---|---|
| This model (fine-tuned) | 92% | 0.932 |
| Twitter RoBERTa (base, no fine-tune) | 54% | 0.467 |
| DistilBERT SST-2 | 52% | 0.380 |
| FinBERT | 34% | 0.288 |
Fine-tuning improves accuracy by 38 percentage points over the best off-the-shelf baseline (92% vs. 54%). General-purpose sentiment models struggle with sports language because athletes and coaches systematically frame losses in positive terms ("we competed hard", "we'll make adjustments") rather than expressing raw negativity.
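For reference, the two metrics in the table can be computed as in the minimal sketch below. It assumes integer label ids matching the card (NEGATIVE=0, NEUTRAL=1, POSITIVE=2). The function names and the toy labels are illustrative and are not taken from the project code. On a 50-turn set, 92% accuracy corresponds to 46 correct predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1 over the given label ids."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted examples scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy labels invented for illustration (not the actual seed set).
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
toy_accuracy = accuracy(toy_true, toy_pred)
toy_macro_f1 = macro_f1(toy_true, toy_pred)
```

Macro F1 weights each class equally regardless of how many turns it has, so it can differ noticeably from accuracy when the classes are imbalanced.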
Training Details
- Base model: cardiffnlp/twitter-roberta-base-sentiment
- Training data: 2,050 labeled speaker turns (80/20 train/val split)
- Weak labeling: GPT-4o-mini with sports-specific 3-class definitions, batched 20/call
- Framework: Hugging Face Trainer
- Epochs: 5 (early stopping patience=2; best checkpoint at epoch 4)
- Learning rate: 2e-5 with linear warmup (10%)
- Batch size: 16
- Experiment tracking: MLflow
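The settings above imply a rough training schedule, sketched below in plain Python. Several things here are assumptions rather than facts from this card: a single device with no gradient accumulation, a final partial batch that is kept, warmup steps rounded up, and a linear decay to zero after warmup (the card only states that warmup is linear). The function and constant names are illustrative.

```python
import math

TOTAL_TURNS = 2050     # labeled speaker turns (from the card)
TRAIN_FRACTION = 0.8   # 80/20 train/val split (from the card)
BATCH_SIZE = 16        # from the card
EPOCHS = 5             # from the card
PEAK_LR = 2e-5         # from the card
WARMUP_RATIO = 0.10    # from the card


def schedule_sizes(total=TOTAL_TURNS):
    """Return (n_train, n_val, steps_per_epoch, total_steps, warmup_steps)."""
    n_train = round(total * TRAIN_FRACTION)
    n_val = total - n_train
    # Assumption: the last partial batch in each epoch is kept.
    steps_per_epoch = math.ceil(n_train / BATCH_SIZE)
    total_steps = steps_per_epoch * EPOCHS
    # Assumption: warmup steps are rounded up.
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    return n_train, n_val, steps_per_epoch, total_steps, warmup_steps


def lr_at_step(step, total_steps, warmup_steps, peak_lr=PEAK_LR):
    """Linear warmup to peak_lr, then (assumed) linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


n_train, n_val, steps_per_epoch, total_steps, warmup_steps = schedule_sizes()
```

Under these assumptions the split is 1,640 training and 410 validation turns, giving 103 steps per epoch and 515 optimizer steps if all 5 epochs run, with the learning rate peaking after 52 steps.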
Usage
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="EgeDenizPekel/nba-press-sentiment-roberta",
)

classifier("We competed hard tonight. We'll make some adjustments and come back stronger.")
# [{'label': 'POSITIVE', 'score': 0.87}]

classifier("We got killed out there. That was embarrassing.")
# [{'label': 'NEGATIVE', 'score': 0.94}]
```
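Downstream code often needs the predictions as label ids or as a single number per press conference. The sketch below post-processes output in the shape shown above (a list of `{'label', 'score'}` dicts, one per input text). The signed values and the averaging rule are hypothetical choices for illustration and are not necessarily how the project aggregates sentiment. The third example prediction is made up.

```python
# Label ids as listed in this model card.
LABEL2ID = {"NEGATIVE": 0, "NEUTRAL": 1, "POSITIVE": 2}

# Hypothetical signed value per label, chosen only for illustration.
SIGNED_VALUE = {"NEGATIVE": -1.0, "NEUTRAL": 0.0, "POSITIVE": 1.0}


def to_label_ids(predictions):
    """Map pipeline-style predictions to integer label ids."""
    return [LABEL2ID[p["label"]] for p in predictions]


def mean_signed_sentiment(predictions):
    """Average the signed label values over a set of speaker turns."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    total = sum(SIGNED_VALUE[p["label"]] for p in predictions)
    return total / len(predictions)


# Predictions in the pipeline's output shape; the last entry is invented.
example_predictions = [
    {"label": "POSITIVE", "score": 0.87},
    {"label": "NEGATIVE", "score": 0.94},
    {"label": "NEUTRAL", "score": 0.70},
]
example_ids = to_label_ids(example_predictions)
example_mean = mean_signed_sentiment(example_predictions)
```

This version ignores the confidence scores; weighting each turn by its `score` would be an equally plausible variant.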
Research Context
Built as part of an end-to-end NLP portfolio project investigating whether post-game press conference sentiment correlates with NBA playoff outcomes.
Key finding: No statistically significant correlation between post-game sentiment and point differential (r=-0.088, p=0.30, n=141 games). This suggests that press conference framing is strategically managed and does not leak game-level performance.
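As a sanity check on the reported statistics, the sketch below computes a Pearson correlation and converts r and n into a t-statistic and an approximate two-sided p-value. It uses a normal approximation to the t-distribution, which is an assumption made here to stay within the standard library and is reasonable at n=141; the project itself may have used a different routine. The function names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two sequences of equal length >= 3")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t-statistic for testing r == 0 with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def approx_two_sided_p(t):
    """Two-sided p-value under a normal approximation (large-n assumption)."""
    return math.erfc(abs(t) / math.sqrt(2))


# Values reported in this card.
t_reported = t_statistic(-0.088, 141)
p_reported = approx_two_sided_p(t_reported)
```

With r=-0.088 and n=141, the t-statistic is roughly -1.04 and the approximate p-value is about 0.30, consistent with the figure reported above.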
Full project: press-conference-sentiment-analyzer