NBA Press Conference Sentiment - Fine-tuned RoBERTa

A RoBERTa model fine-tuned for 3-class sentiment analysis on NBA playoff press conference transcripts.

Model Description

Base model: cardiffnlp/twitter-roberta-base-sentiment

Fine-tuned on 2,050 NBA press conference speaker turns (50 hand-labeled seed turns + 2,000 GPT-4o-mini weak labels), covering Conference Finals and NBA Finals transcripts from 2013-2022 (2,790 transcripts, 23,166 speaker turns total).

Labels: NEGATIVE (0), NEUTRAL (1), POSITIVE (2)

Performance

Evaluated on a 50-turn hand-labeled seed set:

| Model | Accuracy | Macro F1 |
|---|---|---|
| This model (fine-tuned) | 92% | 0.932 |
| Twitter RoBERTa (base, no fine-tune) | 54% | 0.467 |
| DistilBERT SST-2 | 52% | 0.380 |
| FinBERT | 34% | 0.288 |

Fine-tuning improves accuracy by 38 percentage points over the best off-the-shelf baseline. General-purpose sentiment models fail on sports language because athletes and coaches systematically frame losses in positive terms ("we competed hard", "we'll make adjustments") rather than expressing raw negativity.
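Macro F1 here is the unweighted mean of per-class F1 scores, so NEGATIVE, NEUTRAL, and POSITIVE count equally regardless of class frequency. A minimal pure-Python sketch, using the 0/1/2 label encoding above:

```python
def macro_f1(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class F1 over the given label set."""
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)
```

In practice the project likely computed this with a library metric; the sketch just makes the averaging explicit.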

Training Details

  • Base model: cardiffnlp/twitter-roberta-base-sentiment
  • Training data: 2,050 labeled speaker turns (80/20 train/val split)
  • Weak labeling: GPT-4o-mini with sports-specific 3-class definitions, batched 20/call
  • Framework: Hugging Face Trainer
  • Epochs: 5 (early stopping patience=2; best checkpoint at epoch 4)
  • Learning rate: 2e-5 with linear warmup (10%)
  • Batch size: 16
  • Experiment tracking: MLflow
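The weak-labeling step above sends speaker turns to GPT-4o-mini 20 at a time. A minimal batching helper along these lines (the name and signature are illustrative, not the project's actual code):

```python
def chunk(items, size=20):
    """Yield successive fixed-size batches; the last batch may be smaller."""
    for i in range(0, len(items), size):
        yield items[i:i + size]
```

Each yielded batch would then be formatted into a single prompt, amortizing per-call overhead across 20 turns.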

Usage

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="EgeDenizPekel/nba-press-sentiment-roberta"
)

classifier("We competed hard tonight. We'll make some adjustments and come back stronger.")
# [{'label': 'POSITIVE', 'score': 0.87}]

classifier("We got killed out there. That was embarrassing.")
# [{'label': 'NEGATIVE', 'score': 0.94}]
```
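For downstream analysis it can be convenient to collapse pipeline output into a single signed score. The convention below (POSITIVE → +score, NEGATIVE → -score, NEUTRAL → 0) is one reasonable choice, not necessarily the project's own:

```python
def polarity(pred):
    """Map a pipeline prediction dict to a signed sentiment score in [-1, 1]."""
    sign = {"POSITIVE": 1, "NEGATIVE": -1, "NEUTRAL": 0}
    return sign[pred["label"]] * pred["score"]
```

A per-game sentiment value could then be the mean polarity over that game's speaker turns.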

Research Context

Built as part of an end-to-end NLP portfolio project investigating whether post-game press conference sentiment correlates with NBA playoff outcomes.

Key finding: No statistically significant correlation between post-game sentiment and point differential (r=-0.088, p=0.30, n=141 games). Press conference framing is strategically managed and does not leak game-level performance.
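The reported r is a Pearson correlation between per-game sentiment and point differential. A self-contained sketch of the statistic itself (the p-value computation is omitted):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

With n=141 games, an |r| of 0.088 is well inside the range expected by chance, consistent with the p=0.30 reported above.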

Full project: press-conference-sentiment-analyzer
