
Transformer-Based Social Media Sentiment Analysis

IEEE Conference Paper: Comprehensive Evaluation on Real Tweet Data

This repository contains the complete research paper, code, evaluation results, and figures for a social media sentiment analysis study comparing 4 transformer models on real social media data.


🔑 Key Finding: SST-2 Accuracy Does NOT Predict Tweet Performance

| Model | Params | SST-2 Accuracy | TweetEval 3-class Macro-F1 |
|---|---|---|---|
| DistilBERT-SST2 | 66M | 91.06% | 36.53% |
| BERT-base-SST2 | 110M | 92.43% | 38.53% |
| Twitter-RoBERTa | 125M | 86.12% | 72.40% ✅ |
| DeBERTa-v3-base | 184M | 96.44% ✅ | 40.64% |

DeBERTa-v3 achieves 96.44% accuracy on movie reviews but only 40.64% macro-F1 on real tweets. Twitter-RoBERTa, pre-trained on 124M tweets, reaches 72.40% macro-F1 on the same tweets, nearly 2x better on social media.
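The sketch below illustrates how such a cross-domain check works; it is a simplified stand-in for `code/comprehensive_eval.py`, not the exact script. It assumes the DistilBERT model is the `distilbert-base-uncased-finetuned-sst-2-english` checkpoint and maps its binary labels onto TweetEval's negative/positive classes, so "neutral" is never predicted:

```python
# Minimal sketch of a cross-domain evaluation, not the repository's exact
# script (see code/comprehensive_eval.py). Assumes the binary SST-2 model is
# distilbert-base-uncased-finetuned-sst-2-english and that its two labels are
# mapped onto TweetEval's negative/positive classes (neutral is never predicted).
from datasets import load_dataset
from transformers import pipeline
from sklearn.metrics import f1_score

# TweetEval sentiment test split: 0 = negative, 1 = neutral, 2 = positive
tweets = load_dataset("cardiffnlp/tweet_eval", "sentiment", split="test")

clf = pipeline("text-classification",
               model="distilbert-base-uncased-finetuned-sst-2-english")

label_map = {"NEGATIVE": 0, "POSITIVE": 2}  # the binary model can never output 1
preds = [label_map[p["label"]]
         for p in clf(tweets["text"], batch_size=32, truncation=True)]

print("3-class macro-F1:", f1_score(tweets["label"], preds, average="macro"))
```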


📄 Paper

📥 Download IEEE PDF Paper (v2 with figures)

The paper includes:

  • 5 figures: confusion matrices, model comparison charts, dataset distributions, per-class F1 analysis
  • Evaluation on 12,284 real tweets from TweetEval benchmark
  • Honest analysis of the domain gap between formal text and social media
  • Comparison of 4 transformer architectures

Also available: LaTeX Source (IEEEtran class)


πŸ“ Repository Contents

├── paper/
│   ├── Social_Media_Sentiment_Analysis_IEEE_v2.pdf   # ★ Updated IEEE PDF (6 pages, 5 figures)
│   ├── Social_Media_Sentiment_Analysis_IEEE.pdf      # Original version
│   └── ieee_sentiment_paper.tex                      # LaTeX source
├── figures/
│   ├── fig1_confusion_roberta.{png,pdf}              # Confusion matrix: Twitter-RoBERTa
│   ├── fig2_confusion_deberta.{png,pdf}              # Confusion matrix: DeBERTa-v3
│   ├── fig3_model_comparison.{png,pdf}               # Bar chart: SST-2 vs TweetEval
│   ├── fig4_data_distribution.{png,pdf}              # Dataset class distributions
│   └── fig5_per_class_f1.{png,pdf}                   # Per-class F1 comparison
├── code/
│   ├── comprehensive_eval.py                         # Full evaluation + figure generation
│   ├── train_sentiment.py                            # Training pipeline
│   ├── evaluate_models.py                            # Multi-model evaluation
│   ├── eval_deberta.py                               # DeBERTa-specific evaluation
│   └── generate_pdf_v2.py                            # PDF generation with figures
├── results/
│   ├── comprehensive_results.json                    # TweetEval + SST-2 results
│   └── eval_results.json                             # Initial SST-2 results
└── README.md

📊 Datasets Used

| Dataset | Domain | Samples | Classes | Source |
|---|---|---|---|---|
| TweetEval Sentiment | Real tweets | 12,284 (test) | 3 (neg/neutral/pos) | cardiffnlp/tweet_eval |
| SST-2 | Movie reviews | 872 (validation) | 2 (neg/pos) | stanfordnlp/sst2 |

Both are real datasets: TweetEval contains actual tweets from Twitter/X, and SST-2 contains actual movie review sentences from Rotten Tomatoes.
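Both datasets load directly from the Hugging Face Hub. A small sketch is shown below; split names follow the public dataset cards, and the paper's exact preprocessing may differ:

```python
# Load the two evaluation sets referenced above (split names follow the
# standard dataset cards on the Hugging Face Hub).
from datasets import load_dataset

tweeteval = load_dataset("cardiffnlp/tweet_eval", "sentiment", split="test")  # 12,284 tweets
sst2 = load_dataset("stanfordnlp/sst2", split="validation")                   # 872 sentences

print(tweeteval[0])  # {'text': '...', 'label': 0|1|2}   -> 0=negative, 1=neutral, 2=positive
print(sst2[0])       # {'idx': ..., 'sentence': '...', 'label': 0|1}  -> 0=negative, 1=positive
```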


🔬 What's Needed for IEEE Acceptance

The paper currently addresses most IEEE requirements (the checked items below); a full conference submission would additionally need the unchecked items:

  1. ✅ Figures: 5 figures included (confusion matrices, bar charts, distributions, per-class F1)
  2. ✅ Real data evaluation: TweetEval benchmark (12,284 real tweets)
  3. ✅ Multiple baselines: 4 models compared
  4. ⬜ Multi-seed runs: run 3+ seeds and report mean ± std
  5. ⬜ Ablation study: learning rate sweep, layer freezing, sequence length effects
  6. ⬜ Statistical significance: McNemar's test between model pairs (see the sketch after this list)
  7. ⬜ Two-column layout: the LaTeX source uses IEEEtran (compile with pdflatex for the proper format)
  8. ⬜ Fine-tune DeBERTa on TweetEval: would likely close the domain gap significantly
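For item 6, here is a minimal sketch of McNemar's test on paired predictions from two models scored on the same test set. The labels below are toy placeholders; real runs would use the per-tweet predictions produced by the scripts in `code/`, and the `statsmodels` dependency is an assumption rather than something the repository currently uses:

```python
# Sketch of McNemar's test between a pair of models evaluated on the same
# test set. The arrays below are toy placeholders, not real results.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

def mcnemar_between(y_true, pred_a, pred_b):
    """McNemar's test on the 2x2 table of per-example correctness."""
    a_ok = np.asarray(pred_a) == np.asarray(y_true)
    b_ok = np.asarray(pred_b) == np.asarray(y_true)
    table = [[np.sum(a_ok & b_ok),  np.sum(a_ok & ~b_ok)],
             [np.sum(~a_ok & b_ok), np.sum(~a_ok & ~b_ok)]]
    # exact=False uses the chi-square approximation, reasonable for ~12k examples
    return mcnemar(table, exact=False, correction=True)

# Toy example just to show the call pattern (labels: 0/1/2 as in TweetEval)
y_true = np.array([0, 1, 2, 2, 0, 1, 2, 0])
pred_a = np.array([0, 1, 2, 2, 0, 0, 2, 0])   # hypothetical "model A" predictions
pred_b = np.array([0, 2, 1, 2, 0, 0, 1, 0])   # hypothetical "model B" predictions
res = mcnemar_between(y_true, pred_a, pred_b)
print(f"statistic={res.statistic:.3f}, p-value={res.pvalue:.4f}")
```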

🚀 Quick Start

# Best for social media sentiment:
from transformers import pipeline
classifier = pipeline("sentiment-analysis", 
                      model="cardiffnlp/twitter-roberta-base-sentiment-latest")
result = classifier("Love this new feature! #excited")
print(result)  # [{'label': 'positive', 'score': 0.97}]

# Best for formal text sentiment:
classifier_formal = pipeline("text-classification",
                             model="cliang1453/deberta-v3-base-sst2")
result = classifier_formal("This movie was absolutely brilliant")
print(result)  # [{'label': 'positive', 'score': 0.99}]

πŸ“ Citation

@misc{vivan2026sentiment,
  title={Transformer-Based Social Media Sentiment Analysis: A Comprehensive Evaluation on Real Tweet Data},
  author={Vivan, Raj},
  year={2026},
  note={Available at: https://huggingface.co/rajvivan/social-media-sentiment-analysis-paper}
}

License

MIT License
