
Transformer-Based Social Media Sentiment Analysis

IEEE Conference Paper: Comprehensive Evaluation on Real Tweet Data

This repository contains the complete research paper, code, evaluation results, and figures for a social media sentiment analysis study comparing 4 transformer models on real social media data.


🔑 Key Finding: SST-2 Accuracy Does NOT Predict Tweet Performance

| Model | Params | SST-2 Accuracy | TweetEval 3-class Macro-F1 |
|---|---|---|---|
| DistilBERT-SST2 | 66M | 91.06% | 36.53% |
| BERT-base-SST2 | 110M | 92.43% | 38.53% |
| Twitter-RoBERTa | 125M | 86.12% | 72.40% ✅ |
| DeBERTa-v3-base | 184M | 96.44% ✅ | 40.64% |

DeBERTa-v3 achieves 96.44% accuracy on movie reviews but only 40.64% macro-F1 on real tweets. Twitter-RoBERTa, pre-trained on 124M tweets, reaches 72.40% macro-F1 on the same tweets, nearly 2x better on social media.
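The sketch below illustrates how such a cross-domain check works; it is a simplified stand-in for `code/comprehensive_eval.py`, not the exact script. It assumes the DistilBERT model is the `distilbert-base-uncased-finetuned-sst-2-english` checkpoint and maps its binary labels onto TweetEval's negative/positive classes, so "neutral" is never predicted:

```python
# Minimal sketch of a cross-domain evaluation, not the repository's exact
# script (see code/comprehensive_eval.py). Assumes the binary SST-2 model is
# distilbert-base-uncased-finetuned-sst-2-english and that its two labels are
# mapped onto TweetEval's negative/positive classes (neutral is never predicted).
from datasets import load_dataset
from transformers import pipeline
from sklearn.metrics import f1_score

# TweetEval sentiment test split: 0 = negative, 1 = neutral, 2 = positive
tweets = load_dataset("cardiffnlp/tweet_eval", "sentiment", split="test")

clf = pipeline("text-classification",
               model="distilbert-base-uncased-finetuned-sst-2-english")

label_map = {"NEGATIVE": 0, "POSITIVE": 2}  # the binary model can never output 1
preds = [label_map[p["label"]]
         for p in clf(tweets["text"], batch_size=32, truncation=True)]

print("3-class macro-F1:", f1_score(tweets["label"], preds, average="macro"))
```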


📄 Paper

📥 Download IEEE PDF Paper (v2 with figures)

The paper includes:

  • 5 figures: confusion matrices, model comparison charts, dataset distributions, per-class F1 analysis
  • Evaluation on 12,284 real tweets from TweetEval benchmark
  • Honest analysis of the domain gap between formal text and social media
  • Comparison of 4 transformer architectures

Also available: LaTeX Source (IEEEtran class)


πŸ“ Repository Contents

├── paper/
│   ├── Social_Media_Sentiment_Analysis_IEEE_v2.pdf   # ★ Updated IEEE PDF (6 pages, 5 figures)
│   ├── Social_Media_Sentiment_Analysis_IEEE.pdf      # Original version
│   └── ieee_sentiment_paper.tex                      # LaTeX source
├── figures/
│   ├── fig1_confusion_roberta.{png,pdf}              # Confusion matrix: Twitter-RoBERTa
│   ├── fig2_confusion_deberta.{png,pdf}              # Confusion matrix: DeBERTa-v3
│   ├── fig3_model_comparison.{png,pdf}               # Bar chart: SST-2 vs TweetEval
│   ├── fig4_data_distribution.{png,pdf}              # Dataset class distributions
│   └── fig5_per_class_f1.{png,pdf}                   # Per-class F1 comparison
├── code/
│   ├── comprehensive_eval.py                         # Full evaluation + figure generation
│   ├── train_sentiment.py                            # Training pipeline
│   ├── evaluate_models.py                            # Multi-model evaluation
│   ├── eval_deberta.py                               # DeBERTa-specific evaluation
│   └── generate_pdf_v2.py                            # PDF generation with figures
├── results/
│   ├── comprehensive_results.json                    # TweetEval + SST-2 results
│   └── eval_results.json                             # Initial SST-2 results
└── README.md

📊 Datasets Used

| Dataset | Domain | Samples | Classes | Source |
|---|---|---|---|---|
| TweetEval Sentiment | Real tweets | 12,284 (test) | 3 (neg/neutral/pos) | cardiffnlp/tweet_eval |
| SST-2 | Movie reviews | 872 (validation) | 2 (neg/pos) | stanfordnlp/sst2 |

Both are real datasets: TweetEval contains actual tweets from Twitter/X, and SST-2 contains actual movie review sentences from Rotten Tomatoes.
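Both datasets load directly from the Hugging Face Hub. A small sketch is shown below; split names follow the public dataset cards, and the paper's exact preprocessing may differ:

```python
# Load the two evaluation sets referenced above (split names follow the
# standard dataset cards on the Hugging Face Hub).
from datasets import load_dataset

tweeteval = load_dataset("cardiffnlp/tweet_eval", "sentiment", split="test")  # 12,284 tweets
sst2 = load_dataset("stanfordnlp/sst2", split="validation")                   # 872 sentences

print(tweeteval[0])  # {'text': '...', 'label': 0|1|2}   -> 0=negative, 1=neutral, 2=positive
print(sst2[0])       # {'idx': ..., 'sentence': '...', 'label': 0|1}  -> 0=negative, 1=positive
```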


🔬 What's Needed for IEEE Acceptance

The paper currently addresses most IEEE requirements (the checked items below); a full conference submission would additionally need the unchecked items:

  1. ✅ Figures: 5 figures included (confusion matrices, bar charts, distributions, per-class F1)
  2. ✅ Real data evaluation: TweetEval benchmark (12,284 real tweets)
  3. ✅ Multiple baselines: 4 models compared
  4. ⬜ Multi-seed runs: run 3+ seeds and report mean ± std
  5. ⬜ Ablation study: learning rate sweep, layer freezing, sequence length effects
  6. ⬜ Statistical significance: McNemar's test between model pairs (see the sketch after this list)
  7. ⬜ Two-column layout: the LaTeX source uses IEEEtran (compile with pdflatex for the proper format)
  8. ⬜ Fine-tune DeBERTa on TweetEval: would likely close the domain gap significantly
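For item 6, here is a minimal sketch of McNemar's test on paired predictions from two models scored on the same test set. The labels below are toy placeholders; real runs would use the per-tweet predictions produced by the scripts in `code/`, and the `statsmodels` dependency is an assumption rather than something the repository currently uses:

```python
# Sketch of McNemar's test between a pair of models evaluated on the same
# test set. The arrays below are toy placeholders, not real results.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

def mcnemar_between(y_true, pred_a, pred_b):
    """McNemar's test on the 2x2 table of per-example correctness."""
    a_ok = np.asarray(pred_a) == np.asarray(y_true)
    b_ok = np.asarray(pred_b) == np.asarray(y_true)
    table = [[np.sum(a_ok & b_ok),  np.sum(a_ok & ~b_ok)],
             [np.sum(~a_ok & b_ok), np.sum(~a_ok & ~b_ok)]]
    # exact=False uses the chi-square approximation, reasonable for ~12k examples
    return mcnemar(table, exact=False, correction=True)

# Toy example just to show the call pattern (labels: 0/1/2 as in TweetEval)
y_true = np.array([0, 1, 2, 2, 0, 1, 2, 0])
pred_a = np.array([0, 1, 2, 2, 0, 0, 2, 0])   # hypothetical "model A" predictions
pred_b = np.array([0, 2, 1, 2, 0, 0, 1, 0])   # hypothetical "model B" predictions
res = mcnemar_between(y_true, pred_a, pred_b)
print(f"statistic={res.statistic:.3f}, p-value={res.pvalue:.4f}")
```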

🚀 Quick Start

# Best for social media sentiment:
from transformers import pipeline
classifier = pipeline("sentiment-analysis", 
                      model="cardiffnlp/twitter-roberta-base-sentiment-latest")
result = classifier("Love this new feature! #excited")
print(result)  # [{'label': 'positive', 'score': 0.97}]

# Best for formal text sentiment:
classifier_formal = pipeline("text-classification",
                             model="cliang1453/deberta-v3-base-sst2")
result = classifier_formal("This movie was absolutely brilliant")
print(result)  # [{'label': 'positive', 'score': 0.99}]

πŸ“ Citation

@misc{vivan2026sentiment,
  title={Transformer-Based Social Media Sentiment Analysis: A Comprehensive Evaluation on Real Tweet Data},
  author={Vivan, Raj},
  year={2026},
  note={Available at: https://huggingface.co/rajvivan/social-media-sentiment-analysis-paper}
}

License

MIT License
