---
language: en
license: apache-2.0
tags:
- text-classification
- sentiment-analysis
- distilbert
- fine-tuned
datasets:
- imdb
metrics:
- accuracy
- f1
---
# DistilBERT IMDb Sentiment Classifier
A fine-tuned DistilBERT model for binary sentiment analysis on movie reviews.
## Model Description
This model was fine-tuned from `distilbert-base-uncased` on 5,000 IMDb movie
reviews for 3 epochs. It classifies English text as either POSITIVE or NEGATIVE.
## Training Data
- Source: IMDb Large Movie Review Dataset (stored in SQLite, queried with pandas)
- Train: 5,000 samples | Validation: 1,000 samples
- Label balance: approximately 50% positive, 50% negative
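The loading step described above might look like the following minimal sketch. The table name `reviews`, the column names `text` and `label`, the seed, and the helper names `load_split` and `positive_share` are assumptions for illustration; the actual database schema is not documented in this card.

```python
import sqlite3

import pandas as pd


def load_split(conn, n_train=5000, n_val=1000, seed=42):
    """Read reviews from SQLite and return shuffled train/validation frames.

    Assumes a hypothetical table `reviews` with columns `text` and `label`.
    """
    df = pd.read_sql_query("SELECT text, label FROM reviews", conn)
    # Shuffle once so both splits are drawn from the same random order.
    df = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    train = df.iloc[:n_train]
    val = df.iloc[n_train:n_train + n_val]
    return train, val


def positive_share(df):
    """Fraction of rows labelled positive (label == 1)."""
    return float((df["label"] == 1).mean())
```

Calling `positive_share` on each split is a quick way to confirm the roughly 50/50 label balance stated above.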
## Evaluation Results
| Metric | Score |
|----------|--------|
| Accuracy | 88.4% |
| F1 Score | 0.893 |

Placeholder values: replace with your actual scores.
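For reference, the two metrics in the table can be computed from predictions as in this minimal sketch. It assumes integer labels with `1` as the positive class; the function name `accuracy_and_f1` is hypothetical, and this is not necessarily how the scores above were produced.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); reported as 0.0 when that denominator is zero.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1
```

Both values are returned as fractions between 0 and 1, so multiply the accuracy by 100 to match the percentage format used in the table.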
## Baseline Comparison
| Model | Accuracy |
|--------------------------------|----------|
| TF-IDF + Logistic Regression | 86.4% |
| DistilBERT (this model) | 92.3% |
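A TF-IDF plus logistic regression baseline of the kind listed above could be built with scikit-learn roughly as follows. The hyperparameters (`ngram_range`, `max_iter`) and the helper name `build_baseline` are illustrative assumptions, not the settings behind the reported number.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_baseline():
    """TF-IDF features fed into a logistic regression classifier.

    Hyperparameters here are assumptions chosen for illustration.
    """
    return make_pipeline(
        TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
        LogisticRegression(max_iter=1000),
    )
```

The returned pipeline accepts raw strings, so `build_baseline().fit(train_texts, train_labels)` followed by `.score(val_texts, val_labels)` yields an accuracy to compare against the fine-tuned model on the same validation split.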
## Intended Use
Product review analysis, feedback classification, and general English sentiment tasks.
## Limitations and Bias
- Trained only on English movie reviews; performance on other domains may vary
- May not handle Urdu, Roman Urdu, or code-switched text well
- Sarcasm with no obvious negative words may be misclassified
- Very short texts (under 5 words) have lower confidence scores
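One way to act on the last point is to route short or low-confidence inputs to a fallback instead of trusting the label. The sketch below is a hypothetical post-processing helper, not part of the model; the `flag_uncertain` name, the `0.75` score threshold, and the `"UNCERTAIN"` label are assumptions, while the 5-word cutoff follows the limitation listed above.

```python
def flag_uncertain(text, prediction, min_words=5, min_score=0.75):
    """Return the predicted label, or "UNCERTAIN" for short or low-score inputs.

    `prediction` is one dict in the pipeline's output format:
    {"label": ..., "score": ...}. Both thresholds are illustrative assumptions.
    """
    too_short = len(text.split()) < min_words
    low_score = prediction["score"] < min_score
    return "UNCERTAIN" if too_short or low_score else prediction["label"]
```

Texts flagged this way can be sent for manual review or skipped, depending on the application.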
## How to Use
```python
from transformers import pipeline

# Replace YOUR-USERNAME with the Hub account that hosts the model.
classifier = pipeline('text-classification', model='YOUR-USERNAME/distilbert-imdb-sentiment')

result = classifier('This movie was absolutely incredible!')
print(result)
# Example output: [{'label': 'POSITIVE', 'score': 0.997}]
```