---
language: en
license: apache-2.0
tags:
- text-classification
- sentiment-analysis
- distilbert
- fine-tuned
datasets:
- imdb
metrics:
- accuracy
- f1
---
# DistilBERT IMDb Sentiment Classifier
A fine-tuned DistilBERT model for binary sentiment analysis on movie reviews.
## Model Description
This model was fine-tuned from `distilbert-base-uncased` on 5,000 IMDb movie
reviews for 3 epochs. It classifies text as POSITIVE or NEGATIVE.
## Training Data
- Source: IMDb Large Movie Review Dataset (stored in SQLite, queried with pandas)
- Train: 5,000 samples | Validation: 1,000 samples
- Label balance: approximately 50% positive, 50% negative
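
The card does not show how the split was drawn. The following is a minimal sketch of one way to pull a label-balanced train/validation split out of SQLite. It assumes a hypothetical table `reviews(review_text, label)` with `label` 1 for positive and 0 for negative, and it uses the standard-library `sqlite3` module instead of pandas to stay dependency-free; the same query could be run through `pandas.read_sql_query`. The function name, table name, and seed are illustrative, not taken from the original training code.

```python
import random
import sqlite3


def balanced_split(conn, n_train, n_val, seed=42):
    """Return (train, val) lists of (review_text, label) rows with equal class counts."""
    rng = random.Random(seed)
    half_train, half_val = n_train // 2, n_val // 2
    train, val = [], []
    for label in (0, 1):
        # Hypothetical schema: reviews(review_text TEXT, label INTEGER)
        rows = conn.execute(
            "SELECT review_text, label FROM reviews WHERE label = ? ORDER BY rowid",
            (label,),
        ).fetchall()
        if len(rows) < half_train + half_val:
            raise ValueError(f"not enough rows with label {label}")
        rng.shuffle(rows)
        # Disjoint slices keep validation rows out of the training set.
        train.extend(rows[:half_train])
        val.extend(rows[half_train:half_train + half_val])
    rng.shuffle(train)
    rng.shuffle(val)
    return train, val


# Tiny in-memory demo with 40 synthetic rows (20 per class).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (review_text TEXT, label INTEGER)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [(f"review {i}", i % 2) for i in range(40)],
)
train, val = balanced_split(conn, n_train=10, n_val=4)
print(len(train), len(val))  # 10 4
```

With the sizes stated above, the call would be `balanced_split(conn, n_train=5000, n_val=1000)`.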
## Evaluation Results
| Metric | Score |
|----------|--------|
| Accuracy | 88.4% |
| F1 Score | 0.893 |

Note: the scores above are template placeholders; replace them with your measured values.
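
For reference, accuracy and F1 can be computed from true and predicted labels as sketched below. The card does not say which F1 variant was reported; this sketch assumes binary F1 with POSITIVE (encoded as 1) as the positive class. The function name and label encoding are illustrative.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, binary F1) for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives at all.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(accuracy_and_f1(y_true, y_pred))  # (0.75, 0.75)
```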
## Baseline Comparison
| Model | Accuracy |
|--------------------------------|----------|
| TF-IDF + Logistic Regression | 86.4% |
| DistilBERT (this model) | 92.3% |
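
The card gives no details on how the baseline was configured. A minimal sketch of a TF-IDF + Logistic Regression classifier with scikit-learn might look like the following; the toy corpus, the `ngram_range`, and `max_iter` are assumptions for illustration, not the settings behind the reported score.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for the IMDb training split (1 = positive, 0 = negative).
train_texts = [
    "a wonderful, moving film with great acting",
    "brilliant story and a fantastic cast",
    "dull, boring and far too long",
    "terrible script and awful pacing",
]
train_labels = [1, 1, 0, 0]

# Assumed settings: unigrams + bigrams, default L2-regularized logistic regression.
baseline = make_pipeline(
    TfidfVectorizer(lowercase=True, ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
baseline.fit(train_texts, train_labels)

preds = baseline.predict(["a fantastic and moving story"])
print(preds)
```

For a fair comparison, the baseline should be fit and scored on the same train and validation splits as the fine-tuned model.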
## Intended Use
Product review analysis, feedback classification, general English sentiment tasks.
## Limitations and Bias
- Trained only on English movie reviews; performance on other domains may vary
- May not handle Urdu, Roman Urdu, or code-switched text well
- Sarcasm with no obvious negative words may be misclassified
- Very short texts (under 5 words) have lower confidence scores
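
One way to act on the limitations above is to flag predictions that are likely unreliable before using them downstream. The sketch below is a hypothetical helper, not part of the model or of `transformers`: it takes the input text and one prediction dict in the pipeline's output format, and flags inputs under 5 words or with a score below an assumed threshold of 0.80. The threshold is a placeholder to tune on your own data.

```python
def needs_review(text, prediction, min_words=5, min_score=0.80):
    """Return True if a prediction should be double-checked by a human.

    `prediction` is one dict in the pipeline output format,
    e.g. {'label': 'POSITIVE', 'score': 0.997}.
    """
    too_short = len(text.split()) < min_words       # very short inputs are less reliable
    low_confidence = prediction["score"] < min_score  # assumed threshold, tune as needed
    return too_short or low_confidence


print(needs_review("Great film", {"label": "POSITIVE", "score": 0.95}))  # True
print(needs_review(
    "This movie was absolutely incredible from start to finish",
    {"label": "POSITIVE", "score": 0.997},
))  # False
```

A score filter does not catch confidently wrong predictions, such as sarcasm, so it complements rather than replaces manual spot checks.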
## How to Use
```python
from transformers import pipeline

# Replace YOUR-USERNAME with the account that hosts the model on the Hugging Face Hub.
classifier = pipeline('text-classification', model='YOUR-USERNAME/distilbert-imdb-sentiment')
result = classifier('This movie was absolutely incredible!')
print(result)
# Example output: [{'label': 'POSITIVE', 'score': 0.997}]
```