Fake News Detection Model
A RoBERTa-based text classification model for detecting fake news articles. This model was evaluated on the WELFake dataset as part of a final year project focused on misinformation detection using NLP.
Model Details
Model Description
- Developed by: Akan Brown
- Model type: RoBERTa (Robustly Optimized BERT Pretraining Approach)
- Language: English
- License: MIT
- Base model: jy46604790/Fake-News-Bert-Detect
- Fine-tuned from: FacebookAI/roberta-large
Model Sources
- Base Model Repository: https://huggingface.co/jy46604790/Fake-News-Bert-Detect
- Dataset: https://www.kaggle.com/datasets/saurabhshahane/fake-news-classification
Uses
Direct Use
This model can be used directly for binary classification of news articles as either FAKE or REAL without any additional fine-tuning.
from transformers import pipeline

# Load the classifier; inputs longer than 512 tokens are truncated.
classifier = pipeline(
    "text-classification",
    model="Akan-Brown/fakenews-detector",
    truncation=True,
    max_length=512,
)

result = classifier("Your news article text here")[0]
label = "REAL" if result["label"] == "LABEL_1" else "FAKE"
print(f"Prediction: {label} ({result['score']:.2%} confidence)")
Downstream Use
- Fake news detection APIs
- Browser extensions for real-time news verification
- Media literacy tools and platforms
- Academic research on misinformation
Out-of-Scope Use
- Non-English news articles
- Satire detection (satirical news may be misclassified as fake)
- Real-time social media posts (model was trained on article-length text)
- Opinion pieces and editorials
Label Mapping
| Label | Meaning |
|---|---|
| LABEL_0 | FAKE news |
| LABEL_1 | REAL news |
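As a minimal sketch of applying the Label Mapping table in code: the helper below converts a pipeline-style prediction dict into a readable string. The names `LABEL_NAMES` and `readable_label` are illustrative assumptions, not part of the model or of `transformers`, and the example dict is hand-written rather than real model output.

```python
# Mapping copied from the Label Mapping table.
LABEL_NAMES = {"LABEL_0": "FAKE", "LABEL_1": "REAL"}


def readable_label(prediction: dict) -> str:
    """Format one prediction dict such as {"label": "LABEL_1", "score": 0.98}."""
    raw = prediction["label"]
    # Fall back to the raw label rather than guessing when it is not in the table.
    name = LABEL_NAMES.get(raw, raw)
    return f"{name} ({prediction['score']:.2%})"


# Hand-written example value, not real model output.
example = {"label": "LABEL_0", "score": 0.9731}
print(readable_label(example))
```

An explicit dictionary avoids silently treating every label other than `LABEL_1` as FAKE, which a bare `if/else` would do.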
Bias, Risks, and Limitations
- The model was trained primarily on English-language news data and may not generalise well to other languages or dialects.
- The WELFake dataset is biased toward US and UK political news, so performance may vary on news from other regions.
- The model may struggle with satire, opinion pieces, or highly technical journalism.
- As with all ML models, this model should not be used as the sole arbiter of whether a news article is fake or real.
Recommendations
This model should be used as a supplementary tool alongside human judgment and other fact-checking resources. It is not suitable for automated content moderation without human oversight.
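One way to keep a human in the loop is to route low-confidence predictions to a reviewer and treat the rest as advisory only. The sketch below assumes a pipeline-style prediction dict; the threshold of 0.90, the `triage` function, and the returned strings are hypothetical choices for illustration and would need tuning on validation data.

```python
# Hypothetical cut-off; not a value reported for this model.
REVIEW_THRESHOLD = 0.90

# Advisory outcomes keyed by the model's raw labels.
ADVISORY = {"LABEL_0": "likely_fake", "LABEL_1": "likely_real"}


def triage(prediction: dict, threshold: float = REVIEW_THRESHOLD) -> str:
    """Return an advisory label, or flag the item for human review."""
    advisory = ADVISORY.get(prediction["label"])
    # Unknown labels and low-confidence scores always go to a person.
    if advisory is None or prediction["score"] < threshold:
        return "needs_human_review"
    return advisory


# Hand-written example values, not real model output.
print(triage({"label": "LABEL_1", "score": 0.97}))
print(triage({"label": "LABEL_0", "score": 0.62}))
```

Even a `likely_fake` or `likely_real` result is a hint for a fact-checker, not a moderation decision.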
Training and Evaluation Details
Dataset
The model was evaluated on the WELFake dataset, a benchmark dataset for fake news detection containing 72,134 news articles:
- Real news: sourced from Reuters, The New York Times, The Washington Post, and The Guardian
- Fake news: sourced from various unreliable sources flagged by fact-checkers
- Split used: 80% train / 10% validation / 10% test
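An 80/10/10 split like the one listed above can be sketched with the standard library alone. The card does not say whether the split was stratified or which seed was used, so the per-class shuffling, the seed of 42, and the `stratified_split` helper are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=42):
    """Return (train, val, test) index lists with similar class balance in each."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = int(len(indices) * fractions[0])
        n_val = int(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        # Whatever remains after train and validation goes to the test split.
        test += indices[n_train + n_val:]
    return train, val, test


# Toy labels (0 = fake, 1 = real), not the actual dataset.
toy_labels = [0] * 50 + [1] * 50
train_idx, val_idx, test_idx = stratified_split(toy_labels)
print(len(train_idx), len(val_idx), len(test_idx))
```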
Evaluation Results on WELFake (1,000-sample test)
| Metric | Score |
|---|---|
| Accuracy | reported after evaluation |
| F1 Score (weighted) | reported after evaluation |
| Precision | reported after evaluation |
| Recall | reported after evaluation |
The values above are placeholders; the actual scores are to be filled in from the evaluation notebook output (Cell 7).
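For reference, the four metrics in the table can be computed from true and predicted labels as sketched below. The card specifies weighted averaging only for F1; applying the same support-weighted average to precision and recall is an assumption here, and `classification_metrics` is an illustrative helper, not the project's evaluation code. The toy labels are made up and say nothing about the model's real performance.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / total

    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = support[label] / total  # weight each class by its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (0 = FAKE, 1 = REAL), not real evaluation data.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(toy_true, toy_pred))
```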
How to Get Started
Basic prediction
from transformers import pipeline

# Load the classifier; inputs longer than 512 tokens are truncated.
classifier = pipeline(
    "text-classification",
    model="Akan-Brown/fakenews-detector",
    truncation=True,
    max_length=512,
)

articles = [
    "Federal Reserve raises interest rates by 0.25% amid inflation concerns.",
    "SHOCKING: Government putting microchips in vaccines, whistleblower reveals!!!",
]

# Classify each article and print the label, confidence, and a short preview.
for article in articles:
    result = classifier(article)[0]
    label = "REAL ✅" if result["label"] == "LABEL_1" else "FAKE ❌"
    print(f"{label} ({result['score']:.2%}) — {article[:60]}")
Using the Inference API
import os

from huggingface_hub import InferenceClient

# The access token is read from the HF_TOKEN environment variable.
client = InferenceClient(
    provider="hf-inference",
    api_key=os.environ["HF_TOKEN"],
)

result = client.text_classification(
    "Your news article text here",
    model="Akan-Brown/fakenews-detector",
)
print(result)  # list of label/score predictions
Project Context
This model was developed as part of a Final Year Project on automated fake news detection using transformer-based NLP models. The project explores the use of pre-trained language models for identifying misinformation in online news articles.
The full system includes:
- A fine-tuned RoBERTa classification model (this repo)
- A FastAPI backend deployed on Render
- A web frontend for real-time news verification
Citation
If you use this model in your research, please cite the WELFake dataset:
@article{verma2021WELFake,
  title={WELFake: Word Embedding over Linguistic Features for Fake News Detection},
  author={Verma, Pawan Kumar and Agrawal, Prateek and Amorim, Ivone and Prodan, Radu},
  journal={IEEE Transactions on Computational Social Systems},
  year={2021}
}
Model Card Contact
Akan Brown — Final Year Project, 2026