Fake News Detection Model

A RoBERTa-based text classification model for detecting fake news articles. This model was evaluated on the WELFake dataset as part of a final year project focused on misinformation detection using NLP.

Model Details

Model Description

  • Developed by: Akan Brown
  • Model type: RoBERTa (Robustly Optimized BERT Pretraining Approach)
  • Language: English
  • License: MIT
  • Base model: jy46604790/Fake-News-Bert-Detect
  • Fine-tuned from: FacebookAI/roberta-large

Uses

Direct Use

This model can be used directly for binary classification of news articles as either FAKE or REAL without any additional fine-tuning.

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Akan-Brown/fakenews-detector",
    truncation=True,
    max_length=512
)

result = classifier("Your news article text here")[0]
label = "REAL" if result["label"] == "LABEL_1" else "FAKE"
print(f"Prediction: {label} ({result['score']:.2%} confidence)")
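Because the pipeline truncates input at 512 tokens, the tail of long articles is silently ignored. One workaround, sketched here with hypothetical helpers (`chunk_words` and `classify_long` are illustrative, not part of this repo), is to split the article into overlapping word windows, score each window, and average the probability of REAL:

```python
def chunk_words(text, window=350, overlap=50):
    """Split text into overlapping word windows (a rough proxy for token windows)."""
    words = text.split()
    step = window - overlap
    return [
        " ".join(words[i:i + window])
        for i in range(0, max(len(words) - overlap, 1), step)
    ]

def classify_long(classifier, text):
    """Average P(REAL) across chunks; `classifier` is the pipeline defined above."""
    real_scores = []
    for result in (classifier(chunk)[0] for chunk in chunk_words(text)):
        p = result["score"]
        # Convert to P(REAL) regardless of which label won.
        real_scores.append(p if result["label"] == "LABEL_1" else 1.0 - p)
    p_real = sum(real_scores) / len(real_scores)
    return ("REAL" if p_real >= 0.5 else "FAKE", p_real)
```

The window and overlap sizes are assumptions; word counts only approximate subword token counts, so leave headroom below the 512-token limit.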

Downstream Use

  • Fake news detection APIs
  • Browser extensions for real-time news verification
  • Media literacy tools and platforms
  • Academic research on misinformation

Out-of-Scope Use

  • Non-English news articles
  • Satire detection (satirical news may be misclassified as fake)
  • Real-time social media posts (model was trained on article-length text)
  • Opinion pieces and editorials

Label Mapping

Label     Meaning
LABEL_0   FAKE news
LABEL_1   REAL news
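The mapping can be applied with a small helper (illustrative only; the names `LABELS` and `to_verdict` are not part of the repo):

```python
# Label mapping taken from the table above.
LABELS = {"LABEL_0": "FAKE", "LABEL_1": "REAL"}

def to_verdict(result):
    """Map a pipeline result dict to a human-readable verdict."""
    return {"verdict": LABELS[result["label"]], "confidence": result["score"]}
```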

Bias, Risks, and Limitations

  • The model was trained primarily on English-language news data and may not generalise well to other languages or dialects.
  • The WELFake dataset has a bias toward US and UK political news — performance may vary on news from other regions.
  • The model may struggle with satire, opinion pieces, or highly technical journalism.
  • As with all ML models, this should not be used as the sole arbiter of whether a news article is fake or real.

Recommendations

This model should be used as a supplementary tool alongside human judgment and other fact-checking resources. It is not suitable for automated content moderation without human oversight.
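One way to enforce that oversight is a confidence threshold that routes uncertain predictions to a human reviewer. A minimal sketch, assuming a hypothetical `triage` helper and a threshold you would tune on validation data:

```python
REVIEW_THRESHOLD = 0.90  # assumption: tune this on held-out validation data

def triage(result, threshold=REVIEW_THRESHOLD):
    """Route low-confidence pipeline predictions to human review."""
    label = "REAL" if result["label"] == "LABEL_1" else "FAKE"
    action = "human_review" if result["score"] < threshold else "auto"
    return {"label": label, "action": action}
```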


Training and Evaluation Details

Dataset

The model was evaluated on the WELFake dataset, a benchmark dataset for fake news detection containing 72,134 news articles:

  • Real news: sourced from Reuters, The New York Times, The Washington Post, and The Guardian
  • Fake news: sourced from various unreliable sources flagged by fact-checkers
  • Split used: 80% train / 10% validation / 10% test
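An 80/10/10 split like the one above can be reproduced with a seeded shuffle. This is a sketch, not the project's actual split code:

```python
import random

def split_indices(n, seed=42):
    """Return (train, val, test) index lists in an 80/10/10 ratio."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded for reproducibility
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

Fixing the seed matters: without it, re-running the split leaks test articles into training.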

Evaluation Results on WELFake (1,000-sample test)

Metric               Score
Accuracy             (to be reported)
F1 Score (weighted)  (to be reported)
Precision            (to be reported)
Recall               (to be reported)

Metrics will be filled in once evaluation on the held-out test sample is complete.


How to Get Started

Basic prediction

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Akan-Brown/fakenews-detector",
    truncation=True,
    max_length=512
)

articles = [
    "Federal Reserve raises interest rates by 0.25% amid inflation concerns.",
    "SHOCKING: Government putting microchips in vaccines, whistleblower reveals!!!",
]

for article in articles:
    result = classifier(article)[0]
    label = "REAL ✅" if result["label"] == "LABEL_1" else "FAKE ❌"
    print(f"{label} ({result['score']:.2%}) — {article[:60]}")

Using the Inference API

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hf-inference",
    api_key=os.environ["HF_TOKEN"],
)

result = client.text_classification(
    "Your news article text here",
    model="Akan-Brown/fakenews-detector",
)

Project Context

This model was developed as part of a Final Year Project on automated fake news detection using transformer-based NLP models. The project explores the use of pre-trained language models for identifying misinformation in online news articles.

The full system includes:

  • A fine-tuned RoBERTa classification model (this repo)
  • A FastAPI backend deployed on Render
  • A web frontend for real-time news verification

Citation

If you use this model in your research, please cite the WELFake dataset:

@article{verma2021WELFake,
  title={{WELFake}: Word Embedding Over Linguistic Features for Fake News Detection},
  author={Verma, Pawan Kumar and Agrawal, Prateek and Amorim, Ivone and Prodan, Radu},
  journal={IEEE Transactions on Computational Social Systems},
  year={2021}
}

Model Card Contact

Akan Brown — Final Year Project, 2026
