AI-Fake_News_Detection_Roberta

Fake News Detection Model

A RoBERTa-based text classification model for detecting fake news articles. This model was evaluated on the WELFake dataset as part of a final year project focused on misinformation detection using NLP.

Model Details

Model Description

Developed by: Akan Brown
Model type: RoBERTa (Robustly Optimized BERT Pretraining Approach)
Language: English
License: MIT
Base model: jy46604790/Fake-News-Bert-Detect
Fine-tuned from: FacebookAI/roberta-base

Model Sources

Base Model Repository: https://huggingface.co/jy46604790/Fake-News-Bert-Detect
Dataset: https://www.kaggle.com/datasets/saurabhshahane/fake-news-classification

Uses

Direct Use

This model can be used directly for binary classification of news articles as either FAKE or REAL without any additional fine-tuning.

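from transformers import pipeline

# Load the classifier; inputs longer than 512 tokens are truncated.
classifier = pipeline(
    "text-classification",
    model="Akan-Brown/fakenews-detector",
    truncation=True,
    max_length=512
)

# The pipeline returns a list with one dict per input: {"label": ..., "score": ...}.
result = classifier("Your news article text here")[0]
# Map the raw label to a readable name (LABEL_1 = REAL, LABEL_0 = FAKE).
label = "REAL" if result["label"] == "LABEL_1" else "FAKE"
print(f"Prediction: {label} ({result['score']:.2%} confidence)")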
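With truncation=True and max_length=512, only the first 512 tokens of an article are scored. One possible workaround, which is not part of this model's documented usage, is to score a long article in windows and average the per-label scores. The sketch below assumes a hypothetical `classify_fn` wrapper that returns a score for every label, and it uses a word count (`window_words`) as a rough stand-in for the token limit.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def split_into_windows(text: str, window_words: int = 300) -> List[str]:
    """Split text into consecutive, non-overlapping windows of at most window_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + window_words])
        for i in range(0, len(words), window_words)
    ]


def classify_long_article(
    text: str,
    classify_fn: Callable[[str], Dict[str, float]],
    window_words: int = 300,
) -> Dict[str, float]:
    """Average per-label scores over every window of the article.

    classify_fn is a hypothetical wrapper (not provided by this repo) that
    maps a text chunk to {"LABEL_0": score, "LABEL_1": score}.
    """
    windows = split_into_windows(text, window_words)
    if not windows:
        raise ValueError("text must contain at least one word")

    totals: Dict[str, float] = defaultdict(float)
    for window in windows:
        for label, score in classify_fn(window).items():
            totals[label] += score

    # Unweighted mean: every window counts equally, even a short final one.
    return {label: total / len(windows) for label, total in totals.items()}
```

Assuming a recent transformers version in which passing `top_k=None` returns scores for all labels, `classify_fn` could be written as `lambda chunk: {r["label"]: r["score"] for r in classifier(chunk, top_k=None)}`. Averaging is only one aggregation choice; the effect on accuracy has not been measured here.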

Downstream Use

Fake news detection APIs
Browser extensions for real-time news verification
Media literacy tools and platforms
Academic research on misinformation

Out-of-Scope Use

Non-English news articles
Satire detection (satirical news may be misclassified as fake)
Real-time social media posts (model was trained on article-length text)
Opinion pieces and editorials

Label Mapping

Label	Meaning
`LABEL_0`	FAKE news
`LABEL_1`	REAL news
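
The mapping above can be wrapped in a small helper so that an unexpected label raises an error instead of silently falling through to FAKE. This is a minimal sketch; `LABEL_NAMES` and `readable_prediction` are illustrative names, and the input is assumed to be one result dict as returned by the text-classification pipeline.

```python
# Mapping taken from the label table; the names below are illustrative.
LABEL_NAMES = {"LABEL_0": "FAKE", "LABEL_1": "REAL"}


def readable_prediction(result: dict) -> str:
    """Format one pipeline result dict, e.g. {"label": "LABEL_1", "score": 0.97}."""
    raw_label = result["label"]
    if raw_label not in LABEL_NAMES:
        # Fail loudly rather than guessing what an unknown label means.
        raise ValueError(f"Unexpected label: {raw_label!r}")
    return f"{LABEL_NAMES[raw_label]} ({result['score']:.2%} confidence)"
```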

Bias, Risks, and Limitations

The model was trained primarily on English-language news data and may not generalise well to other languages or dialects.
The WELFake dataset skews toward US and UK political news, so performance may vary on news from other regions.
The model may struggle with satire, opinion pieces, or highly technical journalism.
As with all ML models, this should not be used as the sole arbiter of whether a news article is fake or real.

Recommendations

This model should be used as a supplementary tool alongside human judgment and other fact-checking resources. It is not suitable for automated content moderation without human oversight.
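
One way to keep a human in the loop is to treat every prediction as advisory and flag low-confidence ones for priority review. The sketch below is an illustration only: the `REVIEW_THRESHOLD` value of 0.90 and the `triage` helper are assumptions, not settings validated for this model.

```python
LABEL_NAMES = {"LABEL_0": "FAKE", "LABEL_1": "REAL"}

# Hypothetical cut-off; it has not been tuned or validated for this model.
REVIEW_THRESHOLD = 0.90


def triage(result: dict, review_threshold: float = REVIEW_THRESHOLD) -> dict:
    """Attach a review flag to one pipeline result dict.

    The output is advisory: a reviewer still makes the final call, and
    predictions scoring below the threshold are flagged to be checked first.
    """
    return {
        "label": LABEL_NAMES[result["label"]],
        "score": result["score"],
        "priority_review": result["score"] < review_threshold,
    }
```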

Training and Evaluation Details

Dataset

The model was evaluated on the WELFake dataset, a benchmark dataset for fake news detection containing 72,134 news articles:

Real news: sourced from Reuters, The New York Times, The Washington Post, and The Guardian
Fake news: sourced from various unreliable sources flagged by fact-checkers
Split used: 80% train / 10% validation / 10% test
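
An 80/10/10 split over 72,134 articles can be sketched as follows. Whether the original split was shuffled, stratified, or seeded is not stated, so the shuffle and the seed value here are assumptions, and `split_indices` is an illustrative helper rather than the project's code.

```python
import random


def split_indices(n_items, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle item indices and cut them into train / validation / test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible

    n_train = int(n_items * train_frac)
    n_val = int(n_items * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains after train and val
    return train, val, test


train_idx, val_idx, test_idx = split_indices(72_134)
print(len(train_idx), len(val_idx), len(test_idx))  # 57707 7213 7214
```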

Evaluation Results on WELFake (1,000-sample test)

Metric	Score
Accuracy	not yet reported
F1 Score (weighted)	not yet reported
Precision	not yet reported
Recall	not yet reported

Scores are pending; they are to be filled in from the evaluation notebook output (Cell 7).
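
For reference, the four metrics in the table can be computed from true and predicted labels as sketched below. The table only says that F1 is weighted; this sketch assumes support-weighted averaging for precision and recall as well. The labels in the example are made-up toy values, not results from this model.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        class_precision = tp / (tp + fp) if tp + fp else 0.0
        class_recall = tp / count
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0

        weight = count / n  # weight each class by its share of true labels
        precision += weight * class_precision
        recall += weight * class_recall
        f1 += weight * class_f1

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (1 = REAL, 0 = FAKE).
toy_true = [1, 1, 1, 1, 0, 0]
toy_pred = [1, 1, 1, 1, 0, 1]
print(weighted_metrics(toy_true, toy_pred))
```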

How to Get Started

Basic prediction

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Akan-Brown/fakenews-detector",
    truncation=True,
    max_length=512
)

articles = [
    "Federal Reserve raises interest rates by 0.25% amid inflation concerns.",
    "SHOCKING: Government putting microchips in vaccines, whistleblower reveals!!!",
]

for article in articles:
    result = classifier(article)[0]
    label = "REAL ✅" if result["label"] == "LABEL_1" else "FAKE ❌"
    print(f"{label} ({result['score']:.2%}) — {article[:60]}")

Using the Inference API

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hf-inference",
    api_key=os.environ["HF_TOKEN"],
)

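result = client.text_classification(
    "Your news article text here",
    model="Akan-Brown/fakenews-detector",
)

# The call returns a list of predictions, each with a label and a score.
for prediction in result:
    print(f"{prediction.label}: {prediction.score:.2%}")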

Project Context

This model was developed as part of a Final Year Project on automated fake news detection using transformer-based NLP models. The project explores the use of pre-trained language models for identifying misinformation in online news articles.

The full system includes:

A fine-tuned RoBERTa classification model (this repo)
A FastAPI backend deployed on Render
A web frontend for real-time news verification

Citation

If you use this model in your research, please cite the WELFake dataset:

@article{verma2021WELFake,
  title={WELFake: Word Embedding over Linguistic Features for Fake News Detection},
  author={Verma, Pawan Kumar and Agrawal, Prateek and Amorim, Ivone and Prodan, Radu},
  journal={IEEE Transactions on Computational Social Systems},
  volume={8},
  number={4},
  pages={881--893},
  year={2021}
}

Model Card Contact

Akan Brown — Final Year Project, 2026
