TruthLens – Fake News Detection Model (RoBERTa)

TruthLens is a RoBERTa-based Transformer model for fake news detection, fine-tuned by Divyanshu Chauhan.
It classifies news articles and social media text as REAL or FAKE and returns a confidence score with each prediction.

It was trained on the WELFake dataset, which contains over 72,000 labeled news samples and was balanced and cleaned for text classification.

The model powers the TruthLens Flask application, which performs real-time misinformation detection.

🧠 Model Details

Model Developer

Developed by: Divyanshu Chauhan
Specialization: Artificial Intelligence & Machine Learning
Project Purpose: Real-time misinformation detection using modern NLP

Model Type

Architecture: RoBERTa-base (Transformer)
Task: Binary Text Classification (Fake vs Real)
Framework: PyTorch + HuggingFace Transformers
Fine-tuned Model: divyanshu-chauhan-7786/fake-news-roberta

Model Capabilities

Detects whether a news text is REAL or FAKE
Provides a probability/confidence score
Works on short and long news text (inputs longer than 256 tokens are truncated)
Handles social media misinformation patterns
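
For a two-class classifier, the confidence score is typically the softmax of the two output logits. The following is a minimal sketch of that calculation, assuming the label order listed under Dataset (0 = Fake, 1 = Real). The function names and the logit values are illustrative and are not taken from the TruthLens code.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_prediction(logits, labels=("FAKE", "REAL")):
    """Return the winning label and its probability (the confidence score)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Illustrative logits only, not real model output.
label, confidence = to_prediction([-1.2, 2.3])
print(label, round(confidence, 3))
```

The larger the gap between the two logits, the closer the confidence gets to 1. Equal logits give a confidence of 0.5.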

Training Details

✔ Dataset

Name: WELFake
Samples: ~72,000
Balanced: Yes (after synthetic balancing)
Labels:
- 0 = Fake
- 1 = Real
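
If the model configuration does not define readable label names, Transformers falls back to generic names such as `LABEL_0` and `LABEL_1`. The helper below is a small sketch that maps either form back to the scheme above. Whether this checkpoint returns generic or readable names is an assumption to check against its actual output, and `readable_label` is a hypothetical helper name.

```python
# Mapping taken from the label scheme above: 0 = Fake, 1 = Real.
ID2LABEL = {0: "FAKE", 1: "REAL"}

def readable_label(raw_label):
    """Map 'LABEL_0' / 'LABEL_1' (or an int id) to 'FAKE' / 'REAL'."""
    if isinstance(raw_label, int):
        return ID2LABEL[raw_label]
    if raw_label.upper() in ID2LABEL.values():
        return raw_label.upper()  # already human-readable
    # Generic names end in the class id, e.g. "LABEL_1" -> 1
    return ID2LABEL[int(raw_label.rsplit("_", 1)[-1])]

# Hypothetical pipeline-style output, for illustration only.
example = {"label": "LABEL_0", "score": 0.98}
print(readable_label(example["label"]))
```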

Data Preprocessing Performed

Removed null entries
Merged title + text → combined_text
Cleaned URLs, numbers, special characters
Lowercasing and stopword removal
Train/Test split
Used the original (uncleaned) text for Transformer fine-tuning, since the RoBERTa tokenizer works directly on raw text
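
The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the author's actual script. The column names `title`, `text` and `label` follow the WELFake CSV, while the stopword set, the regular expressions and the `clean_text` field name are hypothetical.

```python
import re

# Tiny illustrative stopword set; the list actually used is not stated.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and"}

def clean_text(text):
    """Strip URLs, numbers and special characters, lowercase, drop stopwords."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"[^A-Za-z\s]", " ", text)            # numbers, punctuation
    tokens = text.lower().split()
    return " ".join(t for t in tokens if t not in STOPWORDS)

def preprocess(rows):
    """Drop rows with missing fields and build combined and cleaned text."""
    out = []
    for row in rows:
        if not row.get("title") or not row.get("text") or row.get("label") is None:
            continue  # remove null entries
        combined = f"{row['title']} {row['text']}"
        out.append({
            "combined_text": combined,           # original text for the tokenizer
            "clean_text": clean_text(combined),  # cleaned variant
            "label": row["label"],
        })
    return out

rows = [
    {"title": "Reforms announced",
     "text": "See https://example.com for the 2024 plan!", "label": 1},
    {"title": None, "text": "Missing title", "label": 0},
]
processed = preprocess(rows)
print(processed)
```

Keeping both fields side by side makes it easy to feed `combined_text` to the Transformer tokenizer while the cleaned variant stays available for other uses.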

Training Setup

Base Model: roberta-base
Epochs: 3
Max Length: 256 tokens
Batch Size: 8
Learning Rate: 2e-5
Optimizer: AdamW
Loss: Cross-Entropy
Hardware: GPU (Google Colab / Tesla T4)
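
To get a feel for what this setup implies, the sketch below collects the listed hyperparameters in a small config object and works out the number of optimizer steps. The 80/20 train/test split is an assumption made only for the arithmetic, since the card does not state the split ratio. `TrainConfig` and `total_steps` are hypothetical names.

```python
import math
from dataclasses import dataclass

@dataclass
class TrainConfig:
    # Values taken from the Training Setup list above.
    base_model: str = "roberta-base"
    epochs: int = 3
    max_length: int = 256
    batch_size: int = 8
    learning_rate: float = 2e-5

def total_steps(num_train_samples, cfg):
    """Optimizer steps = batches per epoch times epochs (no gradient accumulation)."""
    steps_per_epoch = math.ceil(num_train_samples / cfg.batch_size)
    return steps_per_epoch * cfg.epochs

cfg = TrainConfig()
# ASSUMPTION: 80% of ~72,000 samples go to training; the real split is not stated.
n_train = 72_000 * 8 // 10
print(n_train, total_steps(n_train, cfg))
```

Under that assumed split, 57,600 training samples at batch size 8 give 7,200 steps per epoch, or 21,600 steps over 3 epochs.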

Model Performance

Metric	Score
Accuracy	~95%
F1 Score	~95%
Precision	~95%
Recall	~95%

With scores of roughly 95% on all four metrics, the model is a candidate for production-level applications.
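
The four metrics can be recomputed from predictions on a held-out test set. The sketch below shows the calculation in plain Python. The card does not say which class was treated as positive or how scores were averaged, so the `positive` parameter (defaulting to 0 = Fake) is an assumption, and the toy labels are made up.

```python
def binary_metrics(y_true, y_pred, positive=0):
    """Accuracy, precision, recall and F1 with `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy labels for illustration only (0 = Fake, 1 = Real).
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```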

Usage Example

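# Load the fine-tuned tokenizer and model from the Hugging Face Hub
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_path = "divyanshu-chauhan-7786/fake-news-roberta"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

# Wrap both in a text-classification pipeline
clf = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Truncate inputs to the 256-token length used during training
text = "Breaking: Government announces new education reforms!"
result = clf(text, truncation=True, max_length=256)

# Prints a list with one dict holding the predicted label and its score
print(result)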
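
Because inputs are truncated at 256 tokens, only the beginning of a long article is seen by the model in the example above. One possible workaround, sketched below, is to split the article into overlapping word windows, score each window, and average the results. This is not part of the TruthLens code. The window size of 150 words, the overlap, and the `fake_prob_fn` callable are assumptions, and the dummy scorer stands in for a real call to the pipeline.

```python
def chunk_words(text, size=150, overlap=30):
    """Split text into overlapping windows of at most `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    chunks = []
    step = size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks

def mean_fake_probability(text, fake_prob_fn, **chunk_kwargs):
    """Average a per-chunk P(fake) over all chunks of a long text."""
    chunks = chunk_words(text, **chunk_kwargs)
    if not chunks:
        raise ValueError("empty text")
    return sum(fake_prob_fn(c) for c in chunks) / len(chunks)

# Stand-in scorer for demonstration; a real one would call the pipeline.
def dummy_fake_prob(chunk):
    return 0.9 if "shocking" in chunk.lower() else 0.1

article = " ".join(["word"] * 399 + ["shocking"])
num_chunks = len(chunk_words(article))
score = mean_fake_probability(article, dummy_fake_prob)
print(num_chunks, score)
```

A word budget is only a rough proxy for the token count, so `truncation=True` should still be passed when each chunk is sent to the classifier.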


Format: Safetensors
Model size: 0.1B params
Tensor type: F32
