TruthLens – Fake News Detection Model (RoBERTa)

TruthLens is a RoBERTa-based Transformer model for fake news detection, fine-tuned by Divyanshu Chauhan.
It analyzes news articles and social media text and classifies each input as REAL or FAKE, along with a confidence score.

It is trained on the WELFake dataset, which contains over 72,000 labeled news samples that were cleaned and balanced for text classification.

This model powers the TruthLens Flask application, which performs real-time misinformation detection.


🧠 Model Details

Model Developer

  • Developed by: Divyanshu Chauhan
  • Specialization: Artificial Intelligence & Machine Learning
  • Project Purpose: Real-time misinformation detection using modern NLP

Model Type

  • Architecture: RoBERTa-base (Transformer)
  • Task: Binary Text Classification (Fake vs Real)
  • Framework: PyTorch + HuggingFace Transformers
  • Fine-tuned Model: divyanshu-chauhan-7786/fake-news-roberta

Model Capabilities

  • Detects whether a news text is REAL or FAKE
  • Provides a probability/confidence score
  • Works on short and long news text
  • Handles social media misinformation patterns
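
The confidence score can be read as a softmax over the two class logits, which is the usual way a two-class classification head is turned into probabilities. The sketch below is illustrative only: the `softmax` and `predict` helpers and the logit values are hypothetical, and the index order follows the label mapping given in this card (0 = Fake, 1 = Real).

```python
import math

LABELS = ["FAKE", "REAL"]  # index 0 = Fake, index 1 = Real, as listed in this card


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return (label, confidence) for a pair of class logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits, chosen only for illustration
label, confidence = predict([-1.2, 2.3])
print(label, round(confidence, 3))
```

The reported confidence is the probability of the winning class, so with two classes it always lies between 0.5 and 1.0.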

Training Details

✔ Dataset

  • Name: WELFake
  • Samples: ~72,000
  • Balanced: Yes (after synthetic balancing)
  • Labels:
    • 0 = Fake
    • 1 = Real
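
When decoding predictions, the integer labels above need to be mapped back to names. A minimal sketch, assuming the pipeline reports generic `LABEL_0` / `LABEL_1` strings (the default when a checkpoint's config defines no label names; this card does not say which case applies here). `ID2LABEL` and `readable` are hypothetical names.

```python
# Label mapping taken from this card: 0 = Fake, 1 = Real
ID2LABEL = {0: "FAKE", 1: "REAL"}


def readable(prediction):
    """Normalize one pipeline prediction dict to a FAKE/REAL label."""
    raw = prediction["label"]
    if raw.upper() in ID2LABEL.values():
        # The checkpoint already reports a named label
        name = raw.upper()
    else:
        # Assumed generic form "LABEL_<id>": take the trailing integer
        name = ID2LABEL[int(raw.rsplit("_", 1)[-1])]
    return {"label": name, "score": prediction["score"]}


# Hypothetical prediction, for illustration only
example = {"label": "LABEL_0", "score": 0.97}
print(readable(example))
```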

Data Preprocessing Performed

  • Removed null entries
  • Merged title + text → combined_text
  • Cleaned URLs, numbers, special characters
  • Lowercasing and stopword removal
  • Train/Test split
  • Used the original (uncleaned) text for Transformer fine-tuning, because the RoBERTa tokenizer processes raw text directly
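
The cleaning steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the author's actual script: the regular expressions, the tiny `STOPWORDS` set, the `clean_text` key, and the sample rows are all hypothetical. Note that, per the list, the cleaned text is kept separate from the original text used for fine-tuning.

```python
import re

# Tiny illustrative stopword list; a real run would use a fuller one
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "on"}


def clean(text):
    """Strip URLs, numbers, and special characters, then lowercase and drop stopwords."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # URLs
    text = re.sub(r"\d+", " ", text)                    # numbers
    text = re.sub(r"[^A-Za-z\s]", " ", text)            # special characters
    tokens = text.lower().split()
    return " ".join(t for t in tokens if t not in STOPWORDS)


# Hypothetical sample rows, for illustration only
rows = [
    {"title": "Markets rally",
     "text": "Stocks rose 3% on Monday, see https://example.com/report",
     "label": 1},
    {"title": None, "text": None, "label": 0},  # null entry, dropped below
]

# Remove null entries (assumed here to mean a missing title or text)
rows = [r for r in rows if r["title"] is not None and r["text"] is not None]

for r in rows:
    r["combined_text"] = f"{r['title']} {r['text']}"  # merge title + text
    r["clean_text"] = clean(r["combined_text"])       # cleaned copy; original is kept

print(rows[0]["clean_text"])
```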

Training Setup

  • Base Model: roberta-base
  • Epochs: 3
  • Max Length: 256 tokens
  • Batch Size: 8
  • Learning Rate: 2e-5
  • Optimizer: AdamW
  • Loss: Cross-Entropy
  • Hardware: GPU (Google Colab / Tesla T4)
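
One way these settings could be wired into the Hugging Face `Trainer` is sketched below. It is not the author's training script: the `text` column name, the `output_dir` value, and the helper names are assumptions, and the datasets passed in are assumed to be already tokenized with a `labels` column.

```python
# Hyperparameters copied from the list above
HYPERPARAMS = {
    "base_model": "roberta-base",
    "num_train_epochs": 3,
    "max_length": 256,
    "per_device_train_batch_size": 8,
    "learning_rate": 2e-5,
}


def tokenize_batch(tokenizer, batch):
    """Tokenize a batch of texts, padded and truncated to the max length."""
    return tokenizer(
        batch["text"],  # "text" is an assumed column name
        truncation=True,
        padding="max_length",
        max_length=HYPERPARAMS["max_length"],
    )


def build_trainer(train_dataset, eval_dataset, output_dir="truthlens-out"):
    """Build a Trainer for binary classification on pre-tokenized datasets."""
    # Imported here so the settings above can be inspected without the library
    from transformers import (
        AutoModelForSequenceClassification,
        Trainer,
        TrainingArguments,
    )

    # num_labels=2 gives a two-way head trained with cross-entropy loss
    model = AutoModelForSequenceClassification.from_pretrained(
        HYPERPARAMS["base_model"], num_labels=2
    )
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=HYPERPARAMS["num_train_epochs"],
        per_device_train_batch_size=HYPERPARAMS["per_device_train_batch_size"],
        learning_rate=HYPERPARAMS["learning_rate"],
    )
    # Trainer uses an AdamW optimizer by default
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
    )
```

Calling `build_trainer(train_ds, eval_ds).train()` would then start fine-tuning, where `train_ds` and `eval_ds` stand for whatever tokenized splits are available.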

Model Performance

Metric       Score
Accuracy     ~95%
F1 Score     ~95%
Precision    ~95%
Recall       ~95%

With all four metrics at roughly 95%, the model is intended for production-level applications.
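
For reference, the four metrics can be computed from true and predicted labels as sketched below. The card does not state which class was treated as positive or how scores were averaged, so `positive=1` (Real) is an assumption, and the sample labels are made up for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("no predictions to score")

    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels, for illustration only (0 = Fake, 1 = Real)
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```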


Usage Example

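from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Hugging Face Hub ID of the fine-tuned model
model_path = "divyanshu-chauhan-7786/fake-news-roberta"

# Load the tokenizer and the fine-tuned classifier from the Hub
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)

# Wrap both in a text-classification pipeline
clf = pipeline("text-classification", model=model, tokenizer=tokenizer)

text = "Breaking: Government announces new education reforms!"
# Truncate inputs to the 256-token limit used during training
result = clf(text, truncation=True, max_length=256)

# A list with one dict holding the predicted label and its score
print(result)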