political-lean-classifier (best_bias_model)

Fine-tuned roberta-base model for political bias score prediction from news text.

This checkpoint is trained as a regression model (num_labels=1) and outputs a single continuous score. In this project, target labels are derived from the bias field of the dataset and commonly fall in the 0 to 4 range, indicating leaning from "extreme left" to "extreme right". Because the output comes from an unbounded regression head, a predicted score can fall outside this range.

Model Details

  • Architecture: RobertaForSequenceClassification
  • Base model: roberta-base
  • Task type: text regression
  • Max sequence length: 512
  • Language: English
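Articles longer than the 512-token limit are truncated unless they are split first. The sketch below shows one possible workaround, which is an assumption and not part of this project's pipeline: split the token ids into overlapping windows, score each window, and average the results. The helper names, the 510-token window (512 minus the two special tokens RoBERTa adds to a single sequence), and the stride are all illustrative choices.

```python
from typing import List, Sequence


def split_into_windows(
    token_ids: Sequence[int], window: int = 510, stride: int = 255
) -> List[List[int]]:
    """Split token ids into overlapping windows of at most `window` tokens.

    Hypothetical helper: window=510 leaves room for the <s> and </s>
    tokens so each window fits the 512-token limit once they are added.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    ids = list(token_ids)
    if len(ids) <= window:
        return [ids]
    windows = []
    start = 0
    while True:
        windows.append(ids[start:start + window])
        if start + window >= len(ids):
            break  # the last window already reaches the end of the text
        start += stride
    return windows


def average_score(window_scores: Sequence[float]) -> float:
    """Combine per-window scores with a plain mean (an assumed strategy)."""
    if not window_scores:
        raise ValueError("need at least one window score")
    return sum(window_scores) / len(window_scores)
```

Since the model was trained on inputs capped at 512 tokens, averaged window scores may behave differently from scores on truncated text, so this approach would need its own validation.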

Training Data

  • Dataset: pietrolesci/hyperpartisan_news_detection
  • Split used: sampled subset of the train split
  • Rows used in this project: 50,000
  • Input text: concatenated article title + cleaned article body
  • Label: numeric bias score (bias)
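The input construction described above can be sketched roughly as follows. The exact cleaning steps used in this project are not documented here, so the tag stripping and whitespace normalization below are assumptions, as are the function names and the single-space separator.

```python
import re


def clean_body(body: str) -> str:
    """Hypothetical cleaning step: drop HTML-like tags, collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", body)
    return re.sub(r"\s+", " ", no_tags).strip()


def build_input_text(title: str, body: str, sep: str = " ") -> str:
    """Concatenate the article title and the cleaned body into one string.

    Empty parts are skipped so a missing title does not leave a
    leading separator.
    """
    parts = [part for part in (title.strip(), clean_body(body)) if part]
    return sep.join(parts)
```

For example, `build_input_text("Senate passes bill", "<p>The vote was 52-48.</p>")` produces a single string that can be passed to the tokenizer.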

Intended Use

Use this model to estimate the political bias tendency of English news text at the document level.

Potential use cases:

  • Media analysis dashboards
  • Content trend analysis
  • Research experiments on bias scoring

This model is not intended to be used as the sole basis for moderation, ranking, or policy decisions.

Quick Start

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "zhezhou1106/political-leaning-classifier"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

text = "Tax cuts for corporations will result in increased economic activity."
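# Tokenize the text, truncating anything beyond the 512-token limit
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# The regression head returns one logit per input; read it as the bias score
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()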

print("Predicted bias score:", score)
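If a discrete label is more convenient than the raw score, one option is to clamp the score to the 0 to 4 label range and round to the nearest integer. This is a minimal sketch under that assumption; `score_to_bucket` is a hypothetical helper, not something shipped with the model.

```python
import math

# Label range described in this card; predictions can fall outside it.
MIN_LABEL, MAX_LABEL = 0, 4


def score_to_bucket(score: float) -> int:
    """Clamp a raw regression score to [0, 4] and round to the nearest label.

    Uses floor(x + 0.5) so that ties round up, instead of Python's
    built-in round(), which rounds ties to the nearest even integer.
    """
    if math.isnan(score):
        raise ValueError("score must not be NaN")
    clamped = min(max(score, MIN_LABEL), MAX_LABEL)
    return int(math.floor(clamped + 0.5))
```

Rounding discards how close a score sits to a boundary, so keeping the continuous score alongside the bucket is usually worthwhile.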

Training Procedure

Key training settings from this project:

  • Learning rate: 2e-5
  • Epochs: 3
  • Train batch size: 16
  • Eval batch size: 16
  • Weight decay: 0.01
  • Evaluation strategy: every 100 steps
  • Checkpoint save strategy: every 1000 steps
  • Best model criterion: lowest MSE
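For readers who want to reproduce a similar setup, the settings above map onto Hugging Face `TrainingArguments` keyword arguments roughly as shown below. This is a sketch, not the project's actual training script: treating the batch sizes as per-device values, enabling `load_best_model_at_end`, and using the metric key `"mse"` are assumptions, and `mean_squared_error` is an illustrative helper.

```python
from typing import Dict, Sequence

# Keyword arguments that could be passed as TrainingArguments(**training_settings)
# together with an output_dir of your choice.
training_settings: Dict[str, object] = {
    "learning_rate": 2e-5,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,  # assumption: 16 is the per-device size
    "per_device_eval_batch_size": 16,   # assumption: 16 is the per-device size
    "weight_decay": 0.01,
    "eval_strategy": "steps",           # `evaluation_strategy` in older releases
    "eval_steps": 100,
    "save_strategy": "steps",
    "save_steps": 1000,
    "load_best_model_at_end": True,     # assumption, implied by "best model criterion"
    "metric_for_best_model": "mse",     # assumption: key returned by compute_metrics
    "greater_is_better": False,         # lower MSE is better
}


def mean_squared_error(predictions: Sequence[float], labels: Sequence[float]) -> float:
    """Plain-Python MSE over flat sequences of predictions and labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(predictions, labels)) / len(labels)
```

With `load_best_model_at_end` enabled, `save_steps` has to be a round multiple of `eval_steps`, which holds for 1000 and 100.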

Evaluation

The following evaluation artifacts are generated in this repository and included in the model card.

Training Metrics Curves

[Figures: Training Loss; Validation Loss]

Label vs Prediction Distributions

[Figures: Label Distribution; Prediction Distribution; Label vs Prediction Overlay]
