# DziriBERT for Algerian Darija Misinformation Detection
## Model Description

A DziriBERT model fine-tuned to detect misinformation in Algerian Darija (Algerian Arabic) text.

- **Base model:** `alger-ia/dziribert`
- **Task:** multi-class classification (5 classes)
## Classes

- **F**: Fake
- **R**: Real
- **N**: Non-news (not a news item)
- **M**: Misleading
- **S**: Satire
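The usage example further down hard-codes the index-to-label order below. This ordering is an assumption taken from that example; if the checkpoint's `config.json` defines `id2label`, prefer reading the mapping from the loaded config rather than hard-coding it:

```python
# Assumed label order (mirrors the hard-coded map in the usage example;
# if the checkpoint's config.json carries id2label, read it from there instead).
ID2LABEL = {0: 'F', 1: 'R', 2: 'N', 3: 'M', 4: 'S'}
LABEL_NAMES = {'F': 'Fake', 'R': 'Real', 'N': 'Non-news',
               'M': 'Misleading', 'S': 'Satire'}

for idx, short in ID2LABEL.items():
    print(f"{idx}: {short} = {LABEL_NAMES[short]}")
```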
## Performance
| Metric | Score |
|---|---|
| Accuracy | 77.27% |
| Macro F1 | 67.49% |
| Macro Precision | 68.51% |
| Macro Recall | 66.87% |
### Per-Class Performance
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| F | 84.84% | 84.66% | 84.75% | 952 |
| R | 78.13% | 75.83% | 76.96% | 848 |
| N | 83.83% | 84.40% | 84.11% | 872 |
| M | 59.40% | 63.80% | 61.53% | 594 |
| S | 36.36% | 25.64% | 30.08% | 78 |
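As a sanity check, the macro scores above are the unweighted means of the per-class scores, so the small Satire class (78 examples) counts as much as the Fake class (952) and pulls the macro F1 well below the accuracy. A pure-Python check against the tables:

```python
# Per-class scores from the table above: (precision, recall, F1, support).
per_class = {
    'F': (84.84, 84.66, 84.75, 952),
    'R': (78.13, 75.83, 76.96, 848),
    'N': (83.83, 84.40, 84.11, 872),
    'M': (59.40, 63.80, 61.53, 594),
    'S': (36.36, 25.64, 30.08, 78),
}

n = len(per_class)
macro_precision = sum(p for p, _, _, _ in per_class.values()) / n
macro_recall = sum(r for _, r, _, _ in per_class.values()) / n
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / n

# Accuracy equals the support-weighted mean of per-class recall.
total = sum(s for _, _, _, s in per_class.values())
accuracy = sum(r * s for _, r, _, s in per_class.values()) / total

print(f"Macro precision: {macro_precision:.2f}%")  # 68.51%
print(f"Macro recall:    {macro_recall:.2f}%")     # 66.87%
print(f"Macro F1:        {macro_f1:.2f}%")         # 67.49%
print(f"Accuracy:        {accuracy:.2f}%")         # 77.27%
```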
## Usage

```python
# test_load_from_hub.py
import os

# CRITICAL: disable TensorFlow before importing transformers
os.environ['USE_TF'] = '0'
os.environ['USE_TORCH'] = '1'

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load from the Hugging Face Hub
REPO_ID = "aurelius2023/dziribert-algerian-misinformation"
print("Loading model from Hugging Face Hub...")
tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForSequenceClassification.from_pretrained(REPO_ID)
model.eval()
print("✓ Model loaded successfully!")

# Test prediction. The Darija text roughly translates to: "Algerian youth
# minister reveals that European countries are asking Algeria for solutions
# to address their own youth's social problems."
text = "وزير الشباب الجزائري يكشف ان الدول الاوروبيه تطلب من الجزائر حلولًا لمعالجه المشكلات الاجتماعيه لشبابها"
inputs = tokenizer(text, return_tensors="pt", max_length=128,
                   truncation=True, padding=True)

with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=1)
pred = torch.argmax(probs, dim=1).item()
confidence = probs[0][pred].item()

label_map = {0: 'F', 1: 'R', 2: 'N', 3: 'M', 4: 'S'}
label_names = {
    'F': 'Fake', 'R': 'Real', 'N': 'Non-news',
    'M': 'Misleading', 'S': 'Satire'
}

print("\nTest Prediction:")
print(f"Text: {text}")
print(f"Predicted: {label_names[label_map[pred]]} ({label_map[pred]})")
print(f"Confidence: {confidence:.2%}")
```
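The confidence reported above is simply the softmax probability of the arg-max class. As a minimal sketch of that post-processing step without torch (the logits here are illustrative values, not real model output):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the five classes (F, R, N, M, S) -- not real model output.
logits = [2.1, 0.3, -0.5, 0.9, -1.2]
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)
labels = ['F', 'R', 'N', 'M', 'S']
print(f"Predicted: {labels[pred]}, confidence: {probs[pred]:.2%}")
```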
## Contact

For questions or issues, please open an issue on the model repository.