klue-bert-traffic-news-classifier

This is a klue/bert-base model fine-tuned for binary classification of Korean traffic accident news articles. It predicts whether a news article carries a negative narrative about traffic accidents.

  • Label 0: Non-negative / neutral article
  • Label 1: Negative narrative article

The model is used to construct a monthly narrative index (the proportion of negative traffic accident news), which can be used as a signal variable in econometric analyses (e.g., regression with HAC standard errors).
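As a rough illustration of the index described above, the sketch below groups predicted labels by calendar month and takes the share of label-1 articles. The function name `monthly_negative_share`, the `(date, label)` input shape, and the sample dates are assumptions made for this example; the author's actual aggregation code is not part of this card.

```python
from collections import defaultdict
from datetime import date


def monthly_negative_share(articles):
    """Return {"YYYY-MM": share of label-1 articles} for (date, label) pairs."""
    totals = defaultdict(int)
    negatives = defaultdict(int)
    for published, label in articles:
        month = published.strftime("%Y-%m")
        totals[month] += 1
        negatives[month] += int(label == 1)
    # Months are returned in chronological order.
    return {month: negatives[month] / totals[month] for month in sorted(totals)}


# Hypothetical predictions: (publication date, predicted label).
sample = [
    (date(2024, 1, 3), 1),
    (date(2024, 1, 17), 0),
    (date(2024, 1, 29), 0),
    (date(2024, 2, 5), 1),
    (date(2024, 2, 20), 1),
]
index = monthly_negative_share(sample)
print(index)
```

The resulting per-month series is what would enter a downstream regression as the signal variable.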


Usage (HuggingFace Transformers)

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("donghyuni/klue-bert-traffic-news-classifier")
model = AutoModelForSequenceClassification.from_pretrained("donghyuni/klue-bert-traffic-news-classifier")

# Input format: title and body joined with [SEP]
# English gloss: "Major highway pile-up accident"
title = "고속도로 대형 추돌 사고"
# English gloss: "A major pile-up on the highway left dozens of people injured."
body = "고속도로에서 대형 추돌 사고가 발생해 수십 명이 부상을 입었다."
text = f"{title} [SEP] {body}"

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(dim=-1).item()
labels = {0: "non-negative", 1: "negative narrative"}
print(labels[predicted_class])
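The snippet above takes the argmax of the logits. Where a probability or a custom decision threshold is more useful (for example, when trading recall for precision on the negative class), the logits can be passed through a softmax first. The sketch below does this in plain Python so that it runs without torch. The helper names, the example logits, and the 0.8 threshold are hypothetical and not part of the model's API.

```python
import math


def negative_probability(logits):
    """Softmax over a [logit_0, logit_1] pair; returns P(label 1)."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    return exps[1] / sum(exps)


def classify(logits, threshold=0.5):
    """Return 1 if P(label 1) >= threshold, else 0."""
    return int(negative_probability(logits) >= threshold)


# Hypothetical logits for a single article.
example_logits = [0.2, 1.3]
print(negative_probability(example_logits))
print(classify(example_logits), classify(example_logits, threshold=0.8))
```

With the variables from the usage snippet, `logits[0].tolist()` yields a list in the expected `[logit_0, logit_1]` form.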

Training Details

| Item | Value |
| --- | --- |
| Base model | klue/bert-base |
| Task | Binary sequence classification |
| Training samples | 2,400 |
| Validation samples | 300 |
| Test samples | 300 |
| Epochs | 5 (best at epoch 4) |
| Batch size | 16 |
| Learning rate | 2e-5 |
| Max sequence length | 512 |
| Seed | 42 |

Evaluation Results (Test Set)

| Class | Precision | Recall | F1 |
| --- | --- | --- | --- |
| 0 (non-negative) | 0.9573 | 0.8745 | 0.9140 |
| 1 (negative narrative) | 0.6742 | 0.8696 | 0.7595 |
| Macro avg | 0.8158 | 0.8720 | 0.8368 |
| Weighted avg | 0.8922 | 0.8733 | 0.8785 |
  • Test Accuracy: 0.8733
  • Test F1 (negative class): 0.7595
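To make the relationship between the per-class, macro, and weighted figures concrete, the sketch below recomputes them from a confusion matrix. The counts are an assumption: they were inferred from the reported metrics and the 300 test samples, and are not published in this card. They are one set of counts that reproduces the reported values to four decimals.

```python
# ASSUMPTION: counts inferred from the reported metrics and the 300 test
# samples; the model card does not publish a confusion matrix.
# Rows are true labels, columns are predicted labels.
confusion = [[202, 29],
             [9, 60]]


def per_class_metrics(cm, k):
    """Return (precision, recall, f1, support) for class k."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column sum
    actual_k = sum(cm[k])                    # row sum (support)
    precision = tp / predicted_k
    recall = tp / actual_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, actual_k


rows = [per_class_metrics(confusion, k) for k in range(2)]
total = sum(r[3] for r in rows)
accuracy = sum(confusion[k][k] for k in range(2)) / total
# Macro: unweighted mean over classes. Weighted: mean weighted by support.
macro = [sum(r[i] for r in rows) / len(rows) for i in range(3)]
weighted = [sum(r[i] * r[3] for r in rows) / total for i in range(3)]

for name, values in [("class 0", rows[0][:3]), ("class 1", rows[1][:3]),
                     ("macro", macro), ("weighted", weighted)]:
    print(name, [round(v, 4) for v in values])
print("accuracy", round(accuracy, 4))
```

Weighted recall equals accuracy by construction, which is why both appear as 0.8733.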

Training Curve

| Epoch | Train Loss | Val Loss | Val F1 |
| --- | --- | --- | --- |
| 1 | 0.4587 | 0.3763 | 0.6984 |
| 2 | 0.3136 | 0.3966 | 0.6809 |
| 3 | 0.2333 | 0.4498 | 0.7037 |
| 4 | 0.1425 | 0.5398 | 0.7044 ← best |
| 5 | 0.0863 | 0.6114 | 0.6892 |
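The table marks epoch 4 as best, which matches selecting the checkpoint with the highest validation F1. The card does not state the selection rule explicitly, so the sketch below is only a check under that assumption. It also shows that selecting by lowest validation loss would pick a different epoch.

```python
# (epoch, train_loss, val_loss, val_f1), copied from the table above.
history = [
    (1, 0.4587, 0.3763, 0.6984),
    (2, 0.3136, 0.3966, 0.6809),
    (3, 0.2333, 0.4498, 0.7037),
    (4, 0.1425, 0.5398, 0.7044),
    (5, 0.0863, 0.6114, 0.6892),
]

# Assumed rule: keep the checkpoint with the highest validation F1.
best_by_f1 = max(history, key=lambda row: row[3])
# Alternative rule, for comparison: lowest validation loss.
best_by_loss = min(history, key=lambda row: row[2])

print("best epoch by val F1:", best_by_f1[0])
print("best epoch by val loss:", best_by_loss[0])
```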

Data

The model was trained on Korean news articles related to traffic accidents, sampled from the Newspaper Corpus (신문 말뭉치) of the Modu Corpus (모두의 말뭉치) provided by the National Institute of Korean Language (국립국어원). A total of 3,000 articles were sampled and labeled by human annotators. The labels reflect whether the article conveys a negative connotation about traffic safety.
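The 3,000 labeled articles correspond to the 2,400 / 300 / 300 train, validation, and test counts listed under Training Details. The card does not describe how the split was made, so the sketch below is one plausible way to produce a reproducible split of that size. The function name `split_indices` is hypothetical, the plain shuffle (with no stratification by label) is an assumption, and reusing 42 as the shuffle seed is a guess based on the listed training seed.

```python
import random


def split_indices(n, n_val, n_test, seed=42):
    """Shuffle range(n) reproducibly and cut it into train/val/test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # local RNG; global state untouched
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(3000, n_val=300, n_test=300)
print(len(train_idx), len(val_idx), len(test_idx))  # 2400 300 300
```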


Citation

If you use this model, please cite:

@misc{klue-bert-traffic-news-classifier,
  author    = {Lee},
  title     = {klue-bert-traffic-news-classifier},
  year      = {2026},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/donghyuni/klue-bert-traffic-news-classifier}}
}