# klue-bert-traffic-news-classifier
This is a fine-tuned klue/bert-base model for binary classification of Korean traffic accident news articles. It classifies whether a news article has a negative narrative about traffic accidents.
- Label 0: Non-negative / neutral article
- Label 1: Negative narrative article
The model is used to construct a monthly narrative index (the proportion of negative traffic accident news articles per month), which can serve as a signal variable in econometric analyses (e.g., regression with HAC standard errors).
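The index construction itself is not part of this repository; a minimal sketch of the aggregation step, assuming per-article predictions are already available as (date, label) pairs (column names here are illustrative):

```python
import pandas as pd

# Hypothetical per-article predictions: 1 = negative narrative, 0 = non-negative.
preds = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-05", "2023-01-20", "2023-02-03"]),
    "label": [1, 0, 1],
})

# Monthly narrative index: share of negative-narrative articles per month.
index = preds.set_index("date")["label"].resample("MS").mean()
print(index)
```

The resulting series can then be fed into a downstream regression as a monthly regressor.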
## Usage (HuggingFace Transformers)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("donghyuni/klue-bert-traffic-news-classifier")
model = AutoModelForSequenceClassification.from_pretrained("donghyuni/klue-bert-traffic-news-classifier")
model.eval()

# Input format: title and body joined with [SEP]
title = "고속도로 다중 추돌 사고"  # "Multi-vehicle collision on a highway"
body = "고속도로에서 다중 추돌 사고가 발생해 여러 명이 부상을 입었다."  # "A multi-vehicle collision occurred on a highway, injuring several people."
text = f"{title} [SEP] {body}"

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class = logits.argmax(dim=-1).item()

labels = {0: "non-negative", 1: "negative narrative"}
print(labels[predicted_class])
```
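When a confidence score matters more than the hard label (e.g., for thresholding borderline articles before aggregation), the logits can be converted to probabilities with a softmax. A minimal sketch with hypothetical logit values standing in for the model output above:

```python
import torch

# Hypothetical logits for one input; in practice use the `logits` tensor
# returned by the model in the snippet above.
logits = torch.tensor([[0.2, 1.5]])

probs = torch.softmax(logits, dim=-1)        # normalize logits to probabilities
negative_prob = probs[0, 1].item()           # probability of label 1 (negative narrative)
print(f"P(negative narrative) = {negative_prob:.3f}")
```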
## Training Details
| Item | Value |
|---|---|
| Base model | klue/bert-base |
| Task | Binary sequence classification |
| Training samples | 2,400 |
| Validation samples | 300 |
| Test samples | 300 |
| Epochs | 5 (best at epoch 4) |
| Batch size | 16 |
| Learning rate | 2e-5 |
| Max sequence length | 512 |
| Seed | 42 |
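The hyperparameters above map onto a `transformers` `TrainingArguments` object roughly as follows. This is a sketch, not the authors' training script: the output directory and the evaluation/checkpointing settings are illustrative assumptions (the card only says the best epoch was selected on validation F1), and `eval_strategy` assumes a recent transformers version (older releases call it `evaluation_strategy`):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="klue-bert-traffic-news-classifier",  # illustrative
    num_train_epochs=5,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    seed=42,
    eval_strategy="epoch",           # evaluate once per epoch
    save_strategy="epoch",
    load_best_model_at_end=True,     # keep the best epoch (epoch 4 here)
    metric_for_best_model="f1",
)
```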
## Evaluation Results (Test Set)
| Class | Precision | Recall | F1 |
|---|---|---|---|
| 0 (non-negative) | 0.9573 | 0.8745 | 0.9140 |
| 1 (negative narrative) | 0.6742 | 0.8696 | 0.7595 |
| Macro avg | 0.8158 | 0.8720 | 0.8368 |
| Weighted avg | 0.8922 | 0.8733 | 0.8785 |
- Test Accuracy: 0.8733
- Test F1 (negative class): 0.7595
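The macro and weighted averages follow directly from the per-class F1 scores and the class supports. The supports (roughly 231 non-negative vs. 69 negative articles) are not stated in the card; they are inferred here from the reported accuracy and per-class recalls:

```python
# Per-class F1 from the table above; supports are inferred, not reported.
f1 = {0: 0.9140, 1: 0.7595}
support = {0: 231, 1: 69}
total = sum(support.values())

macro_f1 = sum(f1.values()) / len(f1)                          # unweighted mean
weighted_f1 = sum(f1[c] * support[c] for c in f1) / total      # support-weighted mean
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
```

The gap between the two averages reflects the class imbalance: the minority negative class has the weaker F1, so the macro average sits below the weighted one.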
## Training Curve
| Epoch | Train Loss | Val Loss | Val F1 |
|---|---|---|---|
| 1 | 0.4587 | 0.3763 | 0.6984 |
| 2 | 0.3136 | 0.3966 | 0.6809 |
| 3 | 0.2333 | 0.4498 | 0.7037 |
| 4 | 0.1425 | 0.5398 | 0.7044 (best) |
| 5 | 0.0863 | 0.6114 | 0.6892 |
## Data
The model was trained on Korean news articles related to traffic accidents, sampled from the newspaper corpus (신문 말뭉치) in the National Institute of Korean Language's Modu Corpus (모두의 말뭉치). 3,000 articles were sampled and manually labeled by human annotators; the labels reflect whether an article conveys a negative connotation about traffic safety.
## Citation
If you use this model, please cite:
```bibtex
@misc{klue-bert-traffic-news-classifier,
  author       = {Lee},
  title        = {klue-bert-traffic-news-classifier},
  year         = {2026},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/your-username/klue-bert-traffic-news-classifier}}
}
```