# klue-bert-traffic-news-classifier
This is a fine-tuned klue/bert-base model for binary classification of Korean traffic accident news articles. It classifies whether a news article has a negative narrative about traffic accidents.
- Label 0: Non-negative / neutral article
- Label 1: Negative narrative article
The model is used to construct a monthly narrative index (the proportion of negative traffic accident news articles per month), which can serve as a signal variable in econometric analyses (e.g., regression with HAC standard errors).
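The index construction itself is not part of this repository; a minimal sketch of the aggregation step, assuming per-article predictions are already available as (date, label) pairs (column names here are illustrative):

```python
import pandas as pd

# Hypothetical per-article predictions: 1 = negative narrative, 0 = non-negative.
preds = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-05", "2023-01-20", "2023-02-03"]),
    "label": [1, 0, 1],
})

# Monthly narrative index: share of negative-narrative articles per month.
index = preds.set_index("date")["label"].resample("MS").mean()
print(index)
```

The resulting series can then be fed into a downstream regression as a monthly regressor.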
## Usage (HuggingFace Transformers)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("donghyuni/klue-bert-traffic-news-classifier")
model = AutoModelForSequenceClassification.from_pretrained("donghyuni/klue-bert-traffic-news-classifier")
model.eval()

# Input format: title and body joined with [SEP]
title = "고속도로 다중 추돌 사고"  # "Multi-vehicle collision on a highway"
body = "고속도로에서 다중 추돌 사고가 발생해 여러 명이 부상을 입었다."  # "A multi-vehicle collision occurred on a highway, injuring several people."
text = f"{title} [SEP] {body}"

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class = logits.argmax(dim=-1).item()

labels = {0: "non-negative", 1: "negative narrative"}
print(labels[predicted_class])
```
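When a confidence score matters more than the hard label (e.g., for thresholding borderline articles before aggregation), the logits can be converted to probabilities with a softmax. A minimal sketch with hypothetical logit values standing in for the model output above:

```python
import torch

# Hypothetical logits for one input; in practice use the `logits` tensor
# returned by the model in the snippet above.
logits = torch.tensor([[0.2, 1.5]])

probs = torch.softmax(logits, dim=-1)        # normalize logits to probabilities
negative_prob = probs[0, 1].item()           # probability of label 1 (negative narrative)
print(f"P(negative narrative) = {negative_prob:.3f}")
```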
## Training Details
| Item | Value |
|---|---|
| Base model | klue/bert-base |
| Task | Binary sequence classification |
| Training samples | 2,400 |
| Validation samples | 300 |
| Test samples | 300 |
| Epochs | 5 (best at epoch 4) |
| Batch size | 16 |
| Learning rate | 2e-5 |
| Max sequence length | 512 |
| Seed | 42 |
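The hyperparameters above map onto a `transformers` `TrainingArguments` object roughly as follows. This is a sketch, not the authors' training script: the output directory and the evaluation/checkpointing settings are illustrative assumptions (the card only says the best epoch was selected on validation F1), and `eval_strategy` assumes a recent transformers version (older releases call it `evaluation_strategy`):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="klue-bert-traffic-news-classifier",  # illustrative
    num_train_epochs=5,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    seed=42,
    eval_strategy="epoch",           # evaluate once per epoch
    save_strategy="epoch",
    load_best_model_at_end=True,     # keep the best epoch (epoch 4 here)
    metric_for_best_model="f1",
)
```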
## Evaluation Results (Test Set)
| Class | Precision | Recall | F1 |
|---|---|---|---|
| 0 (non-negative) | 0.9573 | 0.8745 | 0.9140 |
| 1 (negative narrative) | 0.6742 | 0.8696 | 0.7595 |
| Macro avg | 0.8158 | 0.8720 | 0.8368 |
| Weighted avg | 0.8922 | 0.8733 | 0.8785 |
- Test Accuracy: 0.8733
- Test F1 (negative class): 0.7595
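The macro and weighted averages follow directly from the per-class F1 scores and the class supports. The supports (roughly 231 non-negative vs. 69 negative articles) are not stated in the card; they are inferred here from the reported accuracy and per-class recalls:

```python
# Per-class F1 from the table above; supports are inferred, not reported.
f1 = {0: 0.9140, 1: 0.7595}
support = {0: 231, 1: 69}
total = sum(support.values())

macro_f1 = sum(f1.values()) / len(f1)                          # unweighted mean
weighted_f1 = sum(f1[c] * support[c] for c in f1) / total      # support-weighted mean
print(f"macro F1 = {macro_f1:.4f}, weighted F1 = {weighted_f1:.4f}")
```

The gap between the two averages reflects the class imbalance: the minority negative class has the weaker F1, so the macro average sits below the weighted one.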
## Training Curve
| Epoch | Train Loss | Val Loss | Val F1 |
|---|---|---|---|
| 1 | 0.4587 | 0.3763 | 0.6984 |
| 2 | 0.3136 | 0.3966 | 0.6809 |
| 3 | 0.2333 | 0.4498 | 0.7037 |
| 4 | 0.1425 | 0.5398 | 0.7044 (best) |
| 5 | 0.0863 | 0.6114 | 0.6892 |
## Data
The model was trained on Korean news articles related to traffic accidents, sampled from the newspaper corpus (신문 말뭉치) in the National Institute of Korean Language's Modu Corpus (모두의 말뭉치). 3,000 articles were sampled and manually labeled by human annotators; the labels reflect whether an article conveys a negative connotation about traffic safety.
## Citation
If you use this model, please cite:
```bibtex
@misc{klue-bert-traffic-news-classifier,
  author       = {Lee},
  title        = {klue-bert-traffic-news-classifier},
  year         = {2026},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/your-username/klue-bert-traffic-news-classifier}}
}
```