kyLELEng/bert-ag-news-classifier

Text Classification

Generated from Trainer

text-embeddings-inference


Best jy committed 26 days ago

Commit 09aa324 · verified · 1 Parent(s): ae2279b

Update README.md

Files changed (1)

README.md +62 -2


@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 # bert-ag-news-classifier
-This model is a fine-tuned version of [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) on an unknown dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.2339
 - Accuracy: 0.9461
@@ -34,7 +34,67 @@ More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure

 # bert-ag-news-classifier
+This model is a fine-tuned version of `google-bert/bert-base-uncased` on the [`fancyzhx/ag_news`](https://huggingface.co/datasets/fancyzhx/ag_news) dataset.
 It achieves the following results on the evaluation set:
 - Loss: 0.2339
 - Accuracy: 0.9461
 ## Training and evaluation data
+Source dataset: `fancyzhx/ag_news`.
+AG News is an English news topic classification dataset with four labels:
+- `0`: World
+- `1`: Sports
+- `2`: Business
+- `3`: Sci/Tech
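The label ids above can be turned into a lookup table when decoding predictions. The following is a minimal sketch, not code from this repository: `ID2LABEL`, `LABEL2ID`, `predicted_label`, and the example scores are hypothetical names and values chosen for illustration.

```python
# Label ids and names as listed in the model card.
ID2LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def predicted_label(scores):
    """Return the label name for the highest of the four class scores."""
    if len(scores) != len(ID2LABEL):
        raise ValueError("expected one score per class")
    best_id = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best_id]


# Hypothetical scores for one headline, not real model output.
example_scores = [0.1, 0.2, 2.3, 1.9]
print(predicted_label(example_scores))
```

The highest score sits at index 2, so the sketch reports `Business`.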
+The original dataset provides an official training split and an official test split.
+Data split used in this project:
+| Split | Source | Size | Purpose |
+|---|---|---:|---|
+| Train | 90% of official training split | 108,000 | Model fine-tuning |
+| Validation | 10% of official training split | 12,000 | Checkpoint selection |
+| Test | Official test split | 7,600 | Final evaluation |
+The train/validation split was stratified by label, so each class remains balanced:
+| Split | World | Sports | Business | Sci/Tech |
+|---|---:|---:|---:|---:|
+| Train | 27,000 | 27,000 | 27,000 | 27,000 |
+| Validation | 3,000 | 3,000 | 3,000 | 3,000 |
+| Test | 1,900 | 1,900 | 1,900 | 1,900 |
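One way a stratified 90/10 split could produce the counts above is sketched below. The card does not say which tool or random seed was used, so `stratified_split`, its `seed` parameter, and the synthetic `labels` list are assumptions for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.1, seed=0):
    """Return (train_idx, val_idx) with val_fraction of each label held out."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)  # shuffle within each class before cutting
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Stand-in for the official training split: 30,000 examples per label.
labels = [i % 4 for i in range(120_000)]
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))
```

Splitting inside each class, rather than over the whole pool, is what keeps every class at exactly 27,000 training and 3,000 validation examples.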
+Text preprocessing was intentionally light:
+- Leading and trailing whitespace was removed.
+- Repeated whitespace was collapsed into a single space.
+- Punctuation was kept.
+- No manual lowercasing was applied; lowercasing is left to the `google-bert/bert-base-uncased` tokenizer.
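The whitespace steps listed above can be sketched in a few lines. The function name `normalize_whitespace` is hypothetical, and the repository's actual preprocessing code may differ.

```python
def normalize_whitespace(text):
    """Strip both ends and collapse each internal whitespace run to one space."""
    # str.split() with no arguments splits on any whitespace run and
    # discards leading and trailing whitespace.
    return " ".join(text.split())


print(normalize_whitespace("  Stocks  rally;\ttech\nleads.  "))
```

Punctuation and letter case pass through unchanged, matching the last two items in the list.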
+The official test split was not used during training or checkpoint selection. The best checkpoint was selected using validation macro F1.
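Checkpoint selection by validation macro F1 can be illustrated as follows. This is a sketch under stated assumptions: the checkpoint names and predictions are invented toy data, and `macro_f1` is a plain-Python stand-in for whatever metric code the training run used.

```python
def macro_f1(y_true, y_pred, num_labels=4):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in range(num_labels):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_labels


# Hypothetical validation labels and predictions from two checkpoints.
y_true = [0, 1, 2, 3, 0, 1, 2, 3]
preds = {
    "checkpoint-a": [0, 1, 2, 2, 0, 1, 3, 3],
    "checkpoint-b": [0, 1, 2, 3, 0, 1, 2, 2],
}

# Keep the checkpoint with the highest validation macro F1.
best = max(preds, key=lambda name: macro_f1(y_true, preds[name]))
print(best)
```

Only validation data enters the comparison, so the test split stays untouched until the final evaluation.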
+Final evaluation on the official test split:
+| Metric | Value |
+|---|---:|
+| Accuracy | 0.9461 |
+| Macro precision | 0.9461 |
+| Macro recall | 0.9461 |
+| Macro F1 | 0.9461 |
+Per-class test performance:
+| Class | Precision | Recall | F1 | Support |
+|---|---:|---:|---:|---:|
+| World | 0.9603 | 0.9547 | 0.9575 | 1,900 |
+| Sports | 0.9884 | 0.9879 | 0.9882 | 1,900 |
+| Business | 0.9203 | 0.9116 | 0.9159 | 1,900 |
+| Sci/Tech | 0.9155 | 0.9300 | 0.9227 | 1,900 |
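The macro metrics reported earlier are unweighted means of these per-class rows, which can be checked directly. The sketch below copies the table values; `PER_CLASS` and `macro_average` are illustrative names. Because the table is rounded to four decimals, the recomputed averages match the reported 0.9461 only up to rounding.

```python
# (precision, recall, F1) per class, copied from the table above.
PER_CLASS = {
    "World": (0.9603, 0.9547, 0.9575),
    "Sports": (0.9884, 0.9879, 0.9882),
    "Business": (0.9203, 0.9116, 0.9159),
    "Sci/Tech": (0.9155, 0.9300, 0.9227),
}


def macro_average(rows, column):
    """Unweighted mean of one metric column across all classes."""
    return sum(values[column] for values in rows.values()) / len(rows)


macro_precision = macro_average(PER_CLASS, 0)
macro_recall = macro_average(PER_CLASS, 1)
macro_f1_score = macro_average(PER_CLASS, 2)
print(f"{macro_precision:.5f} {macro_recall:.5f} {macro_f1_score:.5f}")
```

Since every class has the same support (1,900), macro recall coincides with accuracy, which is why both are reported as 0.9461.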
+The confusion matrix and error samples are included in this repository:
+- `confusion_matrix.csv`
+- `error_analysis.csv`
+The main confusion is between `Business` and `Sci/Tech`, which is expected because stories about technology companies, product launches, and technology markets often fit both topics.
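Ranking the off-diagonal cells of a confusion matrix is one way to surface such pairs. The sketch below assumes rows are true labels and columns are predicted labels; the layout of `confusion_matrix.csv` is not described in the card, and `toy_matrix` holds made-up counts, not the repository's results.

```python
LABELS = ["World", "Sports", "Business", "Sci/Tech"]


def top_confusions(matrix, labels, k=2):
    """Return the k largest off-diagonal cells as (true, predicted, count)."""
    cells = [
        (labels[i], labels[j], matrix[i][j])
        for i in range(len(labels))
        for j in range(len(labels))
        if i != j  # skip correct predictions on the diagonal
    ]
    return sorted(cells, key=lambda cell: cell[2], reverse=True)[:k]


# Made-up counts for illustration only (rows: true, columns: predicted).
toy_matrix = [
    [90, 2, 5, 3],
    [1, 97, 1, 1],
    [4, 1, 83, 12],
    [3, 1, 10, 86],
]

for true_label, predicted, count in top_confusions(toy_matrix, LABELS):
    print(f"{true_label} -> {predicted}: {count}")
```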
 ## Training procedure