bert-ag-news-classifier
This model is a fine-tuned version of google-bert/bert-base-uncased on the fancyzhx/ag_news dataset.
It achieves the following results on the evaluation set (the official AG News test split):
- Loss: 0.2339
- Accuracy: 0.9461
- Precision Macro: 0.9461
- Recall Macro: 0.9461
- F1 Macro: 0.9461
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
Source dataset: fancyzhx/ag_news.
AG News is an English news topic classification dataset with four labels:
- 0: World
- 1: Sports
- 2: Business
- 3: Sci/Tech
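As a small illustration of the label scheme, the sketch below maps a vector of per-class scores to a label name. The names `ID2LABEL`, `LABEL2ID`, and `predicted_label` are hypothetical helpers introduced here. Whether the hosted model config exposes these exact label names is an assumption; only the id-to-name mapping comes from the list above.

```python
# Label ids as listed for AG News in this card.
ID2LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def predicted_label(scores):
    """Return the label name for the highest-scoring class.

    `scores` is a sequence of four per-class scores (for example logits),
    ordered by label id.
    """
    if len(scores) != len(ID2LABEL):
        raise ValueError("expected one score per label")
    best_id = max(range(len(scores)), key=lambda i: scores[i])
    return ID2LABEL[best_id]


print(predicted_label([0.1, 2.3, -0.5, 0.0]))  # Sports
```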
The original dataset provides an official training split and an official test split.
Data split used in this project:
| Split | Source | Size | Purpose |
|---|---|---|---|
| Train | 90% of official training split | 108,000 | Model fine-tuning |
| Validation | 10% of official training split | 12,000 | Checkpoint selection |
| Test | Official test split | 7,600 | Final evaluation |
The train/validation split was stratified by label, so each class remains balanced:
| Split | World | Sports | Business | Sci/Tech |
|---|---|---|---|---|
| Train | 27,000 | 27,000 | 27,000 | 27,000 |
| Validation | 3,000 | 3,000 | 3,000 | 3,000 |
| Test | 1,900 | 1,900 | 1,900 | 1,900 |
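The stratified 90/10 split described above can be sketched in plain Python as follows. This is a minimal illustration, not the code used for this model: the function name `stratified_split` is hypothetical, and reusing the training seed 42 for the split is an assumption. The toy labels only mimic the class balance of the official training split (30,000 examples per class).

```python
import random
from collections import defaultdict


def stratified_split(labels, validation_fraction=0.1, seed=42):
    """Split example indices into train/validation, preserving label proportions."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)
    train_indices, validation_indices = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take the same fraction from every class so the split stays balanced.
        n_validation = round(len(indices) * validation_fraction)
        validation_indices.extend(indices[:n_validation])
        train_indices.extend(indices[n_validation:])
    return sorted(train_indices), sorted(validation_indices)


# Toy labels with the same balance as the official training split.
labels = [label for label in range(4) for _ in range(30_000)]
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))  # 108000 12000
```

Splitting each class separately is what keeps the validation set at exactly 3,000 examples per class. A plain random 10% sample would only be balanced on average.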
Text preprocessing was intentionally light:
- Leading and trailing whitespace was removed.
- Repeated whitespace was collapsed into a single space.
- Punctuation was kept.
- No manual lowercasing was applied; the google-bert/bert-base-uncased tokenizer lowercases text itself.
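The preprocessing steps listed above can be sketched as a single function. The name `normalize_text` is a hypothetical helper, and the regular expression is one possible way to collapse whitespace, not necessarily the one used for this model.

```python
import re


def normalize_text(text):
    """Trim the ends and collapse whitespace runs; keep punctuation and case."""
    return re.sub(r"\s+", " ", text.strip())


print(normalize_text("  Wall St.   rallies,\n oil  slips!  "))
# Wall St. rallies, oil slips!
```

Case is left untouched here because the uncased tokenizer handles lowercasing at tokenization time.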
The official test split was not used during training or checkpoint selection. The best checkpoint was selected using validation macro F1.
Final evaluation on the official test split:
| Metric | Value |
|---|---|
| Accuracy | 0.9461 |
| Macro precision | 0.9461 |
| Macro recall | 0.9461 |
| Macro F1 | 0.9461 |
Per-class test performance:
| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| World | 0.9603 | 0.9547 | 0.9575 | 1,900 |
| Sports | 0.9884 | 0.9879 | 0.9882 | 1,900 |
| Business | 0.9203 | 0.9116 | 0.9159 | 1,900 |
| Sci/Tech | 0.9155 | 0.9300 | 0.9227 | 1,900 |
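The macro-averaged metrics are the unweighted means of the per-class values, which can be checked directly from the table above. The sketch below copies the per-class numbers from that table. The names `PER_CLASS` and `macro_average` are introduced here for illustration.

```python
# Per-class test metrics copied from the table above: (precision, recall, F1).
PER_CLASS = {
    "World": (0.9603, 0.9547, 0.9575),
    "Sports": (0.9884, 0.9879, 0.9882),
    "Business": (0.9203, 0.9116, 0.9159),
    "Sci/Tech": (0.9155, 0.9300, 0.9227),
}


def macro_average(values):
    """Unweighted mean over classes."""
    values = list(values)
    return sum(values) / len(values)


macro_precision = macro_average(p for p, _, _ in PER_CLASS.values())
macro_recall = macro_average(r for _, r, _ in PER_CLASS.values())
# Macro F1 is the mean of the per-class F1 scores,
# not the F1 of macro precision and macro recall.
macro_f1 = macro_average(f for _, _, f in PER_CLASS.values())

print(f"{macro_precision:.4f} {macro_recall:.4f} {macro_f1:.4f}")
```

Because every class has the same support (1,900), accuracy equals macro recall, which is consistent with both being reported as 0.9461. The per-class inputs are already rounded to four decimals, so the recomputed means can differ from the reported values in the last digit.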
The confusion matrix and error samples are included in this repository:
- confusion_matrix.csv
- error_analysis.csv
The main confusion pattern is between Business and Sci/Tech, the two classes with the lowest F1 scores. This is expected because technology-company news, product launches, and market-related technology stories often overlap.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 0.1 (a fractional value, presumably read as a warmup ratio: 10% of the training steps)
- num_epochs: 3.0
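To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step: linear warmup from zero, then linear decay back to zero. It is an illustration under stated assumptions, not the training code. The total of 20,250 steps follows from 108,000 training examples at batch size 16 over 3 epochs (6,750 steps per epoch). The 2,025 warmup steps assume the 0.1 value above means 10% of the total steps. The function name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(step, base_lr=2e-5, total_steps=20_250, warmup_steps=2_025):
    """Learning rate after `step` updates: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Warmup: ramp from 0 up to base_lr.
        return base_lr * (step / warmup_steps)
    # Decay: ramp from base_lr down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / (total_steps - warmup_steps))


for step in (0, 2_025, 6_750, 13_500, 20_250):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the peak rate of 2e-05 is reached partway through the first epoch, and the rate has already decayed substantially by the epoch-2 and epoch-3 checkpoints.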
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision Macro | Recall Macro | F1 Macro |
|---|---|---|---|---|---|---|---|
| 0.1963 | 1.0 | 6750 | 0.1911 | 0.9413 | 0.9417 | 0.9412 | 0.9414 |
| 0.1206 | 2.0 | 13500 | 0.2082 | 0.9451 | 0.9460 | 0.9451 | 0.9451 |
| 0.1125 | 3.0 | 20250 | 0.2336 | 0.9453 | 0.9456 | 0.9453 | 0.9454 |
Framework versions
- Transformers 5.6.2
- Pytorch 2.11.0+cu130
- Datasets 4.8.4
- Tokenizers 0.22.2