| --- |
| license: apache-2.0 |
| base_model: t5-small |
| tags: |
| - generated_from_trainer |
| metrics: |
| - rouge |
| model-index: |
| - name: summarizer-billsum_dataset |
| results: [] |
| --- |
| |
| <!-- This model card has been generated automatically according to the information the Trainer had access to. You |
| should probably proofread and complete it, then remove this comment. --> |
|
|
| # summarizer-billsum_dataset |
| |
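| This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small). The Trainer did not record the training dataset; the model name suggests BillSum, but this is unconfirmed. |
| It achieves the following results on the evaluation set after the final epoch (epoch 20, step 500): |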
| - Loss: 2.4835 |
| - Rouge1: 0.1837 |
| - Rouge2: 0.0818 |
| - Rougel: 0.1536 |
| - Rougelsum: 0.154 |
| - Gen Len: 19.0 |
| |
| ## Model description |
| |
| More information needed |
| |
| ## Intended uses & limitations |
| |
| More information needed |
| |
| ## Training and evaluation data |
| |
| More information needed |
| |
| ## Training procedure |
| |
| ### Training hyperparameters |
| |
| The following hyperparameters were used during training: |
| - learning_rate: 2e-05 |
| - train_batch_size: 16 |
| - eval_batch_size: 16 |
| - seed: 42 |
| - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
| - lr_scheduler_type: linear |
| - num_epochs: 20 |
| - mixed_precision_training: Native AMP |
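The list above can be read as keyword arguments for a training run. The following is a minimal sketch, not the original training script: the keyword names are those of `transformers.Seq2SeqTrainingArguments`, the mapping from `train_batch_size` to a per-device value assumes a single device, and the zero-warmup setting is an assumption because the card lists no warmup. The helper `linear_lr` is a hypothetical stand-in that mirrors a linear decay schedule so the learning rate at any step can be inspected without a GPU.

```python
# Hypothetical reconstruction: values are copied from the hyperparameter list;
# keyword names follow transformers' Seq2SeqTrainingArguments.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20,
    "fp16": True,  # "Native AMP" mixed precision; requires a CUDA device
}

TOTAL_STEPS = 500  # final step reported in the training results table
WARMUP_STEPS = 0   # assumption: the card does not list any warmup


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the learning rate at the end of every fifth epoch (25 steps per epoch).
    for epoch in (0, 5, 10, 15, 20):
        print(epoch, linear_lr(epoch * 25))
```

To attempt a reproduction, the dictionary could be unpacked into `Seq2SeqTrainingArguments(output_dir=..., **training_kwargs)`; the output directory and any evaluation or generation settings are not recorded in this card and would have to be chosen separately.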
| |
| ### Training results |
| |
| | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | |
| |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:| |
| | No log | 1.0 | 25 | 3.4284 | 0.1297 | 0.0383 | 0.109 | 0.1089 | 19.0 | |
| | No log | 2.0 | 50 | 3.0057 | 0.1222 | 0.0351 | 0.1031 | 0.1029 | 19.0 | |
| | No log | 3.0 | 75 | 2.8213 | 0.1242 | 0.0376 | 0.1042 | 0.1041 | 19.0 | |
| | No log | 4.0 | 100 | 2.7231 | 0.1283 | 0.0401 | 0.105 | 0.105 | 19.0 | |
| | No log | 5.0 | 125 | 2.6706 | 0.1371 | 0.049 | 0.1122 | 0.1122 | 19.0 | |
| | No log | 6.0 | 150 | 2.6307 | 0.1373 | 0.0473 | 0.1129 | 0.1128 | 19.0 | |
| | No log | 7.0 | 175 | 2.5988 | 0.1408 | 0.0496 | 0.1149 | 0.1148 | 19.0 | |
| | No log | 8.0 | 200 | 2.5731 | 0.1471 | 0.0509 | 0.1209 | 0.1212 | 19.0 | |
| | No log | 9.0 | 225 | 2.5557 | 0.156 | 0.0584 | 0.1293 | 0.1296 | 19.0 | |
| | No log | 10.0 | 250 | 2.5382 | 0.1642 | 0.0656 | 0.1357 | 0.1356 | 19.0 | |
| | No log | 11.0 | 275 | 2.5262 | 0.1695 | 0.0716 | 0.1402 | 0.1403 | 19.0 | |
| | No log | 12.0 | 300 | 2.5173 | 0.1773 | 0.0778 | 0.1475 | 0.1475 | 19.0 | |
| | No log | 13.0 | 325 | 2.5089 | 0.18 | 0.0801 | 0.1493 | 0.1496 | 19.0 | |
| | No log | 14.0 | 350 | 2.5013 | 0.1821 | 0.08 | 0.1515 | 0.1516 | 19.0 | |
| | No log | 15.0 | 375 | 2.4954 | 0.1823 | 0.0801 | 0.1527 | 0.1528 | 19.0 | |
| | No log | 16.0 | 400 | 2.4910 | 0.1832 | 0.0808 | 0.1532 | 0.1534 | 19.0 | |
| | No log | 17.0 | 425 | 2.4875 | 0.1842 | 0.082 | 0.154 | 0.1543 | 19.0 | |
| | No log | 18.0 | 450 | 2.4849 | 0.1841 | 0.0818 | 0.1539 | 0.1541 | 19.0 | |
| | No log | 19.0 | 475 | 2.4840 | 0.1837 | 0.0818 | 0.1536 | 0.154 | 19.0 | |
| | 2.7815 | 20.0 | 500 | 2.4835 | 0.1837 | 0.0818 | 0.1536 | 0.154 | 19.0 | |
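Because the dataset is not recorded, the step counts in the table are the only hint about its size. The sketch below bounds the number of training examples from the 25 optimizer steps per epoch and the batch size of 16. It assumes a single device, no gradient accumulation, and that the last partial batch is kept; none of these is confirmed by the card, and the function name is illustrative.

```python
import math

STEPS_PER_EPOCH = 25   # the Step column grows by 25 per epoch
TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameter list


def example_count_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset size n with ceil(n / batch_size) == steps_per_epoch."""
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


low, high = example_count_bounds(STEPS_PER_EPOCH, TRAIN_BATCH_SIZE)

# Sanity check: both ends of the range really produce 25 batches.
assert all(math.ceil(n / TRAIN_BATCH_SIZE) == STEPS_PER_EPOCH for n in (low, high))

print(f"training examples: between {low} and {high}")
```

Under these assumptions the training split holds a few hundred examples, which is worth keeping in mind when interpreting the ROUGE scores above.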
| |
| |
| ### Framework versions |
| |
| - Transformers 4.35.2 |
| - Pytorch 2.1.0+cu121 |
| - Datasets 2.17.0 |
| - Tokenizers 0.15.1 |
| |