End of training
README.md CHANGED
@@ -18,8 +18,8 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [FiveC/BartTay](https://huggingface.co/FiveC/BartTay) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Sacrebleu:
+- Loss: 0.1557
+- Sacrebleu: 20.5892
 
 ## Model description
 
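The Sacrebleu figure above is a corpus-level SacreBLEU score. A minimal sketch of how such a score is typically computed with the `evaluate` library, using hypothetical predictions and references (the card does not show the actual evaluation code):

```python
import evaluate

# Load the SacreBLEU metric (assumes the `evaluate` and `sacrebleu` packages are installed).
sacrebleu = evaluate.load("sacrebleu")

# Hypothetical decoded model outputs and reference translations.
predictions = ["The cat sat on the mat."]
references = [["The cat is sitting on the mat."]]

result = sacrebleu.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # corpus-level score, comparable in kind to the 20.5892 reported above
```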
@@ -42,7 +42,7 @@ The following hyperparameters were used during training:
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
-- optimizer: Use
+- optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 3
 - mixed_precision_training: Native AMP
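These hyperparameters map directly onto `transformers` `Seq2SeqTrainingArguments`. A minimal sketch under that assumption; the learning rate is not visible in this hunk, so the value below is a placeholder, and `output_dir` is hypothetical:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the configuration implied by the hyperparameters above.
training_args = Seq2SeqTrainingArguments(
    output_dir="barttay-finetuned",  # hypothetical output directory
    per_device_train_batch_size=32,  # train_batch_size: 32
    per_device_eval_batch_size=32,   # eval_batch_size: 32
    seed=42,                         # seed: 42
    optim="adamw_torch_fused",       # fused AdamW; betas=(0.9,0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="linear",      # lr_scheduler_type: linear
    num_train_epochs=3,              # num_epochs: 3
    fp16=True,                       # mixed_precision_training: Native AMP
    learning_rate=2e-5,              # placeholder; the actual value is not shown in this hunk
)
```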
@@ -51,14 +51,14 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Sacrebleu |
 |:-------------:|:-----:|:----:|:---------------:|:---------:|
-| 0.
-| 0.
-| 0.
+| 0.1893 | 1.0 | 3209 | 0.1553 | 15.6992 |
+| 0.1188 | 2.0 | 6418 | 0.1503 | 19.2110 |
+| 0.0966 | 3.0 | 9627 | 0.1557 | 20.5892 |
 
 
 ### Framework versions
 
 - Transformers 4.57.3
-- Pytorch 2.9.
-- Datasets 4.
+- Pytorch 2.9.1+cu128
+- Datasets 4.4.2
 - Tokenizers 0.22.1
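The card names [FiveC/BartTay](https://huggingface.co/FiveC/BartTay) as the base model but does not give the fine-tuned checkpoint's repo id. A minimal usage sketch, assuming a BART-style seq2seq checkpoint and using the base id as a stand-in until the fine-tuned id is known:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Base model id from the card; replace with the fine-tuned checkpoint's
# repo id, which the card does not state.
model_id = "FiveC/BartTay"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Example source sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```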