Plainly Optimized Network
Dataset: SUPERGLUE
Trainer Hyperparameters:
- lr = 5e-05
- per_device_batch_size = 8
- gradient_accumulation_steps = 2
- weight_decay = 1e-09
- seed = 42

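
As a rough illustration of how these settings combine, the sketch below computes the effective batch size per optimizer step. The `num_devices` parameter and the `effective_batch_size` helper are assumptions introduced here; the card does not state how many devices were used.

```python
# Hyperparameter values copied from the list above.
hparams = {
    "lr": 5e-05,
    "per_device_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "weight_decay": 1e-09,
    "seed": 42,
}


def effective_batch_size(hp, num_devices=1):
    """Samples contributing to one optimizer step.

    num_devices is hypothetical: the card does not report the device count,
    so a single device is assumed by default.
    """
    return (
        hp["per_device_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(hparams))
```

On a single device this gives 8 × 2 = 16 samples per optimizer step; the figure scales linearly with the number of devices.
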
| eval_loss | eval_accuracy | epoch |
|---|---|---|
| 22.014 | 0.471 | 1.0 |
| 19.411 | 0.659 | 2.0 |
| 18.711 | 0.696 | 3.0 |
| 19.141 | 0.652 | 4.0 |
| 19.924 | 0.638 | 5.0 |
| 19.229 | 0.652 | 6.0 |
| 20.306 | 0.623 | 7.0 |
| 19.739 | 0.645 | 8.0 |
| 20.873 | 0.623 | 9.0 |
| 20.285 | 0.638 | 10.0 |
| 18.900 | 0.696 | 11.0 |
| 18.971 | 0.681 | 12.0 |
| 19.230 | 0.667 | 13.0 |
| 19.039 | 0.674 | 14.0 |
| 19.080 | 0.667 | 15.0 |
| 18.997 | 0.681 | 16.0 |
| 18.619 | 0.681 | 17.0 |
| 18.754 | 0.681 | 18.0 |
| 18.911 | 0.674 | 19.0 |
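
The table above can be summarized with a short script, sketched here with the rows copied by hand. Breaking accuracy ties by lower `eval_loss` is a choice made for this example, not something the card specifies.

```python
# (eval_loss, eval_accuracy, epoch) rows copied from the table above.
rows = [
    (22.014, 0.471, 1), (19.411, 0.659, 2), (18.711, 0.696, 3),
    (19.141, 0.652, 4), (19.924, 0.638, 5), (19.229, 0.652, 6),
    (20.306, 0.623, 7), (19.739, 0.645, 8), (20.873, 0.623, 9),
    (20.285, 0.638, 10), (18.900, 0.696, 11), (18.971, 0.681, 12),
    (19.230, 0.667, 13), (19.039, 0.674, 14), (19.080, 0.667, 15),
    (18.997, 0.681, 16), (18.619, 0.681, 17), (18.754, 0.681, 18),
    (18.911, 0.674, 19),
]

# Highest accuracy; ties are broken by the lower loss (an assumption).
best_by_accuracy = max(rows, key=lambda r: (r[1], -r[0]))

# Lowest evaluation loss.
best_by_loss = min(rows, key=lambda r: r[0])

# Every epoch that reaches the top accuracy value.
top_accuracy_epochs = [r[2] for r in rows if r[1] == best_by_accuracy[1]]

print("best by accuracy:", best_by_accuracy)
print("best by loss:", best_by_loss)
print("epochs at top accuracy:", top_accuracy_epochs)
```

Accuracy peaks at 0.696 in epochs 3 and 11, while the lowest loss (18.619) occurs at epoch 17, so the two criteria select different checkpoints.
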