Plainly Optimized Network
Dataset: SUPERGLUE
Trainer Hyperparameters:
lr = 5e-05, per_device_batch_size = 8, gradient_accumulation_steps = 2, weight_decay = 1e-09, seed = 42
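The per-device batch size and the gradient accumulation steps together determine how many examples contribute to each optimizer update. The sketch below works through that arithmetic. The `TrainerHyperparameters` class and the `effective_batch_size` helper are hypothetical names introduced for illustration, and the device count is not stated in this card, so a single device is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainerHyperparameters:
    """Values copied from the hyperparameter line above (hypothetical container)."""
    lr: float = 5e-05
    per_device_batch_size: int = 8
    gradient_accumulation_steps: int = 2
    weight_decay: float = 1e-09
    seed: int = 42


def effective_batch_size(hp: TrainerHyperparameters, num_devices: int = 1) -> int:
    """Examples per optimizer update: batch size x accumulation steps x devices."""
    return hp.per_device_batch_size * hp.gradient_accumulation_steps * num_devices


hp = TrainerHyperparameters()
# num_devices=1 is an assumption; the card does not say how many devices were used.
print(effective_batch_size(hp))  # 8 * 2 * 1 = 16
```

Under the single-device assumption, each optimizer update sees 16 examples; with more devices the figure scales linearly.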
| eval_loss | eval_accuracy | epoch |
|---|---|---|
| 19.092 | 0.667 | 1.0 |
| 18.211 | 0.667 | 2.0 |
| 17.359 | 0.739 | 3.0 |
| 17.168 | 0.732 | 4.0 |
| 18.647 | 0.681 | 5.0 |
| 18.081 | 0.681 | 6.0 |
| 18.325 | 0.688 | 7.0 |
| 18.660 | 0.688 | 8.0 |
| 18.464 | 0.688 | 9.0 |
| 18.622 | 0.696 | 10.0 |
| 17.838 | 0.710 | 11.0 |
| 17.792 | 0.703 | 12.0 |
| 18.009 | 0.696 | 13.0 |
| 19.033 | 0.674 | 14.0 |
| 17.430 | 0.717 | 15.0 |
| 18.218 | 0.696 | 16.0 |
| 17.915 | 0.710 | 17.0 |
| 17.956 | 0.717 | 18.0 |
| 18.078 | 0.725 | 19.0 |
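The table is easier to read once the best epochs are picked out. The following is a minimal sketch that copies the rows above into a list and finds the epoch with the highest `eval_accuracy` and the epoch with the lowest `eval_loss`. The `results` variable name is an assumption; the numbers are taken verbatim from the table.

```python
# Each tuple is (eval_loss, eval_accuracy, epoch), copied from the table above.
results = [
    (19.092, 0.667, 1), (18.211, 0.667, 2), (17.359, 0.739, 3),
    (17.168, 0.732, 4), (18.647, 0.681, 5), (18.081, 0.681, 6),
    (18.325, 0.688, 7), (18.660, 0.688, 8), (18.464, 0.688, 9),
    (18.622, 0.696, 10), (17.838, 0.710, 11), (17.792, 0.703, 12),
    (18.009, 0.696, 13), (19.033, 0.674, 14), (17.430, 0.717, 15),
    (18.218, 0.696, 16), (17.915, 0.710, 17), (17.956, 0.717, 18),
    (18.078, 0.725, 19),
]

# Epoch with the highest evaluation accuracy.
best_acc = max(results, key=lambda row: row[1])
# Epoch with the lowest evaluation loss.
best_loss = min(results, key=lambda row: row[0])

print(f"best accuracy: {best_acc[1]} at epoch {best_acc[2]}")   # 0.739 at epoch 3
print(f"lowest loss:   {best_loss[0]} at epoch {best_loss[2]}")  # 17.168 at epoch 4
```

Both optima fall early in training (epochs 3 and 4), and neither metric improves on them over the remaining epochs in the table.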