Plainly Optimized Network

Dataset: SUPERGLUE

Trainer Hyperparameters:

  • lr = 5e-05
  • per_device_batch_size = 8
  • gradient_accumulation_steps = 2
  • weight_decay = 1e-09
  • seed = 42
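
The batch-size settings above combine into an effective batch size per optimizer step. A minimal sketch of that arithmetic follows. The `hparams` dict and the `effective_batch_size` helper are illustrative names, not part of the training code, and `num_devices` is an assumption because the number of devices is not stated here.

```python
# Hyperparameters copied from the list above.
hparams = {
    "lr": 5e-05,
    "per_device_batch_size": 8,
    "gradient_accumulation_steps": 2,
    "weight_decay": 1e-09,
    "seed": 42,
}


def effective_batch_size(hp, num_devices=1):
    """Examples consumed per optimizer step.

    num_devices is an assumption: the device count is not given in
    the hyperparameter list, so it defaults to a single device.
    """
    return (
        hp["per_device_batch_size"]
        * hp["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(hparams))  # 8 * 2 * 1 = 16
```

On a single device this gives 16 examples per optimizer step; with more devices the figure scales linearly.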

Evaluation Results:

  eval_loss  eval_accuracy  epoch
  22.014     0.471          1.0
  19.411     0.659          2.0
  18.711     0.696          3.0
  19.141     0.652          4.0
  19.924     0.638          5.0
  19.229     0.652          6.0
  20.306     0.623          7.0
  19.739     0.645          8.0
  20.873     0.623          9.0
  20.285     0.638          10.0
  18.900     0.696          11.0
  18.971     0.681          12.0
  19.230     0.667          13.0
  19.039     0.674          14.0
  19.080     0.667          15.0
  18.997     0.681          16.0
  18.619     0.681          17.0
  18.754     0.681          18.0
  18.911     0.674          19.0
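
The per-epoch figures above can be scanned for the strongest checkpoint. The sketch below is one way to do that, not the author's selection procedure. The `rows` list simply transcribes the reported values, and breaking accuracy ties by lower loss is an assumption made for illustration.

```python
# (eval_loss, eval_accuracy, epoch) transcribed from the results above.
rows = [
    (22.014, 0.471, 1), (19.411, 0.659, 2), (18.711, 0.696, 3),
    (19.141, 0.652, 4), (19.924, 0.638, 5), (19.229, 0.652, 6),
    (20.306, 0.623, 7), (19.739, 0.645, 8), (20.873, 0.623, 9),
    (20.285, 0.638, 10), (18.900, 0.696, 11), (18.971, 0.681, 12),
    (19.230, 0.667, 13), (19.039, 0.674, 14), (19.080, 0.667, 15),
    (18.997, 0.681, 16), (18.619, 0.681, 17), (18.754, 0.681, 18),
    (18.911, 0.674, 19),
]

# Highest accuracy; ties broken by lower loss (an assumed tie-break rule).
best_by_accuracy = max(rows, key=lambda r: (r[1], -r[0]))

# Lowest evaluation loss.
best_by_loss = min(rows, key=lambda r: r[0])

print("best accuracy:", best_by_accuracy)
print("lowest loss:", best_by_loss)
```

Accuracy peaks at 0.696 in epochs 3 and 11, while the lowest loss (18.619) occurs at epoch 17, so the two criteria point to different checkpoints.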

Safetensors:

  • Model size = 0.1B params
  • Tensor type = F32