calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0695

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
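Putting the hyperparameters above together (learning_rate 0.001, a linear scheduler, and 40 epochs at 5 optimizer steps per epoch, i.e. 200 total steps per the results table), the learning-rate schedule can be sketched in plain Python. This is a minimal illustration, not code from the training run; the function name and the zero-warmup assumption are mine:

```python
def linear_lr(step, base_lr=1e-3, total_steps=200, warmup_steps=0):
    """Linear LR schedule: optional warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at warmup end to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# 40 epochs x 5 optimizer steps per epoch = 200 total steps
print(linear_lr(0))    # full base LR (0.001) at the start
print(linear_lr(100))  # halved at the midpoint
print(linear_lr(200))  # decayed to 0.0 at the end
```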

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.0686 | 1.0 | 5 | 2.3473 |
| 2.1497 | 2.0 | 10 | 1.8466 |
| 1.7093 | 3.0 | 15 | 1.4770 |
| 1.3516 | 4.0 | 20 | 1.1862 |
| 1.1256 | 5.0 | 25 | 1.0369 |
| 0.9968 | 6.0 | 30 | 0.9148 |
| 0.8740 | 7.0 | 35 | 0.7848 |
| 0.7769 | 8.0 | 40 | 0.7072 |
| 0.7493 | 9.0 | 45 | 0.6786 |
| 0.6867 | 10.0 | 50 | 0.6278 |
| 0.6252 | 11.0 | 55 | 0.5787 |
| 0.5738 | 12.0 | 60 | 0.5314 |
| 0.5339 | 13.0 | 65 | 0.4908 |
| 0.5060 | 14.0 | 70 | 0.4669 |
| 0.4701 | 15.0 | 75 | 0.4442 |
| 0.4516 | 16.0 | 80 | 0.4094 |
| 0.4198 | 17.0 | 85 | 0.3800 |
| 0.3867 | 18.0 | 90 | 0.3529 |
| 0.3603 | 19.0 | 95 | 0.3184 |
| 0.3273 | 20.0 | 100 | 0.2823 |
| 0.2989 | 21.0 | 105 | 0.2565 |
| 0.2673 | 22.0 | 110 | 0.2198 |
| 0.2490 | 23.0 | 115 | 0.2012 |
| 0.2227 | 24.0 | 120 | 0.1754 |
| 0.1998 | 25.0 | 125 | 0.1565 |
| 0.1802 | 26.0 | 130 | 0.1433 |
| 0.1650 | 27.0 | 135 | 0.1301 |
| 0.1544 | 28.0 | 140 | 0.1162 |
| 0.1430 | 29.0 | 145 | 0.1074 |
| 0.1336 | 30.0 | 150 | 0.0990 |
| 0.1267 | 31.0 | 155 | 0.0942 |
| 0.1185 | 32.0 | 160 | 0.0890 |
| 0.1142 | 33.0 | 165 | 0.0854 |
| 0.1076 | 34.0 | 170 | 0.0804 |
| 0.1030 | 35.0 | 175 | 0.0762 |
| 0.1002 | 36.0 | 180 | 0.0765 |
| 0.0973 | 37.0 | 185 | 0.0717 |
| 0.0969 | 38.0 | 190 | 0.0705 |
| 0.0951 | 39.0 | 195 | 0.0697 |
| 0.0933 | 40.0 | 200 | 0.0695 |
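As a quick summary of the results above, the overall convergence can be checked in a few lines of Python. The loss values are copied directly from the table; the reduction figure simply follows from them:

```python
# Validation loss at selected epochs, copied from the results table.
val_loss = {1: 2.3473, 10: 0.6278, 20: 0.2823, 30: 0.0990, 40: 0.0695}

# Relative reduction in validation loss from the first to the last epoch.
improvement = 1 - val_loss[40] / val_loss[1]
print(f"Validation loss fell by {improvement:.1%} over 40 epochs")  # ~97%
```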

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 7.8M parameters (F32, Safetensors)