secret-model-stage-1-0.6B-512

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9233
  • Centroid Acc: 0.8679
  • Centroid Macro F1: 0.8730
  • Knn Acc: 0.8491
  • Knn Macro F1: 0.8563
  • Alignment: 0.6904
  • Uniformity: -3.0335
  • Combined Score: 0.8674
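
The card does not define these metrics. As a reference only, the snippet below is a minimal sketch of how centroid accuracy, kNN accuracy, alignment, and uniformity are commonly computed for embedding models (alignment and uniformity in the sense of Wang & Isola, 2020). It is not the evaluation code used for this model, and it assumes L2-normalized embeddings `z` with integer class labels `y`.

```python
# Minimal sketch (not this card's own evaluation code): common definitions of
# centroid accuracy, kNN accuracy, alignment, and uniformity for embeddings.
# Assumes L2-normalized embeddings `z` (N, D) and integer labels `y` (N,).
import torch
import torch.nn.functional as F


def centroid_accuracy(z: torch.Tensor, y: torch.Tensor) -> float:
    """Assign each sample to the nearest class centroid by cosine similarity."""
    classes = y.unique()
    centroids = torch.stack([z[y == c].mean(dim=0) for c in classes])
    centroids = F.normalize(centroids, dim=-1)
    pred = classes[(z @ centroids.T).argmax(dim=-1)]
    return (pred == y).float().mean().item()


def knn_accuracy(z: torch.Tensor, y: torch.Tensor, k: int = 5) -> float:
    """Leave-one-out k-nearest-neighbour classification on cosine similarity."""
    sim = z @ z.T
    sim.fill_diagonal_(float("-inf"))      # exclude the sample itself
    idx = sim.topk(k, dim=-1).indices      # (N, k) neighbour indices
    pred = y[idx].mode(dim=-1).values      # majority vote over neighbours
    return (pred == y).float().mean().item()


def alignment(z: torch.Tensor, y: torch.Tensor, alpha: float = 2.0) -> float:
    """Mean distance^alpha between same-class (positive) pairs; lower is better."""
    dists = []
    for c in y.unique():
        zc = z[y == c]
        if len(zc) < 2:
            continue
        dists.append(torch.pdist(zc).pow(alpha))
    return torch.cat(dists).mean().item()


def uniformity(z: torch.Tensor, t: float = 2.0) -> float:
    """log E[exp(-t * ||zi - zj||^2)] over all pairs; more negative is better."""
    return torch.pdist(z).pow(2).mul(-t).exp().mean().log().item()
```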

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 100.0
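
For reference, the sketch below maps these hyperparameters onto Hugging Face `TrainingArguments`. The output directory is a placeholder, and the model, dataset, and any Trainer subclass used for this run are not specified in the card and are not shown.

```python
# Minimal sketch: the hyperparameters above expressed as Hugging Face
# TrainingArguments. output_dir is a placeholder; the actual model and
# dataset are not documented in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="secret-model-stage-1-0.6B-512",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,
    num_train_epochs=100.0,
)
```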

Training results

| Training Loss | Epoch | Step | Validation Loss | Centroid Acc | Centroid Macro F1 | Knn Acc | Knn Macro F1 | Alignment | Uniformity | Combined Score |
|---------------|-------|------|-----------------|--------------|-------------------|---------|--------------|-----------|------------|----------------|
| No log | 0 | 0 | 2.3522 | 0.6604 | 0.6607 | 0.8679 | 0.8627 | 0.3334 | -0.7995 | 0.7280 |
| 1.5883 | 3.125 | 100 | 1.3699 | 0.7736 | 0.7708 | 0.8679 | 0.8755 | 0.6157 | -2.0089 | 0.8057 |
| 1.3499 | 6.25 | 200 | 1.0940 | 0.9245 | 0.9226 | 0.8679 | 0.8721 | 0.5183 | -2.0487 | 0.9058 |
| 1.0313 | 9.375 | 300 | 1.0252 | 0.8679 | 0.8690 | 0.8679 | 0.8663 | 0.4744 | -1.9845 | 0.8681 |
| 0.527 | 12.5 | 400 | 0.8431 | 0.7925 | 0.7763 | 0.8302 | 0.8012 | 0.5916 | -2.6024 | 0.7846 |
| 0.566 | 15.625 | 500 | 0.8905 | 0.8491 | 0.8572 | 0.8491 | 0.8440 | 0.5574 | -2.4475 | 0.8528 |
| 0.5135 | 18.75 | 600 | 0.8551 | 0.8302 | 0.8318 | 0.8679 | 0.8672 | 0.5690 | -2.5328 | 0.8436 |
| 0.3032 | 21.875 | 700 | 0.9638 | 0.8491 | 0.8516 | 0.8302 | 0.8352 | 0.6250 | -2.6975 | 0.8461 |
| 0.2311 | 25.0 | 800 | 0.9164 | 0.8302 | 0.8323 | 0.8113 | 0.8084 | 0.6480 | -2.8651 | 0.8243 |
| 0.226 | 28.125 | 900 | 1.0117 | 0.9057 | 0.9097 | 0.8491 | 0.8534 | 0.6128 | -2.7859 | 0.8909 |
| 0.1224 | 31.25 | 1000 | 0.8148 | 0.8679 | 0.8682 | 0.8679 | 0.8655 | 0.6621 | -2.9576 | 0.8673 |
| 0.0862 | 34.375 | 1100 | 1.0351 | 0.8679 | 0.8720 | 0.8491 | 0.8403 | 0.6816 | -2.9616 | 0.8614 |
| 0.0973 | 37.5 | 1200 | 1.1075 | 0.8113 | 0.8060 | 0.8302 | 0.8223 | 0.6882 | -2.9441 | 0.8115 |
| 0.0396 | 40.625 | 1300 | 0.8058 | 0.8491 | 0.8527 | 0.8679 | 0.8690 | 0.6195 | -2.8024 | 0.8582 |
| 0.0139 | 43.75 | 1400 | 0.8962 | 0.8679 | 0.8690 | 0.8679 | 0.8726 | 0.6721 | -2.9618 | 0.8702 |
| 0.0278 | 46.875 | 1500 | 0.9030 | 0.8679 | 0.8690 | 0.8679 | 0.8646 | 0.6892 | -3.0348 | 0.8675 |
| 0.0158 | 50.0 | 1600 | 0.8332 | 0.8302 | 0.8371 | 0.8679 | 0.8726 | 0.6733 | -2.9483 | 0.8489 |
| 0.0787 | 53.125 | 1700 | 0.8769 | 0.8491 | 0.8527 | 0.8679 | 0.8690 | 0.6660 | -2.9786 | 0.8582 |
| 0.0061 | 56.25 | 1800 | 0.9462 | 0.8491 | 0.8527 | 0.8491 | 0.8563 | 0.6764 | -2.9874 | 0.8539 |
| 0.0203 | 59.375 | 1900 | 0.9591 | 0.8113 | 0.8060 | 0.8491 | 0.8527 | 0.6813 | -2.9815 | 0.8216 |
| 0.0286 | 62.5 | 2000 | 0.8517 | 0.8491 | 0.8527 | 0.8679 | 0.8755 | 0.6959 | -3.0418 | 0.8603 |
| 0.005 | 65.625 | 2100 | 0.8745 | 0.8679 | 0.8730 | 0.8679 | 0.8755 | 0.6853 | -3.0349 | 0.8738 |
| 0.0035 | 68.75 | 2200 | 0.8911 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6857 | -3.0217 | 0.8674 |
| 0.0038 | 71.875 | 2300 | 0.9111 | 0.8679 | 0.8730 | 0.8679 | 0.8755 | 0.6848 | -3.0124 | 0.8738 |
| 0.0041 | 75.0 | 2400 | 0.8897 | 0.8491 | 0.8510 | 0.8679 | 0.8755 | 0.6847 | -3.0196 | 0.8592 |
| 0.0051 | 78.125 | 2500 | 0.9212 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6903 | -3.0364 | 0.8674 |
| 0.0037 | 81.25 | 2600 | 0.9200 | 0.8679 | 0.8730 | 0.8679 | 0.8755 | 0.6847 | -3.0164 | 0.8738 |
| 0.0228 | 84.375 | 2700 | 0.9245 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6801 | -3.0048 | 0.8674 |
| 0.0027 | 87.5 | 2800 | 0.9212 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6883 | -3.0251 | 0.8674 |
| 0.004 | 90.625 | 2900 | 0.9288 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6930 | -3.0377 | 0.8674 |
| 0.0032 | 93.75 | 3000 | 0.9245 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6914 | -3.0362 | 0.8674 |
| 0.0322 | 96.875 | 3100 | 0.9249 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6907 | -3.0344 | 0.8674 |
| 0.002 | 100.0 | 3200 | 0.9233 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6904 | -3.0335 | 0.8674 |

Framework versions

  • Transformers 4.56.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.0
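
To approximate the training environment, the installed framework versions can be compared against those listed above. The sketch below only prints installed versus expected versions; the `+cu128` suffix indicates a CUDA 12.8 build of PyTorch.

```python
# Sketch: compare installed framework versions against those listed in the card.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": "4.56.0",
    "torch": "2.8.0",        # card lists 2.8.0+cu128 (CUDA 12.8 build)
    "datasets": "4.0.0",
    "tokenizers": "0.22.0",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__.split("+")[0],
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    print(f"{name}: installed {installed[name]}, card lists {want}")
```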

Safetensors

  • Model size: 525k params
  • Tensor type: F32