secret-model-stage-1-4B-32

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2228
  • Centroid Acc: 0.9623
  • Centroid Macro F1: 0.9609
  • Knn Acc: 0.9811
  • Knn Macro F1: 0.9805
  • Alignment: 0.4662 (alignment and uniformity are illustrated in the sketch after this list)
  • Uniformity: -2.9420
  • Combined Score: 0.9674
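
The card does not say how Alignment and Uniformity are computed. A common convention for embedding models is the one from Wang & Isola (2020): alignment is the mean (squared) distance between positive pairs, and uniformity is the log of the mean Gaussian potential over all pairs. The sketch below is a minimal illustration under that assumption; the function names and toy tensors are hypothetical and not part of this repository.

```python
import torch
import torch.nn.functional as F

def alignment(x: torch.Tensor, y: torch.Tensor, alpha: int = 2) -> torch.Tensor:
    # Mean distance^alpha between positive pairs; lower is better.
    # x, y: L2-normalized embeddings of matched pairs, shape (N, D).
    return (x - y).norm(dim=1).pow(alpha).mean()

def uniformity(x: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    # Log of the mean Gaussian potential over all pairwise distances;
    # more negative means embeddings are spread more uniformly on the sphere.
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

# Toy usage with random unit vectors standing in for model embeddings.
x = F.normalize(torch.randn(128, 32), dim=1)
y = F.normalize(x + 0.1 * torch.randn(128, 32), dim=1)
print(alignment(x, y).item(), uniformity(x).item())
```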

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch reconstructing the optimizer and schedule in code follows the list):

  • learning_rate: 0.001
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 100.0
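
For concreteness, here is a minimal sketch of how these settings map onto PyTorch and transformers. The model stand-in is hypothetical; the step counts are inferred from the log below, which shows 32 optimizer steps per epoch (3200 steps over 100 epochs).

```python
import torch
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(32, 32)        # hypothetical stand-in for the actual model
num_epochs, steps_per_epoch = 100, 32  # 3200 total steps, per the training log
total_steps = num_epochs * steps_per_epoch

# AdamW with the betas/epsilon listed above.
optimizer = torch.optim.AdamW(
    model.parameters(), lr=1e-3, betas=(0.9, 0.999), eps=1e-8
)
# Cosine decay with a linear warmup over the first 6% of steps
# (lr_scheduler_warmup_ratio = 0.06).
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.06 * total_steps),
    num_training_steps=total_steps,
)
```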

Training results

| Training Loss | Epoch  | Step | Validation Loss | Centroid Acc | Centroid Macro F1 | Knn Acc | Knn Macro F1 | Alignment | Uniformity | Combined Score |
|:-------------:|:------:|:----:|:---------------:|:------------:|:-----------------:|:-------:|:------------:|:---------:|:----------:|:--------------:|
| No log        | 0      | 0    | 2.6091          | 0.5849       | 0.5621            | 0.6226  | 0.6153       | 0.4045    | -0.9519    | 0.5798         |
| 1.3289        | 3.125  | 100  | 0.9388          | 0.7925       | 0.7961            | 0.8679  | 0.8603       | 0.4598    | -2.0559    | 0.8175         |
| 1.1272        | 6.25   | 200  | 0.9040          | 0.9245       | 0.9206            | 0.9623  | 0.9634       | 0.3811    | -1.9263    | 0.9349         |
| 0.6096        | 9.375  | 300  | 0.4941          | 0.9245       | 0.9266            | 0.9623  | 0.9651       | 0.3902    | -2.2009    | 0.9394         |
| 0.2651        | 12.5   | 400  | 0.2747          | 0.9434       | 0.9380            | 0.9623  | 0.9612       | 0.4220    | -2.5885    | 0.9458         |
| 0.3661        | 15.625 | 500  | 0.2597          | 0.9623       | 0.9609            | 0.9623  | 0.9612       | 0.4173    | -2.5232    | 0.9610         |
| 0.2657        | 18.75  | 600  | 0.4085          | 0.9623       | 0.9609            | 0.9623  | 0.9609       | 0.4916    | -2.8044    | 0.9609         |
| 0.1856        | 21.875 | 700  | 0.1737          | 0.9623       | 0.9609            | 0.9434  | 0.9377       | 0.4633    | -2.8201    | 0.9532         |
| 0.0542        | 25.0   | 800  | 0.3073          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4630    | -2.7721    | 0.9674         |
| 0.0419        | 28.125 | 900  | 0.4531          | 0.9623       | 0.9609            | 0.9434  | 0.9414       | 0.5197    | -2.8544    | 0.9544         |
| 0.0267        | 31.25  | 1000 | 0.2948          | 0.9811       | 0.9805            | 0.9623  | 0.9609       | 0.4629    | -2.8576    | 0.9740         |
| 0.0577        | 34.375 | 1100 | 0.1827          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4662    | -2.7829    | 0.9805         |
| 0.0301        | 37.5   | 1200 | 0.2362          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4691    | -2.9109    | 0.9674         |
| 0.0117        | 40.625 | 1300 | 0.2933          | 0.9434       | 0.9414            | 0.9811  | 0.9805       | 0.4807    | -2.9007    | 0.9544         |
| 0.0085        | 43.75  | 1400 | 0.1404          | 0.9623       | 0.9609            | 0.9623  | 0.9609       | 0.4570    | -2.8762    | 0.9609         |
| 0.0188        | 46.875 | 1500 | 0.2016          | 0.9623       | 0.9609            | 0.9623  | 0.9609       | 0.4659    | -2.8799    | 0.9609         |
| 0.0015        | 50.0   | 1600 | 0.1932          | 0.9811       | 0.9805            | 0.9623  | 0.9609       | 0.4630    | -2.8989    | 0.9740         |
| 0.0559        | 53.125 | 1700 | 0.1840          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4485    | -2.8993    | 0.9805         |
| 0.001         | 56.25  | 1800 | 0.1984          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4641    | -2.9320    | 0.9805         |
| 0.022         | 59.375 | 1900 | 0.2263          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4727    | -2.9252    | 0.9674         |
| 0.0174        | 62.5   | 2000 | 0.2089          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4503    | -2.8793    | 0.9805         |
| 0.0012        | 65.625 | 2100 | 0.1445          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4540    | -2.9320    | 0.9805         |
| 0.0012        | 68.75  | 2200 | 0.1837          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4585    | -2.9431    | 0.9805         |
| 0.0013        | 71.875 | 2300 | 0.1845          | 0.9623       | 0.9609            | 0.9623  | 0.9609       | 0.4651    | -2.9513    | 0.9609         |
| 0.001         | 75.0   | 2400 | 0.1531          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4545    | -2.9333    | 0.9805         |
| 0.0008        | 78.125 | 2500 | 0.1951          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4601    | -2.9389    | 0.9805         |
| 0.0009        | 81.25  | 2600 | 0.1417          | 0.9811       | 0.9805            | 0.9811  | 0.9805       | 0.4511    | -2.9282    | 0.9805         |
| 0.0109        | 84.375 | 2700 | 0.2066          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4686    | -2.9577    | 0.9674         |
| 0.0009        | 87.5   | 2800 | 0.2112          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4651    | -2.9433    | 0.9674         |
| 0.0008        | 90.625 | 2900 | 0.2346          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4693    | -2.9437    | 0.9674         |
| 0.001         | 93.75  | 3000 | 0.2237          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4660    | -2.9412    | 0.9674         |
| 0.0296        | 96.875 | 3100 | 0.2244          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4663    | -2.9415    | 0.9674         |
| 0.0008        | 100.0  | 3200 | 0.2228          | 0.9623       | 0.9609            | 0.9811  | 0.9805       | 0.4662    | -2.9420    | 0.9674         |
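
The card does not specify the evaluation protocol behind the Centroid and Knn metrics. A common setup is to fit a nearest-centroid classifier and a k-nearest-neighbors classifier on training-set embeddings and score them on the held-out set. The sketch below illustrates that protocol with scikit-learn (not listed in the framework versions, so an assumption); the random arrays stand in for real embeddings, and k=5 is chosen arbitrarily.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid

# Placeholders for embeddings produced by the model and their labels.
rng = np.random.default_rng(42)
train_emb, train_y = rng.normal(size=(200, 32)), rng.integers(0, 4, size=200)
val_emb, val_y = rng.normal(size=(53, 32)), rng.integers(0, 4, size=53)

for name, clf in [("centroid", NearestCentroid()),
                  ("knn", KNeighborsClassifier(n_neighbors=5))]:
    pred = clf.fit(train_emb, train_y).predict(val_emb)
    print(f"{name}: acc={accuracy_score(val_y, pred):.4f} "
          f"macro_f1={f1_score(val_y, pred, average='macro'):.4f}")
```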

Framework versions

  • Transformers 4.56.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.0

Model size

  • 82k params
  • Tensor type: F32 (safetensors)