# secret-model-stage-1-4B-32

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.2228
- Centroid Acc: 0.9623
- Centroid Macro F1: 0.9609
- kNN Acc: 0.9811
- kNN Macro F1: 0.9805
- Alignment: 0.4662
- Uniformity: -2.9420
- Combined Score: 0.9674
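The card does not define the Alignment and Uniformity metrics. Assuming they follow the standard contrastive-representation-learning definitions of Wang & Isola (2020) — which is an assumption, not something the card confirms — a minimal NumPy sketch would be:

```python
import numpy as np

# Hedged sketch of Alignment / Uniformity, assuming the Wang & Isola
# (2020) contrastive-learning definitions; the card does not state
# which formulation was actually used.

def alignment(x, y, alpha=2):
    """Mean distance between L2-normalized embeddings of positive pairs.

    Lower is better; 0 means every positive pair maps to the same point.
    """
    return float(np.mean(np.linalg.norm(x - y, axis=1) ** alpha))

def uniformity(x, t=2):
    """Log of the mean Gaussian potential over distinct embedding pairs.

    More negative means embeddings spread more evenly on the hypersphere.
    """
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    i, j = np.triu_indices(len(x), k=1)  # distinct pairs only
    return float(np.log(np.mean(np.exp(-t * sq_dists[i, j]))))
```

The reported values are at least consistent in sign with these definitions: alignment is non-negative by construction, and uniformity is at most zero (as with the reported -2.9420).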
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 100.0
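The hyperparameters above map onto the Hugging Face `TrainingArguments` API roughly as follows. This is a sketch: `output_dir` is a placeholder, and any argument not listed above (e.g. weight decay, gradient accumulation) is left at its default because the card does not report it.

```python
from transformers import TrainingArguments

# Sketch reproducing only the hyperparameters listed in this card;
# output_dir is a placeholder, everything unlisted stays at its default.
args = TrainingArguments(
    output_dir="secret-model-stage-1-4B-32",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,
    num_train_epochs=100.0,
)
```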
### Training results
| Training Loss | Epoch | Step | Validation Loss | Centroid Acc | Centroid Macro F1 | kNN Acc | kNN Macro F1 | Alignment | Uniformity | Combined Score |
|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.6091 | 0.5849 | 0.5621 | 0.6226 | 0.6153 | 0.4045 | -0.9519 | 0.5798 |
| 1.3289 | 3.125 | 100 | 0.9388 | 0.7925 | 0.7961 | 0.8679 | 0.8603 | 0.4598 | -2.0559 | 0.8175 |
| 1.1272 | 6.25 | 200 | 0.9040 | 0.9245 | 0.9206 | 0.9623 | 0.9634 | 0.3811 | -1.9263 | 0.9349 |
| 0.6096 | 9.375 | 300 | 0.4941 | 0.9245 | 0.9266 | 0.9623 | 0.9651 | 0.3902 | -2.2009 | 0.9394 |
| 0.2651 | 12.5 | 400 | 0.2747 | 0.9434 | 0.9380 | 0.9623 | 0.9612 | 0.4220 | -2.5885 | 0.9458 |
| 0.3661 | 15.625 | 500 | 0.2597 | 0.9623 | 0.9609 | 0.9623 | 0.9612 | 0.4173 | -2.5232 | 0.9610 |
| 0.2657 | 18.75 | 600 | 0.4085 | 0.9623 | 0.9609 | 0.9623 | 0.9609 | 0.4916 | -2.8044 | 0.9609 |
| 0.1856 | 21.875 | 700 | 0.1737 | 0.9623 | 0.9609 | 0.9434 | 0.9377 | 0.4633 | -2.8201 | 0.9532 |
| 0.0542 | 25.0 | 800 | 0.3073 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4630 | -2.7721 | 0.9674 |
| 0.0419 | 28.125 | 900 | 0.4531 | 0.9623 | 0.9609 | 0.9434 | 0.9414 | 0.5197 | -2.8544 | 0.9544 |
| 0.0267 | 31.25 | 1000 | 0.2948 | 0.9811 | 0.9805 | 0.9623 | 0.9609 | 0.4629 | -2.8576 | 0.9740 |
| 0.0577 | 34.375 | 1100 | 0.1827 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4662 | -2.7829 | 0.9805 |
| 0.0301 | 37.5 | 1200 | 0.2362 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4691 | -2.9109 | 0.9674 |
| 0.0117 | 40.625 | 1300 | 0.2933 | 0.9434 | 0.9414 | 0.9811 | 0.9805 | 0.4807 | -2.9007 | 0.9544 |
| 0.0085 | 43.75 | 1400 | 0.1404 | 0.9623 | 0.9609 | 0.9623 | 0.9609 | 0.4570 | -2.8762 | 0.9609 |
| 0.0188 | 46.875 | 1500 | 0.2016 | 0.9623 | 0.9609 | 0.9623 | 0.9609 | 0.4659 | -2.8799 | 0.9609 |
| 0.0015 | 50.0 | 1600 | 0.1932 | 0.9811 | 0.9805 | 0.9623 | 0.9609 | 0.4630 | -2.8989 | 0.9740 |
| 0.0559 | 53.125 | 1700 | 0.1840 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4485 | -2.8993 | 0.9805 |
| 0.001 | 56.25 | 1800 | 0.1984 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4641 | -2.9320 | 0.9805 |
| 0.022 | 59.375 | 1900 | 0.2263 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4727 | -2.9252 | 0.9674 |
| 0.0174 | 62.5 | 2000 | 0.2089 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4503 | -2.8793 | 0.9805 |
| 0.0012 | 65.625 | 2100 | 0.1445 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4540 | -2.9320 | 0.9805 |
| 0.0012 | 68.75 | 2200 | 0.1837 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4585 | -2.9431 | 0.9805 |
| 0.0013 | 71.875 | 2300 | 0.1845 | 0.9623 | 0.9609 | 0.9623 | 0.9609 | 0.4651 | -2.9513 | 0.9609 |
| 0.001 | 75.0 | 2400 | 0.1531 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4545 | -2.9333 | 0.9805 |
| 0.0008 | 78.125 | 2500 | 0.1951 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4601 | -2.9389 | 0.9805 |
| 0.0009 | 81.25 | 2600 | 0.1417 | 0.9811 | 0.9805 | 0.9811 | 0.9805 | 0.4511 | -2.9282 | 0.9805 |
| 0.0109 | 84.375 | 2700 | 0.2066 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4686 | -2.9577 | 0.9674 |
| 0.0009 | 87.5 | 2800 | 0.2112 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4651 | -2.9433 | 0.9674 |
| 0.0008 | 90.625 | 2900 | 0.2346 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4693 | -2.9437 | 0.9674 |
| 0.001 | 93.75 | 3000 | 0.2237 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4660 | -2.9412 | 0.9674 |
| 0.0296 | 96.875 | 3100 | 0.2244 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4663 | -2.9415 | 0.9674 |
| 0.0008 | 100.0 | 3200 | 0.2228 | 0.9623 | 0.9609 | 0.9811 | 0.9805 | 0.4662 | -2.9420 | 0.9674 |
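The card does not say how the Combined Score is computed, but every row of the table is reproduced exactly (to four decimals) by a 2:1 weighted mean of Centroid Macro F1 and kNN Macro F1. This formula is inferred from the numbers only, not confirmed by the training code:

```python
# Inferred, not confirmed: the Combined Score column matches a 2:1
# weighted mean of Centroid Macro F1 and kNN Macro F1 on every row.
def combined_score(centroid_macro_f1, knn_macro_f1):
    return (2 * centroid_macro_f1 + knn_macro_f1) / 3

# Final row: Centroid Macro F1 = 0.9609, kNN Macro F1 = 0.9805
print(round(combined_score(0.9609, 0.9805), 4))  # → 0.9674
```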
### Framework versions
- Transformers 4.56.0
- PyTorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.0