# ij6lz8iv_20250703_005007
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.8409
- Model Preparation Time: 0.0077
- Move Accuracy: 0.1373
- Token Accuracy: 0.6813
- Accuracy: 0.1373
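Token accuracy and move accuracy measure different things: token accuracy counts per-token agreement, while move accuracy requires every token of a predicted move to match, so it is always the stricter metric (here 0.68 vs. 0.14). A minimal illustrative sketch of the distinction, using hypothetical tokenized moves (not the card's actual evaluation code):

```python
def token_accuracy(preds, targets):
    # Fraction of individual tokens that match, pooled over all pairs.
    total = correct = 0
    for p, t in zip(preds, targets):
        for pt, tt in zip(p, t):
            total += 1
            correct += (pt == tt)
    return correct / total

def move_accuracy(preds, targets):
    # Fraction of predictions where the whole move matches exactly.
    exact = sum(p == t for p, t in zip(preds, targets))
    return exact / len(preds)

# Hypothetical moves, each tokenized into two square tokens.
preds   = [["e2", "e4"], ["g1", "f3"], ["d2", "d3"]]
targets = [["e2", "e4"], ["g1", "f3"], ["d2", "d4"]]
print(token_accuracy(preds, targets))  # 5/6, one wrong token
print(move_accuracy(preds, targets))   # 2/3, one wrong move
```

A single wrong token sinks the whole move, which is why move accuracy lags token accuracy throughout the training table below.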
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 256
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
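The `constant_with_warmup` schedule ramps the learning rate linearly from 0 to the base rate over the warmup steps (here a warmup ratio of 0.001 of total steps), then holds it constant. A minimal sketch of that schedule, mirroring the behavior of `transformers`' `get_constant_schedule_with_warmup`:

```python
def constant_with_warmup_lr(step, num_warmup_steps, base_lr=5e-4):
    # Linear warmup from 0 to base_lr, then constant thereafter.
    if step < num_warmup_steps:
        return base_lr * step / max(1, num_warmup_steps)
    return base_lr

# With a warmup ratio of 0.001, warmup is a tiny fraction of training:
print(constant_with_warmup_lr(0, 100))    # 0.0
print(constant_with_warmup_lr(50, 100))   # 2.5e-4, halfway through warmup
print(constant_with_warmup_lr(5000, 100)) # 5e-4, constant after warmup
```

Because the rate never decays, a fairly high 5e-4 is applied for the whole run, which may contribute to the divergence visible late in the results table.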
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Move Accuracy | Token Accuracy | Accuracy |
|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.9474 | 0.0077 | 0.0 | 0.0000 | 0.0 |
| 2.5096 | 0.0098 | 100 | 2.4564 | 0.0077 | 0.0 | 0.2960 | 0.0 |
| 1.629 | 0.0196 | 200 | 1.6088 | 0.0077 | 0.0010 | 0.3821 | 0.0010 |
| 1.4227 | 0.0295 | 300 | 1.4684 | 0.0077 | 0.0075 | 0.4341 | 0.0075 |
| 1.4087 | 0.0393 | 400 | 1.4635 | 0.0077 | 0.0033 | 0.4369 | 0.0033 |
| 1.4064 | 0.0491 | 500 | 1.4361 | 0.0077 | 0.0037 | 0.4405 | 0.0037 |
| 1.3376 | 0.0589 | 600 | 1.3867 | 0.0077 | 0.0113 | 0.4673 | 0.0113 |
| 1.3273 | 0.0687 | 700 | 1.3559 | 0.0077 | 0.0156 | 0.4792 | 0.0156 |
| 1.3902 | 0.0785 | 800 | 1.3272 | 0.0077 | 0.0210 | 0.4934 | 0.0210 |
| 1.3093 | 0.0884 | 900 | 1.2895 | 0.0077 | 0.0353 | 0.5078 | 0.0353 |
| 1.2923 | 0.0982 | 1000 | 1.2848 | 0.0077 | 0.0331 | 0.5097 | 0.0331 |
| 1.2313 | 0.1080 | 1100 | 1.2555 | 0.0077 | 0.0382 | 0.5183 | 0.0382 |
| 1.1623 | 0.1178 | 1200 | 1.2285 | 0.0077 | 0.0430 | 0.5295 | 0.0430 |
| 1.207 | 0.1276 | 1300 | 1.2233 | 0.0077 | 0.0429 | 0.5311 | 0.0429 |
| 1.2216 | 0.1374 | 1400 | 1.1983 | 0.0077 | 0.0458 | 0.5424 | 0.0458 |
| 1.1246 | 0.1473 | 1500 | 1.1767 | 0.0077 | 0.0543 | 0.5515 | 0.0543 |
| 1.1268 | 0.1571 | 1600 | 1.1412 | 0.0077 | 0.0575 | 0.5639 | 0.0575 |
| 1.1628 | 0.1669 | 1700 | 1.1222 | 0.0077 | 0.0542 | 0.5696 | 0.0542 |
| 1.1146 | 0.1767 | 1800 | 1.1067 | 0.0077 | 0.0587 | 0.5738 | 0.0587 |
| 1.0298 | 0.1865 | 1900 | 1.0852 | 0.0077 | 0.0669 | 0.5843 | 0.0669 |
| 1.0159 | 0.1963 | 2000 | 1.0702 | 0.0077 | 0.0659 | 0.5897 | 0.0659 |
| 1.0201 | 0.2062 | 2100 | 1.0468 | 0.0077 | 0.0667 | 0.5963 | 0.0667 |
| 1.0362 | 0.2160 | 2200 | 1.0304 | 0.0077 | 0.0719 | 0.6053 | 0.0719 |
| 1.086 | 0.2258 | 2300 | 1.0263 | 0.0077 | 0.0766 | 0.6090 | 0.0766 |
| 0.9216 | 0.2356 | 2400 | 0.9943 | 0.0077 | 0.0856 | 0.6203 | 0.0856 |
| 0.9637 | 0.2454 | 2500 | 0.9795 | 0.0077 | 0.0881 | 0.6285 | 0.0881 |
| 0.9815 | 0.2553 | 2600 | 0.9631 | 0.0077 | 0.0966 | 0.6334 | 0.0966 |
| 0.989 | 0.2651 | 2700 | 0.9537 | 0.0077 | 0.0953 | 0.6350 | 0.0953 |
| 0.9309 | 0.2749 | 2800 | 0.9459 | 0.0077 | 0.0944 | 0.6392 | 0.0944 |
| 0.8757 | 0.2847 | 2900 | 0.9335 | 0.0077 | 0.1034 | 0.6463 | 0.1034 |
| 0.9179 | 0.2945 | 3000 | 0.9261 | 0.0077 | 0.1020 | 0.6466 | 0.1020 |
| 0.8935 | 0.3043 | 3100 | 0.9172 | 0.0077 | 0.1060 | 0.6551 | 0.1060 |
| 0.8617 | 0.3142 | 3200 | 0.9008 | 0.0077 | 0.1081 | 0.6562 | 0.1081 |
| 0.9169 | 0.3240 | 3300 | 0.8863 | 0.0077 | 0.1153 | 0.6625 | 0.1153 |
| 0.891 | 0.3338 | 3400 | 0.8877 | 0.0077 | 0.1060 | 0.6604 | 0.1060 |
| 0.8145 | 0.3436 | 3500 | 0.8715 | 0.0077 | 0.1216 | 0.6656 | 0.1216 |
| 0.9315 | 0.3534 | 3600 | 0.8797 | 0.0077 | 0.1169 | 0.6672 | 0.1169 |
| 0.8221 | 0.3632 | 3700 | 0.8665 | 0.0077 | 0.1172 | 0.6654 | 0.1172 |
| 0.8791 | 0.3731 | 3800 | 0.8571 | 0.0077 | 0.1232 | 0.6716 | 0.1232 |
| 0.8133 | 0.3829 | 3900 | 0.8550 | 0.0077 | 0.1257 | 0.6736 | 0.1257 |
| 0.9025 | 0.3927 | 4000 | 0.8761 | 0.0077 | 0.1203 | 0.6675 | 0.1203 |
| 0.8351 | 0.4025 | 4100 | 0.8434 | 0.0077 | 0.1210 | 0.6773 | 0.1210 |
| 0.822 | 0.4123 | 4200 | 0.8683 | 0.0077 | 0.1233 | 0.6696 | 0.1233 |
| 0.8318 | 0.4221 | 4300 | 0.8381 | 0.0077 | 0.1328 | 0.6805 | 0.1328 |
| 0.8889 | 0.4320 | 4400 | 0.8390 | 0.0077 | 0.1286 | 0.6771 | 0.1286 |
| 0.8065 | 0.4418 | 4500 | 0.8356 | 0.0077 | 0.1330 | 0.6813 | 0.1330 |
| 0.7774 | 0.4516 | 4600 | 0.8270 | 0.0077 | 0.1309 | 0.6849 | 0.1309 |
| 0.8561 | 0.4614 | 4700 | 0.8368 | 0.0077 | 0.1227 | 0.6794 | 0.1227 |
| 0.8441 | 0.4712 | 4800 | 0.8304 | 0.0077 | 0.1290 | 0.6846 | 0.1290 |
| 0.8299 | 0.4811 | 4900 | 0.8409 | 0.0077 | 0.1373 | 0.6813 | 0.1373 |
| 0.8636 | 0.4909 | 5000 | 0.8305 | 0.0077 | 0.1356 | 0.6828 | 0.1356 |
| 0.8331 | 0.5007 | 5100 | 0.8344 | 0.0077 | 0.1323 | 0.6815 | 0.1323 |
| 0.836 | 0.5105 | 5200 | 0.8442 | 0.0077 | 0.1272 | 0.6779 | 0.1272 |
| 0.8766 | 0.5203 | 5300 | 0.8371 | 0.0077 | 0.1249 | 0.6814 | 0.1249 |
| 0.8162 | 0.5301 | 5400 | 0.8371 | 0.0077 | 0.1240 | 0.6807 | 0.1240 |
| 0.7921 | 0.5400 | 5500 | 0.8223 | 0.0077 | 0.1343 | 0.6865 | 0.1343 |
| 0.7942 | 0.5498 | 5600 | 0.8296 | 0.0077 | 0.1232 | 0.6830 | 0.1232 |
| 0.7946 | 0.5596 | 5700 | 0.8457 | 0.0077 | 0.1232 | 0.6773 | 0.1232 |
| 0.8734 | 0.5694 | 5800 | 0.8413 | 0.0077 | 0.1243 | 0.6789 | 0.1243 |
| 0.8549 | 0.5792 | 5900 | 0.8234 | 0.0077 | 0.1319 | 0.6825 | 0.1319 |
| 0.8043 | 0.5890 | 6000 | 0.8351 | 0.0077 | 0.1222 | 0.6796 | 0.1222 |
| 0.7978 | 0.5989 | 6100 | 0.8191 | 0.0077 | 0.1319 | 0.6859 | 0.1319 |
| 0.8612 | 0.6087 | 6200 | 0.8391 | 0.0077 | 0.1273 | 0.6809 | 0.1273 |
| 0.8694 | 0.6185 | 6300 | 0.8459 | 0.0077 | 0.1236 | 0.6746 | 0.1236 |
| 0.7745 | 0.6283 | 6400 | 0.8288 | 0.0077 | 0.1254 | 0.6826 | 0.1254 |
| 0.8127 | 0.6381 | 6500 | 0.8308 | 0.0077 | 0.1288 | 0.6835 | 0.1288 |
| 0.8932 | 0.6479 | 6600 | 0.8228 | 0.0077 | 0.1277 | 0.6870 | 0.1277 |
| 0.7347 | 0.6578 | 6700 | 0.8444 | 0.0077 | 0.1238 | 0.6784 | 0.1238 |
| 0.8916 | 0.6676 | 6800 | 0.8367 | 0.0077 | 0.1229 | 0.6785 | 0.1229 |
| 0.8266 | 0.6774 | 6900 | 0.8429 | 0.0077 | 0.1090 | 0.6765 | 0.1090 |
| 0.8434 | 0.6872 | 7000 | 0.8761 | 0.0077 | 0.1111 | 0.6663 | 0.1111 |
| 0.9014 | 0.6970 | 7100 | 0.8662 | 0.0077 | 0.1118 | 0.6691 | 0.1118 |
| 0.8635 | 0.7069 | 7200 | 0.8907 | 0.0077 | 0.1031 | 0.6571 | 0.1031 |
| 0.8141 | 0.7167 | 7300 | 0.8601 | 0.0077 | 0.1106 | 0.6697 | 0.1106 |
| 0.8412 | 0.7265 | 7400 | 0.8633 | 0.0077 | 0.1105 | 0.6689 | 0.1105 |
| 0.8099 | 0.7363 | 7500 | 0.8548 | 0.0077 | 0.1145 | 0.6730 | 0.1145 |
| 0.8797 | 0.7461 | 7600 | 0.8901 | 0.0077 | 0.0986 | 0.6579 | 0.0986 |
| 0.868 | 0.7559 | 7700 | 0.8813 | 0.0077 | 0.1093 | 0.6631 | 0.1093 |
| 0.9255 | 0.7658 | 7800 | 0.8970 | 0.0077 | 0.0969 | 0.6562 | 0.0969 |
| 0.852 | 0.7756 | 7900 | 0.8818 | 0.0077 | 0.1071 | 0.6605 | 0.1071 |
| 0.877 | 0.7854 | 8000 | 0.8657 | 0.0077 | 0.1096 | 0.6682 | 0.1096 |
| 0.8785 | 0.7952 | 8100 | 0.9094 | 0.0077 | 0.0981 | 0.6510 | 0.0981 |
| 0.886 | 0.8050 | 8200 | 0.8944 | 0.0077 | 0.0980 | 0.6568 | 0.0980 |
| 0.8839 | 0.8148 | 8300 | 0.8853 | 0.0077 | 0.1000 | 0.6599 | 0.1000 |
| 0.9859 | 0.8247 | 8400 | 0.9503 | 0.0077 | 0.0826 | 0.6377 | 0.0826 |
| 0.8999 | 0.8345 | 8500 | 0.9457 | 0.0077 | 0.0902 | 0.6391 | 0.0902 |
| 0.8964 | 0.8443 | 8600 | 0.9139 | 0.0077 | 0.0958 | 0.6479 | 0.0958 |
| 0.9995 | 0.8541 | 8700 | 1.0455 | 0.0077 | 0.0686 | 0.6082 | 0.0686 |
| 0.9112 | 0.8639 | 8800 | 0.9424 | 0.0077 | 0.0852 | 0.6378 | 0.0852 |
| 1.0122 | 0.8737 | 8900 | 0.9558 | 0.0077 | 0.0813 | 0.6346 | 0.0813 |
| 0.9506 | 0.8836 | 9000 | 0.9808 | 0.0077 | 0.0786 | 0.6226 | 0.0786 |
| 0.914 | 0.8934 | 9100 | 0.9319 | 0.0077 | 0.0915 | 0.6397 | 0.0915 |
| 1.0686 | 0.9032 | 9200 | 1.0701 | 0.0077 | 0.0703 | 0.5960 | 0.0703 |
| 0.9708 | 0.9130 | 9300 | 0.9579 | 0.0077 | 0.0784 | 0.6361 | 0.0784 |
| 0.9476 | 0.9228 | 9400 | 0.9569 | 0.0077 | 0.0794 | 0.6348 | 0.0794 |
| 1.0408 | 0.9327 | 9500 | 0.9947 | 0.0077 | 0.0775 | 0.6219 | 0.0775 |
| 1.1124 | 0.9425 | 9600 | 1.0990 | 0.0077 | 0.0548 | 0.5791 | 0.0548 |
| 1.0363 | 0.9523 | 9700 | 0.9816 | 0.0077 | 0.0744 | 0.6219 | 0.0744 |
| 1.0304 | 0.9621 | 9800 | 1.0399 | 0.0077 | 0.0707 | 0.6010 | 0.0707 |
| 1.0132 | 0.9719 | 9900 | 0.9934 | 0.0077 | 0.0723 | 0.6163 | 0.0723 |
| 1.008 | 0.9817 | 10000 | 1.0073 | 0.0077 | 0.0647 | 0.6130 | 0.0647 |
| 1.0327 | 0.9916 | 10100 | 1.0336 | 0.0077 | 0.0655 | 0.6056 | 0.0655 |
| 0.9572 | 1.0014 | 10200 | 0.9807 | 0.0077 | 0.0766 | 0.6248 | 0.0766 |
| 1.029 | 1.0112 | 10300 | 0.9938 | 0.0077 | 0.0698 | 0.6196 | 0.0698 |
| 1.0337 | 1.0210 | 10400 | 1.0443 | 0.0077 | 0.0584 | 0.5985 | 0.0584 |
| 1.1123 | 1.0308 | 10500 | 1.1204 | 0.0077 | 0.0429 | 0.5711 | 0.0429 |
| 1.0404 | 1.0406 | 10600 | 1.0257 | 0.0077 | 0.0670 | 0.6060 | 0.0670 |
| 1.0153 | 1.0505 | 10700 | 1.0338 | 0.0077 | 0.0582 | 0.6008 | 0.0582 |
| 1.0272 | 1.0603 | 10800 | 1.0352 | 0.0077 | 0.0601 | 0.6043 | 0.0601 |
| 1.0774 | 1.0701 | 10900 | 1.0505 | 0.0077 | 0.0574 | 0.5995 | 0.0574 |
| 1.1947 | 1.0799 | 11000 | 1.2107 | 0.0077 | 0.0283 | 0.5330 | 0.0283 |
| 1.1199 | 1.0897 | 11100 | 1.1641 | 0.0077 | 0.0333 | 0.5499 | 0.0333 |
| 1.3132 | 1.0995 | 11200 | 1.2765 | 0.0077 | 0.0209 | 0.5053 | 0.0209 |
| 1.2289 | 1.1094 | 11300 | 1.2151 | 0.0077 | 0.0266 | 0.5289 | 0.0266 |
| 1.1027 | 1.1192 | 11400 | 1.0854 | 0.0077 | 0.0556 | 0.5808 | 0.0556 |
| 1.2052 | 1.1290 | 11500 | 1.1876 | 0.0077 | 0.0311 | 0.5399 | 0.0311 |
| 1.2153 | 1.1388 | 11600 | 1.2377 | 0.0077 | 0.0231 | 0.5189 | 0.0231 |
| 1.3525 | 1.1486 | 11700 | 1.2869 | 0.0077 | 0.0214 | 0.5000 | 0.0214 |
| 1.1622 | 1.1585 | 11800 | 1.1486 | 0.0077 | 0.0346 | 0.5545 | 0.0346 |
| 1.2546 | 1.1683 | 11900 | 1.2287 | 0.0077 | 0.0265 | 0.5235 | 0.0265 |
| 1.3096 | 1.1781 | 12000 | 1.2968 | 0.0077 | 0.0168 | 0.4910 | 0.0168 |
| 1.2852 | 1.1879 | 12100 | 1.3003 | 0.0077 | 0.0184 | 0.4944 | 0.0184 |
| 1.2349 | 1.1977 | 12200 | 1.3078 | 0.0077 | 0.0184 | 0.4898 | 0.0184 |
| 1.2638 | 1.2075 | 12300 | 1.3049 | 0.0077 | 0.0163 | 0.4904 | 0.0163 |
| 1.3416 | 1.2174 | 12400 | 1.3620 | 0.0077 | 0.0124 | 0.4631 | 0.0124 |
| 1.3584 | 1.2272 | 12500 | 1.3432 | 0.0077 | 0.0135 | 0.4689 | 0.0135 |
| 1.2882 | 1.2370 | 12600 | 1.3707 | 0.0077 | 0.0099 | 0.4649 | 0.0099 |
| 1.3471 | 1.2468 | 12700 | 1.3535 | 0.0077 | 0.0110 | 0.4689 | 0.0110 |
| 1.2909 | 1.2566 | 12800 | 1.2995 | 0.0077 | 0.0171 | 0.4917 | 0.0171 |
| 1.2564 | 1.2664 | 12900 | 1.2927 | 0.0077 | 0.0177 | 0.4866 | 0.0177 |
| 1.3576 | 1.2763 | 13000 | 1.2970 | 0.0077 | 0.0174 | 0.4929 | 0.0174 |
| 1.3452 | 1.2861 | 13100 | 1.3345 | 0.0077 | 0.0136 | 0.4770 | 0.0136 |
| 1.3693 | 1.2959 | 13200 | 1.3076 | 0.0077 | 0.0157 | 0.4826 | 0.0157 |
| 1.2878 | 1.3057 | 13300 | 1.2919 | 0.0077 | 0.0135 | 0.4880 | 0.0135 |
| 1.3614 | 1.3155 | 13400 | 1.3213 | 0.0077 | 0.0117 | 0.4748 | 0.0117 |
| 1.3648 | 1.3253 | 13500 | 1.3867 | 0.0077 | 0.0099 | 0.4554 | 0.0099 |
| 1.4133 | 1.3352 | 13600 | 1.3886 | 0.0077 | 0.0112 | 0.4508 | 0.0112 |
| 1.3351 | 1.3450 | 13700 | 1.3368 | 0.0077 | 0.0099 | 0.4746 | 0.0099 |
| 1.3443 | 1.3548 | 13800 | 1.3662 | 0.0077 | 0.0119 | 0.4586 | 0.0119 |
| 1.3227 | 1.3646 | 13900 | 1.3512 | 0.0077 | 0.0079 | 0.4650 | 0.0079 |
| 1.3895 | 1.3744 | 14000 | 1.3767 | 0.0077 | 0.0086 | 0.4603 | 0.0086 |
| 1.4071 | 1.3843 | 14100 | 1.4137 | 0.0077 | 0.0081 | 0.4395 | 0.0081 |
| 1.4262 | 1.3941 | 14200 | 1.3941 | 0.0077 | 0.0057 | 0.4481 | 0.0057 |
| 1.3962 | 1.4039 | 14300 | 1.4028 | 0.0077 | 0.0063 | 0.4502 | 0.0063 |
| 1.4506 | 1.4137 | 14400 | 1.4613 | 0.0077 | 0.0057 | 0.4240 | 0.0057 |
| 1.4948 | 1.4235 | 14500 | 1.4799 | 0.0077 | 0.0096 | 0.4163 | 0.0096 |
| 1.5515 | 1.4333 | 14600 | 1.5476 | 0.0077 | 0.0069 | 0.3877 | 0.0069 |
| 1.5008 | 1.4432 | 14700 | 1.5322 | 0.0077 | 0.0003 | 0.3938 | 0.0003 |
| 1.5891 | 1.4530 | 14800 | 1.5518 | 0.0077 | 0.0019 | 0.3739 | 0.0019 |
| 1.6054 | 1.4628 | 14900 | 1.5898 | 0.0077 | 0.0123 | 0.3797 | 0.0123 |
| 1.5161 | 1.4726 | 15000 | 1.5112 | 0.0077 | 0.0001 | 0.3993 | 0.0001 |
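The headline metrics at the top of this card match the step-4900 row (validation loss 0.8409, move accuracy 0.1373) rather than the final step-15000 row, which suggests the checkpoint with the best move accuracy was kept rather than the last one. A minimal sketch of that selection, over a few rows copied from the table:

```python
# (step, validation loss, move accuracy) rows taken from the table above.
rows = [
    (4500, 0.8356, 0.1330),
    (4900, 0.8409, 0.1373),
    (5000, 0.8305, 0.1356),
    (15000, 1.5112, 0.0001),
]

# Pick the checkpoint that maximizes move accuracy.
best = max(rows, key=lambda r: r[2])
print(best)  # (4900, 0.8409, 0.1373)
```

Note that training continued well past the best checkpoint: both loss and move accuracy degrade steadily after roughly step 5000.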
### Framework versions

- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
### Model tree

- Model: donoway/ij6lz8iv_20250703_005007
- Base model: meta-llama/Llama-3.2-1B