# 9wfyra0b_20250704_062257
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.2573
- Model Preparation Time: 0.0074
- Move Accuracy: 0.0358
- Token Accuracy: 0.5162
- Accuracy: 0.0358
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 128
- eval_batch_size: 256
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
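The list above can be restated as a `transformers.TrainingArguments`-style configuration. This is a hedged reconstruction for illustration only: the actual training script is not part of this card, and the keyword names (including whether the batch sizes are per-device) follow the Hugging Face `TrainingArguments` API as an assumption.

```python
# Hypothetical mapping of the hyperparameters above onto
# transformers.TrainingArguments keyword names; the real training
# script is not included in this card.
training_args = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 128,   # card says train_batch_size: 128
    "per_device_eval_batch_size": 256,    # card says eval_batch_size: 256
    "seed": 42,
    "optim": "adamw_torch",               # AdamW (torch backend)
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant_with_warmup",
    "warmup_ratio": 0.001,
    "num_train_epochs": 100,
}
```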
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Move Accuracy | Token Accuracy | Accuracy |
|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.9474 | 0.0074 | 0.0 | 0.0000 | 0.0 |
| 2.0741 | 0.0098 | 100 | 2.0669 | 0.0074 | 0.0 | 0.2899 | 0.0 |
| 1.6221 | 0.0196 | 200 | 1.5850 | 0.0074 | 0.0014 | 0.3916 | 0.0014 |
| 1.4171 | 0.0295 | 300 | 1.4636 | 0.0074 | 0.0101 | 0.4386 | 0.0101 |
| 1.4071 | 0.0393 | 400 | 1.4423 | 0.0074 | 0.0059 | 0.4491 | 0.0059 |
| 1.4032 | 0.0491 | 500 | 1.4473 | 0.0074 | 0.0057 | 0.4442 | 0.0057 |
| 1.3363 | 0.0589 | 600 | 1.3770 | 0.0074 | 0.0143 | 0.4723 | 0.0143 |
| 1.3311 | 0.0687 | 700 | 1.3505 | 0.0074 | 0.0175 | 0.4790 | 0.0175 |
| 1.3916 | 0.0785 | 800 | 1.3336 | 0.0074 | 0.0206 | 0.4871 | 0.0206 |
| 1.3491 | 0.0884 | 900 | 1.3084 | 0.0074 | 0.0303 | 0.4980 | 0.0303 |
| 1.3256 | 0.0982 | 1000 | 1.2832 | 0.0074 | 0.0325 | 0.5110 | 0.0325 |
| 1.2446 | 0.1080 | 1100 | 1.2564 | 0.0074 | 0.0292 | 0.5139 | 0.0292 |
| 1.2226 | 0.1178 | 1200 | 1.2568 | 0.0074 | 0.0327 | 0.5144 | 0.0327 |
| 1.2696 | 0.1276 | 1300 | 1.2573 | 0.0074 | 0.0358 | 0.5162 | 0.0358 |
| 1.2862 | 0.1374 | 1400 | 1.2470 | 0.0074 | 0.0358 | 0.5224 | 0.0358 |
| 1.2048 | 0.1473 | 1500 | 1.2561 | 0.0074 | 0.0328 | 0.5126 | 0.0328 |
| 1.2501 | 0.1571 | 1600 | 1.2335 | 0.0074 | 0.0293 | 0.5241 | 0.0293 |
| 1.2162 | 0.1669 | 1700 | 1.2556 | 0.0074 | 0.0213 | 0.5096 | 0.0213 |
| 1.3023 | 0.1767 | 1800 | 1.2740 | 0.0074 | 0.0266 | 0.5055 | 0.0266 |
| 1.2681 | 0.1865 | 1900 | 1.2767 | 0.0074 | 0.0212 | 0.4993 | 0.0212 |
| 1.2607 | 0.1963 | 2000 | 1.3250 | 0.0074 | 0.0136 | 0.4774 | 0.0136 |
| 1.3486 | 0.2062 | 2100 | 1.3507 | 0.0074 | 0.0115 | 0.4715 | 0.0115 |
| 1.2942 | 0.2160 | 2200 | 1.3128 | 0.0074 | 0.0182 | 0.4874 | 0.0182 |
| 1.3231 | 0.2258 | 2300 | 1.3370 | 0.0074 | 0.0130 | 0.4721 | 0.0130 |
| 1.2832 | 0.2356 | 2400 | 1.3434 | 0.0074 | 0.0126 | 0.4728 | 0.0126 |
| 1.42 | 0.2454 | 2500 | 1.3990 | 0.0074 | 0.0103 | 0.4475 | 0.0103 |
| 1.3296 | 0.2553 | 2600 | 1.3522 | 0.0074 | 0.0106 | 0.4678 | 0.0106 |
| 1.4406 | 0.2651 | 2700 | 1.3627 | 0.0074 | 0.0114 | 0.4635 | 0.0114 |
| 1.3454 | 0.2749 | 2800 | 1.3653 | 0.0074 | 0.0123 | 0.4640 | 0.0123 |
| 1.3896 | 0.2847 | 2900 | 1.4172 | 0.0074 | 0.0057 | 0.4401 | 0.0057 |
| 1.3921 | 0.2945 | 3000 | 1.4247 | 0.0074 | 0.0061 | 0.4393 | 0.0061 |
| 1.4116 | 0.3043 | 3100 | 1.4440 | 0.0074 | 0.0098 | 0.4359 | 0.0098 |
| 1.3953 | 0.3142 | 3200 | 1.4025 | 0.0074 | 0.0075 | 0.4503 | 0.0075 |
| 1.4678 | 0.3240 | 3300 | 1.4430 | 0.0074 | 0.0060 | 0.4324 | 0.0060 |
| 1.4274 | 0.3338 | 3400 | 1.4276 | 0.0074 | 0.0055 | 0.4334 | 0.0055 |
| 1.3542 | 0.3436 | 3500 | 1.4004 | 0.0074 | 0.0068 | 0.4539 | 0.0068 |
| 1.4465 | 0.3534 | 3600 | 1.3942 | 0.0074 | 0.0110 | 0.4476 | 0.0110 |
| 1.6036 | 0.3632 | 3700 | 1.5489 | 0.0074 | 0.0041 | 0.4132 | 0.0041 |
| 1.5619 | 0.3731 | 3800 | 1.4989 | 0.0074 | 0.0032 | 0.4145 | 0.0032 |
| 1.4819 | 0.3829 | 3900 | 1.4754 | 0.0074 | 0.0023 | 0.4211 | 0.0023 |
| 1.5182 | 0.3927 | 4000 | 1.5109 | 0.0074 | 0.0037 | 0.4096 | 0.0037 |
| 1.4923 | 0.4025 | 4100 | 1.4620 | 0.0074 | 0.0010 | 0.4313 | 0.0010 |
| 1.4593 | 0.4123 | 4200 | 1.4598 | 0.0074 | 0.0021 | 0.4271 | 0.0021 |
| 1.4272 | 0.4221 | 4300 | 1.4452 | 0.0074 | 0.0075 | 0.4365 | 0.0075 |
| 1.4757 | 0.4320 | 4400 | 1.4999 | 0.0074 | 0.0026 | 0.3937 | 0.0026 |
| 1.3887 | 0.4418 | 4500 | 1.4396 | 0.0074 | 0.0056 | 0.4328 | 0.0056 |
| 1.4579 | 0.4516 | 4600 | 1.4978 | 0.0074 | 0.0054 | 0.4207 | 0.0054 |
| 1.5531 | 0.4614 | 4700 | 1.4748 | 0.0074 | 0.0024 | 0.4111 | 0.0024 |
| 1.6443 | 0.4712 | 4800 | 1.6405 | 0.0074 | 0.0003 | 0.3757 | 0.0003 |
| 1.4766 | 0.4811 | 4900 | 1.4867 | 0.0074 | 0.0014 | 0.4219 | 0.0014 |
| 1.4943 | 0.4909 | 5000 | 1.5025 | 0.0074 | 0.0 | 0.4213 | 0.0 |
| 1.5107 | 0.5007 | 5100 | 1.4837 | 0.0074 | 0.0015 | 0.4242 | 0.0015 |
| 1.5203 | 0.5105 | 5200 | 1.5355 | 0.0074 | 0.0026 | 0.4036 | 0.0026 |
| 1.517 | 0.5203 | 5300 | 1.4971 | 0.0074 | 0.0177 | 0.4137 | 0.0177 |
| 1.524 | 0.5301 | 5400 | 1.5188 | 0.0074 | 0.0087 | 0.4142 | 0.0087 |
| 1.4964 | 0.5400 | 5500 | 1.5081 | 0.0074 | 0.0012 | 0.3999 | 0.0012 |
| 1.4914 | 0.5498 | 5600 | 1.5011 | 0.0074 | 0.0003 | 0.4185 | 0.0003 |
| 1.5336 | 0.5596 | 5700 | 1.4869 | 0.0074 | 0.0034 | 0.4107 | 0.0034 |
| 1.5501 | 0.5694 | 5800 | 1.5910 | 0.0074 | 0.0001 | 0.3796 | 0.0001 |
| 1.5159 | 0.5792 | 5900 | 1.4990 | 0.0074 | 0.0093 | 0.4158 | 0.0093 |
| 1.4398 | 0.5890 | 6000 | 1.4857 | 0.0074 | 0.0001 | 0.4236 | 0.0001 |
| 1.4165 | 0.5989 | 6100 | 1.4875 | 0.0074 | 0.0036 | 0.4107 | 0.0036 |
| 1.6139 | 0.6087 | 6200 | 1.5330 | 0.0074 | 0.0155 | 0.4105 | 0.0155 |
| 1.5475 | 0.6185 | 6300 | 1.5142 | 0.0074 | 0.0004 | 0.3904 | 0.0004 |
| 1.5182 | 0.6283 | 6400 | 1.5312 | 0.0074 | 0.0037 | 0.3958 | 0.0037 |
| 1.5258 | 0.6381 | 6500 | 1.5266 | 0.0074 | 0.0028 | 0.4018 | 0.0028 |
| 1.5594 | 0.6479 | 6600 | 1.5551 | 0.0074 | 0.0010 | 0.3890 | 0.0010 |
| 1.5129 | 0.6578 | 6700 | 1.5294 | 0.0074 | 0.0003 | 0.3957 | 0.0003 |
| 1.4895 | 0.6676 | 6800 | 1.5249 | 0.0074 | 0.0096 | 0.3992 | 0.0096 |
| 1.6066 | 0.6774 | 6900 | 1.6091 | 0.0074 | 0.0018 | 0.3763 | 0.0018 |
| 1.5437 | 0.6872 | 7000 | 1.5364 | 0.0074 | 0.0002 | 0.3898 | 0.0002 |
| 1.5101 | 0.6970 | 7100 | 1.5487 | 0.0074 | 0.0023 | 0.3901 | 0.0023 |
| 1.4922 | 0.7069 | 7200 | 1.4956 | 0.0074 | 0.0010 | 0.4125 | 0.0010 |
| 1.5337 | 0.7167 | 7300 | 1.5299 | 0.0074 | 0.0010 | 0.4030 | 0.0010 |
| 1.6514 | 0.7265 | 7400 | 1.6805 | 0.0074 | 0.0001 | 0.3269 | 0.0001 |
| 1.6732 | 0.7363 | 7500 | 1.6633 | 0.0074 | 0.0019 | 0.3275 | 0.0019 |
| 1.6529 | 0.7461 | 7600 | 1.6324 | 0.0074 | 0.0006 | 0.3388 | 0.0006 |
| 1.6916 | 0.7559 | 7700 | 1.6727 | 0.0074 | 0.0 | 0.3157 | 0.0 |
| 1.6458 | 0.7658 | 7800 | 1.6502 | 0.0074 | 0.0006 | 0.3392 | 0.0006 |
| 1.6089 | 0.7756 | 7900 | 1.6667 | 0.0074 | 0.0 | 0.3391 | 0.0 |
| 1.6123 | 0.7854 | 8000 | 1.6255 | 0.0074 | 0.0001 | 0.3570 | 0.0001 |
| 1.5791 | 0.7952 | 8100 | 1.5495 | 0.0074 | 0.0003 | 0.3894 | 0.0003 |
| 1.6209 | 0.8050 | 8200 | 1.6087 | 0.0074 | 0.0003 | 0.3648 | 0.0003 |
| 1.6183 | 0.8148 | 8300 | 1.6222 | 0.0074 | 0.0 | 0.3587 | 0.0 |
| 1.6458 | 0.8247 | 8400 | 1.6654 | 0.0074 | 0.0010 | 0.3210 | 0.0010 |
| 1.6275 | 0.8345 | 8500 | 1.6586 | 0.0074 | 0.0020 | 0.3581 | 0.0020 |
| 1.5902 | 0.8443 | 8600 | 1.6505 | 0.0074 | 0.0014 | 0.3648 | 0.0014 |
| 1.6093 | 0.8541 | 8700 | 1.6349 | 0.0074 | 0.0001 | 0.3440 | 0.0001 |
| 1.6329 | 0.8639 | 8800 | 1.7043 | 0.0074 | 0.0017 | 0.3251 | 0.0017 |
| 1.614 | 0.8737 | 8900 | 1.6203 | 0.0074 | 0.0001 | 0.3590 | 0.0001 |
| 1.6498 | 0.8836 | 9000 | 1.6505 | 0.0074 | 0.0003 | 0.3463 | 0.0003 |
| 1.6358 | 0.8934 | 9100 | 1.6291 | 0.0074 | 0.0014 | 0.3488 | 0.0014 |
| 1.6299 | 0.9032 | 9200 | 1.6218 | 0.0074 | 0.0023 | 0.3538 | 0.0023 |
| 1.6681 | 0.9130 | 9300 | 1.6483 | 0.0074 | 0.0004 | 0.3509 | 0.0004 |
| 1.774 | 0.9228 | 9400 | 1.6950 | 0.0074 | 0.0009 | 0.3337 | 0.0009 |
| 1.7024 | 0.9327 | 9500 | 1.6986 | 0.0074 | 0.0045 | 0.3451 | 0.0045 |
| 1.8358 | 0.9425 | 9600 | 1.7668 | 0.0074 | 0.0009 | 0.3246 | 0.0009 |
| 1.7094 | 0.9523 | 9700 | 1.7084 | 0.0074 | 0.0004 | 0.3150 | 0.0004 |
| 3.3185 | 0.9621 | 9800 | 3.1530 | 0.0074 | 0.0 | 0.1271 | 0.0 |
| 3.0506 | 0.9719 | 9900 | 2.8487 | 0.0074 | 0.0 | 0.2546 | 0.0 |
| 2.3583 | 0.9817 | 10000 | 2.3895 | 0.0074 | 0.0 | 0.2746 | 0.0 |
| 2.0678 | 0.9916 | 10100 | 2.0981 | 0.0074 | 0.0001 | 0.2902 | 0.0001 |
| 2.4401 | 1.0014 | 10200 | 2.4153 | 0.0074 | 0.0 | 0.2683 | 0.0 |
| 2.2007 | 1.0112 | 10300 | 2.2611 | 0.0074 | 0.0 | 0.2723 | 0.0 |
| 2.1573 | 1.0210 | 10400 | 2.1095 | 0.0074 | 0.0 | 0.2803 | 0.0 |
| 1.9783 | 1.0308 | 10500 | 1.9434 | 0.0074 | 0.0003 | 0.3005 | 0.0003 |
| 1.945 | 1.0406 | 10600 | 1.8734 | 0.0074 | 0.0 | 0.2991 | 0.0 |
| 1.9442 | 1.0505 | 10700 | 1.8755 | 0.0074 | 0.0008 | 0.2997 | 0.0008 |
| 2.2458 | 1.0603 | 10800 | 2.2427 | 0.0074 | 0.0001 | 0.2633 | 0.0001 |
| 1.8778 | 1.0701 | 10900 | 1.8802 | 0.0074 | 0.0002 | 0.2971 | 0.0002 |
| 3.2181 | 1.0799 | 11000 | 3.2509 | 0.0074 | 0.0 | 0.2583 | 0.0 |
| 1.7926 | 1.0897 | 11100 | 1.7533 | 0.0074 | 0.0006 | 0.3177 | 0.0006 |
| 1.8302 | 1.0995 | 11200 | 1.8581 | 0.0074 | 0.0027 | 0.3114 | 0.0027 |
| 1.7122 | 1.1094 | 11300 | 1.7173 | 0.0074 | 0.0006 | 0.3231 | 0.0006 |
| 1.7273 | 1.1192 | 11400 | 1.7829 | 0.0074 | 0.0 | 0.3166 | 0.0 |
### Framework versions
- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
## Model tree for donoway/9wfyra0b_20250704_062257

- Base model: meta-llama/Llama-3.2-1B
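Since the framework versions list PEFT, this checkpoint is presumably a PEFT adapter trained on top of the base model rather than a full set of weights. A minimal loading sketch, assuming the standard `peft`/`transformers` APIs and access to the gated Llama weights; the repository ids are the only details taken from this card.

```python
# Minimal sketch: load this checkpoint as a PEFT adapter on top of
# meta-llama/Llama-3.2-1B. Assumes `peft` and `transformers` are
# installed and that you have access to the gated base model.
def load_model(adapter_id: str = "donoway/9wfyra0b_20250704_062257",
               base_id: str = "meta-llama/Llama-3.2-1B"):
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

The function is defined but not called here, since downloading the weights requires accepting the Llama license on the Hub.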