# sp8c5ube_20250704_054132
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.3226
- Model Preparation Time: 0.0073
- Move Accuracy: 0.0224
- Token Accuracy: 0.4882
- Accuracy: 0.0224
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 128
- eval_batch_size: 256
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
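For reference, the hyperparameters above can be collected into a single config. This is a minimal sketch using a plain dict whose keys mirror 🤗 Transformers `TrainingArguments` field names; the values are taken verbatim from the list above, and nothing here is an official training script from this repo:

```python
# Hyperparameters from the card, collected into one config dict.
# Keys mirror Transformers TrainingArguments field names; values
# come from the hyperparameter list above.
training_config = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 128,
    "per_device_eval_batch_size": 256,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,               # betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant_with_warmup",
    "warmup_ratio": 0.001,
    "num_train_epochs": 100,
}

print(training_config["lr_scheduler_type"])
```

Passing these as keyword arguments to `TrainingArguments` should reproduce the configuration above.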
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Move Accuracy | Token Accuracy | Accuracy |
|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.9474 | 0.0073 | 0.0 | 0.0000 | 0.0 |
| 1.9312 | 0.0098 | 100 | 1.9194 | 0.0073 | 0.0 | 0.3009 | 0.0 |
| 1.6203 | 0.0196 | 200 | 1.5682 | 0.0073 | 0.0048 | 0.4038 | 0.0048 |
| 1.4105 | 0.0295 | 300 | 1.4473 | 0.0073 | 0.0072 | 0.4483 | 0.0072 |
| 1.3616 | 0.0393 | 400 | 1.4166 | 0.0073 | 0.0083 | 0.4623 | 0.0083 |
| 1.3804 | 0.0491 | 500 | 1.4147 | 0.0073 | 0.0113 | 0.4574 | 0.0113 |
| 1.3128 | 0.0589 | 600 | 1.3607 | 0.0073 | 0.0170 | 0.4852 | 0.0170 |
| 1.3059 | 0.0687 | 700 | 1.3382 | 0.0073 | 0.0223 | 0.4847 | 0.0223 |
| 1.3691 | 0.0785 | 800 | 1.3226 | 0.0073 | 0.0224 | 0.4882 | 0.0224 |
| 1.3448 | 0.0884 | 900 | 1.3287 | 0.0073 | 0.0217 | 0.4871 | 0.0217 |
| 1.3536 | 0.0982 | 1000 | 1.3308 | 0.0073 | 0.0209 | 0.4902 | 0.0209 |
| 1.3373 | 0.1080 | 1100 | 1.3697 | 0.0073 | 0.0120 | 0.4641 | 0.0120 |
| 1.3782 | 0.1178 | 1200 | 1.4015 | 0.0073 | 0.0093 | 0.4481 | 0.0093 |
| 1.4286 | 0.1276 | 1300 | 1.4546 | 0.0073 | 0.0062 | 0.4216 | 0.0062 |
| 1.4455 | 0.1374 | 1400 | 1.4331 | 0.0073 | 0.0101 | 0.4413 | 0.0101 |
| 1.4574 | 0.1473 | 1500 | 1.4610 | 0.0073 | 0.0021 | 0.4296 | 0.0021 |
| 1.4723 | 0.1571 | 1600 | 1.4598 | 0.0073 | 0.0093 | 0.4294 | 0.0093 |
| 1.4745 | 0.1669 | 1700 | 1.4699 | 0.0073 | 0.0011 | 0.4237 | 0.0011 |
| 1.4965 | 0.1767 | 1800 | 1.4578 | 0.0073 | 0.0008 | 0.4317 | 0.0008 |
| 1.4686 | 0.1865 | 1900 | 1.4662 | 0.0073 | 0.0043 | 0.4295 | 0.0043 |
| 1.496 | 0.1963 | 2000 | 1.5088 | 0.0073 | 0.0019 | 0.4005 | 0.0019 |
| 1.4544 | 0.2062 | 2100 | 1.4992 | 0.0073 | 0.0013 | 0.4283 | 0.0013 |
| 1.6416 | 0.2160 | 2200 | 1.7915 | 0.0073 | 0.0006 | 0.4122 | 0.0006 |
| 1.4699 | 0.2258 | 2300 | 1.4794 | 0.0073 | 0.0015 | 0.4161 | 0.0015 |
| 1.4593 | 0.2356 | 2400 | 1.5046 | 0.0073 | 0.0006 | 0.4090 | 0.0006 |
| 1.5657 | 0.2454 | 2500 | 1.6012 | 0.0073 | 0.0020 | 0.3708 | 0.0020 |
| 1.5002 | 0.2553 | 2600 | 1.5058 | 0.0073 | 0.0023 | 0.3969 | 0.0023 |
| 1.5475 | 0.2651 | 2700 | 1.4845 | 0.0073 | 0.0014 | 0.4130 | 0.0014 |
| 1.4719 | 0.2749 | 2800 | 1.4968 | 0.0073 | 0.0005 | 0.4151 | 0.0005 |
| 1.5452 | 0.2847 | 2900 | 1.5158 | 0.0073 | 0.0031 | 0.4081 | 0.0031 |
| 1.5271 | 0.2945 | 3000 | 1.5216 | 0.0073 | 0.0039 | 0.3983 | 0.0039 |
| 1.5239 | 0.3043 | 3100 | 1.5196 | 0.0073 | 0.0050 | 0.3953 | 0.0050 |
| 1.5689 | 0.3142 | 3200 | 1.5587 | 0.0073 | 0.0010 | 0.3756 | 0.0010 |
| 1.5987 | 0.3240 | 3300 | 1.5907 | 0.0073 | 0.0 | 0.3825 | 0.0 |
| 1.5561 | 0.3338 | 3400 | 1.5664 | 0.0073 | 0.0011 | 0.3881 | 0.0011 |
| 1.7552 | 0.3436 | 3500 | 1.6471 | 0.0073 | 0.0014 | 0.3727 | 0.0014 |
| 1.6028 | 0.3534 | 3600 | 1.5572 | 0.0073 | 0.0001 | 0.3839 | 0.0001 |
| 1.6229 | 0.3632 | 3700 | 1.6017 | 0.0073 | 0.0008 | 0.3654 | 0.0008 |
| 1.6584 | 0.3731 | 3800 | 1.6346 | 0.0073 | 0.0001 | 0.3430 | 0.0001 |
| 1.6547 | 0.3829 | 3900 | 1.6626 | 0.0073 | 0.0006 | 0.3339 | 0.0006 |
| 1.6585 | 0.3927 | 4000 | 1.6531 | 0.0073 | 0.0019 | 0.3477 | 0.0019 |
| 1.6937 | 0.4025 | 4100 | 1.7651 | 0.0073 | 0.0004 | 0.3169 | 0.0004 |
| 1.9022 | 0.4123 | 4200 | 1.8046 | 0.0073 | 0.0026 | 0.3190 | 0.0026 |
| 1.7366 | 0.4221 | 4300 | 1.6926 | 0.0073 | 0.0001 | 0.3200 | 0.0001 |
| 1.7184 | 0.4320 | 4400 | 1.6880 | 0.0073 | 0.0001 | 0.3283 | 0.0001 |
| 2.8305 | 0.4418 | 4500 | 2.6708 | 0.0073 | 0.0 | 0.2852 | 0.0 |
| 1.7172 | 0.4516 | 4600 | 1.7272 | 0.0073 | 0.0 | 0.3162 | 0.0 |
| 1.745 | 0.4614 | 4700 | 1.7135 | 0.0073 | 0.0001 | 0.3090 | 0.0001 |
| 1.6741 | 0.4712 | 4800 | 1.6860 | 0.0073 | 0.0006 | 0.3174 | 0.0006 |
| 1.993 | 0.4811 | 4900 | 1.9271 | 0.0073 | 0.0 | 0.3131 | 0.0 |
| 1.9858 | 0.4909 | 5000 | 2.0271 | 0.0073 | 0.0 | 0.2912 | 0.0 |
| 1.8205 | 0.5007 | 5100 | 1.8124 | 0.0073 | 0.0013 | 0.3181 | 0.0013 |
| 1.8928 | 0.5105 | 5200 | 1.9005 | 0.0073 | 0.0 | 0.2761 | 0.0 |
| 2.4873 | 0.5203 | 5300 | 2.5025 | 0.0073 | 0.0001 | 0.2905 | 0.0001 |
| 2.4698 | 0.5301 | 5400 | 2.4292 | 0.0073 | 0.0001 | 0.2387 | 0.0001 |
| 3.4609 | 0.5400 | 5500 | 3.4086 | 0.0073 | 0.0 | 0.1785 | 0.0 |
| 2.8612 | 0.5498 | 5600 | 2.8092 | 0.0073 | 0.0 | 0.2403 | 0.0 |
| 3.0534 | 0.5596 | 5700 | 2.9958 | 0.0073 | 0.0 | 0.2257 | 0.0 |
| 2.8117 | 0.5694 | 5800 | 2.7432 | 0.0073 | 0.0 | 0.2464 | 0.0 |
| 2.8308 | 0.5792 | 5900 | 2.8334 | 0.0073 | 0.0 | 0.2283 | 0.0 |
| 2.8562 | 0.5890 | 6000 | 2.8715 | 0.0073 | 0.0 | 0.2294 | 0.0 |
| 2.9246 | 0.5989 | 6100 | 2.9644 | 0.0073 | 0.0 | 0.2283 | 0.0 |
| 2.9032 | 0.6087 | 6200 | 2.8049 | 0.0073 | 0.0 | 0.2251 | 0.0 |
| 2.5625 | 0.6185 | 6300 | 2.5415 | 0.0073 | 0.0 | 0.2625 | 0.0 |
| 2.8001 | 0.6283 | 6400 | 2.8893 | 0.0073 | 0.0 | 0.2089 | 0.0 |
| 2.1103 | 0.6381 | 6500 | 2.1438 | 0.0073 | 0.0 | 0.2844 | 0.0 |
| 2.0969 | 0.6479 | 6600 | 2.1154 | 0.0073 | 0.0 | 0.2742 | 0.0 |
| 2.0586 | 0.6578 | 6700 | 2.0911 | 0.0073 | 0.0003 | 0.2778 | 0.0003 |
| 2.7066 | 0.6676 | 6800 | 2.6681 | 0.0073 | 0.0002 | 0.1950 | 0.0002 |
| 2.4837 | 0.6774 | 6900 | 2.4185 | 0.0073 | 0.0 | 0.2413 | 0.0 |
| 2.4207 | 0.6872 | 7000 | 2.4044 | 0.0073 | 0.0 | 0.2378 | 0.0 |
| 3.8965 | 0.6970 | 7100 | 3.7736 | 0.0073 | 0.0 | 0.1145 | 0.0 |
| 2.6576 | 0.7069 | 7200 | 2.6991 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 2.5831 | 0.7167 | 7300 | 2.5307 | 0.0073 | 0.0 | 0.2226 | 0.0 |
| 2.674 | 0.7265 | 7400 | 2.6872 | 0.0073 | 0.0 | 0.2288 | 0.0 |
| 2.5252 | 0.7363 | 7500 | 2.5391 | 0.0073 | 0.0 | 0.2428 | 0.0 |
| 3.0841 | 0.7461 | 7600 | 3.0805 | 0.0073 | 0.0 | 0.2049 | 0.0 |
| 2.7263 | 0.7559 | 7700 | 2.6443 | 0.0073 | 0.0 | 0.2250 | 0.0 |
| 3.0839 | 0.7658 | 7800 | 3.0037 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 2.5636 | 0.7756 | 7900 | 2.6383 | 0.0073 | 0.0 | 0.2271 | 0.0 |
| 2.8372 | 0.7854 | 8000 | 2.8946 | 0.0073 | 0.0 | 0.2096 | 0.0 |
| 3.1089 | 0.7952 | 8100 | 3.0479 | 0.0073 | 0.0 | 0.1994 | 0.0 |
| 3.1362 | 0.8050 | 8200 | 3.1365 | 0.0073 | 0.0 | 0.0684 | 0.0 |
| 3.4041 | 0.8148 | 8300 | 3.3619 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.0832 | 0.8247 | 8400 | 3.1708 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.1129 | 0.8345 | 8500 | 3.1225 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.1416 | 0.8443 | 8600 | 3.1455 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.1415 | 0.8541 | 8700 | 3.1349 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.0733 | 0.8639 | 8800 | 3.1355 | 0.0073 | 0.0 | 0.1979 | 0.0 |
| 3.0882 | 0.8737 | 8900 | 3.0955 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.0742 | 0.8836 | 9000 | 3.1122 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.0877 | 0.8934 | 9100 | 3.0896 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.0397 | 0.9032 | 9200 | 3.0190 | 0.0073 | 0.0 | 0.2230 | 0.0 |
| 2.8139 | 0.9130 | 9300 | 2.8608 | 0.0073 | 0.0 | 0.2146 | 0.0 |
| 2.7204 | 0.9228 | 9400 | 2.7613 | 0.0073 | 0.0 | 0.2349 | 0.0 |
| 2.6826 | 0.9327 | 9500 | 2.6460 | 0.0073 | 0.0 | 0.2358 | 0.0 |
| 2.4663 | 0.9425 | 9600 | 2.5760 | 0.0073 | 0.0 | 0.2299 | 0.0 |
| 2.7077 | 0.9523 | 9700 | 2.8069 | 0.0073 | 0.0 | 0.2280 | 0.0 |
| 3.0325 | 0.9621 | 9800 | 3.0167 | 0.0073 | 0.0 | 0.2271 | 0.0 |
| 2.9419 | 0.9719 | 9900 | 2.9357 | 0.0073 | 0.0 | 0.2312 | 0.0 |
| 3.1628 | 0.9817 | 10000 | 3.0980 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.1252 | 0.9916 | 10100 | 3.0875 | 0.0073 | 0.0 | 0.1996 | 0.0 |
| 3.0366 | 1.0014 | 10200 | 3.0355 | 0.0073 | 0.0 | 0.2146 | 0.0 |
| 2.9611 | 1.0112 | 10300 | 3.0042 | 0.0073 | 0.0 | 0.2192 | 0.0 |
| 2.8488 | 1.0210 | 10400 | 2.8309 | 0.0073 | 0.0 | 0.2314 | 0.0 |
| 2.8259 | 1.0308 | 10500 | 2.8546 | 0.0073 | 0.0 | 0.2315 | 0.0 |
| 3.8309 | 1.0406 | 10600 | 3.7479 | 0.0073 | 0.0 | 0.0930 | 0.0 |
| 2.9807 | 1.0505 | 10700 | 3.0202 | 0.0073 | 0.0 | 0.2312 | 0.0 |
| 3.1027 | 1.0603 | 10800 | 3.0921 | 0.0073 | 0.0 | 0.2005 | 0.0 |
| 3.0084 | 1.0701 | 10900 | 3.0183 | 0.0073 | 0.0 | 0.1996 | 0.0 |
### Framework versions
- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
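Since PEFT appears in the framework list, this checkpoint is presumably a PEFT adapter rather than full model weights. A minimal, untested loading sketch, assuming the adapter is published under this card's repo id and that you have access to the gated base model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.2-1B"              # base model (gated; requires access)
adapter_id = "donoway/sp8c5ube_20250704_054132"  # this repo (assumed to be a PEFT adapter)

# Load the base model, then attach the fine-tuned adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```

`model.merge_and_unload()` can then fold the adapter weights into the base model if a standalone checkpoint is needed.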