# djhe6jbg_20250703_005116
This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.2143
- Model Preparation Time: 0.0074
- Move Accuracy: 0.0455
- Token Accuracy: 0.5364
- Accuracy: 0.0455
## Model description

More information needed.

## Intended uses & limitations

More information needed.

## Training and evaluation data

More information needed.

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 256
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
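The list above maps onto a `transformers.TrainingArguments` configuration roughly as follows. This is a sketch for orientation only: `output_dir` and the logging/eval cadence are placeholders, not values taken from this run.

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; output_dir is a placeholder.
args = TrainingArguments(
    output_dir="out",                # hypothetical, not from the run
    learning_rate=5e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=256,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```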
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Move Accuracy | Token Accuracy | Accuracy |
|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 11.9474 | 0.0074 | 0.0 | 0.0000 | 0.0 |
| 2.0157 | 0.0098 | 100 | 2.0089 | 0.0074 | 0.0 | 0.2901 | 0.0 |
| 1.6256 | 0.0196 | 200 | 1.5846 | 0.0074 | 0.0017 | 0.3931 | 0.0017 |
| 1.4207 | 0.0295 | 300 | 1.4706 | 0.0074 | 0.0092 | 0.4363 | 0.0092 |
| 1.398 | 0.0393 | 400 | 1.4319 | 0.0074 | 0.0064 | 0.4509 | 0.0064 |
| 1.3752 | 0.0491 | 500 | 1.4388 | 0.0074 | 0.0054 | 0.4465 | 0.0054 |
| 1.3227 | 0.0589 | 600 | 1.3719 | 0.0074 | 0.0145 | 0.4756 | 0.0145 |
| 1.3241 | 0.0687 | 700 | 1.3465 | 0.0074 | 0.0219 | 0.4867 | 0.0219 |
| 1.371 | 0.0785 | 800 | 1.2961 | 0.0074 | 0.0251 | 0.5039 | 0.0251 |
| 1.3004 | 0.0884 | 900 | 1.2782 | 0.0074 | 0.0362 | 0.5105 | 0.0362 |
| 1.2851 | 0.0982 | 1000 | 1.2555 | 0.0074 | 0.0383 | 0.5240 | 0.0383 |
| 1.2218 | 0.1080 | 1100 | 1.2423 | 0.0074 | 0.0360 | 0.5230 | 0.0360 |
| 1.1708 | 0.1178 | 1200 | 1.2143 | 0.0074 | 0.0455 | 0.5364 | 0.0455 |
| 1.2302 | 0.1276 | 1300 | 1.2261 | 0.0074 | 0.0391 | 0.5286 | 0.0391 |
| 1.2296 | 0.1374 | 1400 | 1.1958 | 0.0074 | 0.0454 | 0.5413 | 0.0454 |
| 1.1594 | 0.1473 | 1500 | 1.1922 | 0.0074 | 0.0434 | 0.5389 | 0.0434 |
| 1.1819 | 0.1571 | 1600 | 1.1748 | 0.0074 | 0.0440 | 0.5464 | 0.0440 |
| 1.1816 | 0.1669 | 1700 | 1.1907 | 0.0074 | 0.0385 | 0.5421 | 0.0385 |
| 1.2329 | 0.1767 | 1800 | 1.2008 | 0.0074 | 0.0365 | 0.5342 | 0.0365 |
| 1.1924 | 0.1865 | 1900 | 1.2365 | 0.0074 | 0.0376 | 0.5257 | 0.0376 |
| 1.2229 | 0.1963 | 2000 | 1.2549 | 0.0074 | 0.0289 | 0.5170 | 0.0289 |
| 1.2797 | 0.2062 | 2100 | 1.2585 | 0.0074 | 0.0243 | 0.5123 | 0.0243 |
| 1.2481 | 0.2160 | 2200 | 1.2757 | 0.0074 | 0.0237 | 0.5048 | 0.0237 |
| 1.2933 | 0.2258 | 2300 | 1.2790 | 0.0074 | 0.0220 | 0.5040 | 0.0220 |
| 1.2856 | 0.2356 | 2400 | 1.3083 | 0.0074 | 0.0211 | 0.4923 | 0.0211 |
| 1.2652 | 0.2454 | 2500 | 1.3136 | 0.0074 | 0.0171 | 0.4852 | 0.0171 |
| 1.3379 | 0.2553 | 2600 | 1.3522 | 0.0074 | 0.0168 | 0.4729 | 0.0168 |
| 1.3673 | 0.2651 | 2700 | 1.2990 | 0.0074 | 0.0188 | 0.4929 | 0.0188 |
| 1.3531 | 0.2749 | 2800 | 1.3397 | 0.0074 | 0.0138 | 0.4767 | 0.0138 |
| 1.3551 | 0.2847 | 2900 | 1.3503 | 0.0074 | 0.0157 | 0.4708 | 0.0157 |
| 1.3334 | 0.2945 | 3000 | 1.3547 | 0.0074 | 0.0157 | 0.4751 | 0.0157 |
| 1.3557 | 0.3043 | 3100 | 1.3492 | 0.0074 | 0.0124 | 0.4666 | 0.0124 |
| 1.3141 | 0.3142 | 3200 | 1.3621 | 0.0074 | 0.0094 | 0.4667 | 0.0094 |
| 1.37 | 0.3240 | 3300 | 1.3371 | 0.0074 | 0.0142 | 0.4767 | 0.0142 |
| 1.3872 | 0.3338 | 3400 | 1.3709 | 0.0074 | 0.0101 | 0.4658 | 0.0101 |
| 1.3363 | 0.3436 | 3500 | 1.3854 | 0.0074 | 0.0075 | 0.4634 | 0.0075 |
| 1.3863 | 0.3534 | 3600 | 1.3569 | 0.0074 | 0.0118 | 0.4718 | 0.0118 |
| 1.4216 | 0.3632 | 3700 | 1.4036 | 0.0074 | 0.0110 | 0.4548 | 0.0110 |
| 1.525 | 0.3731 | 3800 | 1.4169 | 0.0074 | 0.0055 | 0.4527 | 0.0055 |
| 1.3574 | 0.3829 | 3900 | 1.3701 | 0.0074 | 0.0080 | 0.4632 | 0.0080 |
| 1.4042 | 0.3927 | 4000 | 1.3877 | 0.0074 | 0.0089 | 0.4598 | 0.0089 |
| 1.3915 | 0.4025 | 4100 | 1.3768 | 0.0074 | 0.0090 | 0.4610 | 0.0090 |
| 1.3443 | 0.4123 | 4200 | 1.3714 | 0.0074 | 0.0103 | 0.4668 | 0.0103 |
| 1.3511 | 0.4221 | 4300 | 1.3641 | 0.0074 | 0.0098 | 0.4668 | 0.0098 |
| 1.4042 | 0.4320 | 4400 | 1.3922 | 0.0074 | 0.0104 | 0.4602 | 0.0104 |
| 1.3922 | 0.4418 | 4500 | 1.4300 | 0.0074 | 0.0060 | 0.4424 | 0.0060 |
| 1.3716 | 0.4516 | 4600 | 1.3906 | 0.0074 | 0.0076 | 0.4561 | 0.0076 |
| 1.4219 | 0.4614 | 4700 | 1.4061 | 0.0074 | 0.0074 | 0.4543 | 0.0074 |
| 1.3569 | 0.4712 | 4800 | 1.3691 | 0.0074 | 0.0104 | 0.4635 | 0.0104 |
| 1.3941 | 0.4811 | 4900 | 1.3786 | 0.0074 | 0.0090 | 0.4563 | 0.0090 |
| 1.4025 | 0.4909 | 5000 | 1.4050 | 0.0074 | 0.0112 | 0.4548 | 0.0112 |
| 1.4366 | 0.5007 | 5100 | 1.3970 | 0.0074 | 0.0068 | 0.4465 | 0.0068 |
| 1.4288 | 0.5105 | 5200 | 1.3994 | 0.0074 | 0.0033 | 0.4376 | 0.0033 |
| 1.4049 | 0.5203 | 5300 | 1.3799 | 0.0074 | 0.0073 | 0.4616 | 0.0073 |
| 1.423 | 0.5301 | 5400 | 1.4107 | 0.0074 | 0.0059 | 0.4402 | 0.0059 |
| 1.3489 | 0.5400 | 5500 | 1.3724 | 0.0074 | 0.0088 | 0.4623 | 0.0088 |
| 1.4091 | 0.5498 | 5600 | 1.4107 | 0.0074 | 0.0061 | 0.4470 | 0.0061 |
| 1.4608 | 0.5596 | 5700 | 1.4380 | 0.0074 | 0.0154 | 0.4368 | 0.0154 |
| 1.4084 | 0.5694 | 5800 | 1.3916 | 0.0074 | 0.0048 | 0.4471 | 0.0048 |
| 1.4378 | 0.5792 | 5900 | 1.4394 | 0.0074 | 0.0065 | 0.4384 | 0.0065 |
| 1.4155 | 0.5890 | 6000 | 1.4155 | 0.0074 | 0.0065 | 0.4418 | 0.0065 |
| 1.365 | 0.5989 | 6100 | 1.3972 | 0.0074 | 0.0029 | 0.4507 | 0.0029 |
| 1.5122 | 0.6087 | 6200 | 1.4784 | 0.0074 | 0.0087 | 0.4250 | 0.0087 |
| 1.4433 | 0.6185 | 6300 | 1.4300 | 0.0074 | 0.0071 | 0.4439 | 0.0071 |
| 1.4481 | 0.6283 | 6400 | 1.4222 | 0.0074 | 0.0088 | 0.4434 | 0.0088 |
| 1.4281 | 0.6381 | 6500 | 1.4182 | 0.0074 | 0.0048 | 0.4469 | 0.0048 |
| 1.4555 | 0.6479 | 6600 | 1.4571 | 0.0074 | 0.0062 | 0.4280 | 0.0062 |
| 1.4657 | 0.6578 | 6700 | 1.4892 | 0.0074 | 0.0026 | 0.4192 | 0.0026 |
| 1.3898 | 0.6676 | 6800 | 1.4332 | 0.0074 | 0.0083 | 0.4288 | 0.0083 |
| 1.4377 | 0.6774 | 6900 | 1.4684 | 0.0074 | 0.0027 | 0.4333 | 0.0027 |
| 1.4236 | 0.6872 | 7000 | 1.4526 | 0.0074 | 0.0040 | 0.4334 | 0.0040 |
| 1.4659 | 0.6970 | 7100 | 1.4707 | 0.0074 | 0.0041 | 0.4175 | 0.0041 |
| 1.5052 | 0.7069 | 7200 | 1.4845 | 0.0074 | 0.0027 | 0.4253 | 0.0027 |
| 1.4115 | 0.7167 | 7300 | 1.4329 | 0.0074 | 0.0069 | 0.4412 | 0.0069 |
| 1.4636 | 0.7265 | 7400 | 1.4223 | 0.0074 | 0.0077 | 0.4476 | 0.0077 |
| 1.4773 | 0.7363 | 7500 | 1.4970 | 0.0074 | 0.0052 | 0.4250 | 0.0052 |
| 1.5174 | 0.7461 | 7600 | 1.5017 | 0.0074 | 0.0112 | 0.4234 | 0.0112 |
| 1.4704 | 0.7559 | 7700 | 1.4422 | 0.0074 | 0.0037 | 0.4394 | 0.0037 |
| 1.4522 | 0.7658 | 7800 | 1.4526 | 0.0074 | 0.0010 | 0.4386 | 0.0010 |
| 1.504 | 0.7756 | 7900 | 1.4909 | 0.0074 | 0.0029 | 0.4244 | 0.0029 |
| 1.4326 | 0.7854 | 8000 | 1.4623 | 0.0074 | 0.0010 | 0.4327 | 0.0010 |
| 1.4714 | 0.7952 | 8100 | 1.4544 | 0.0074 | 0.0081 | 0.4331 | 0.0081 |
| 1.4769 | 0.8050 | 8200 | 1.4697 | 0.0074 | 0.0003 | 0.4339 | 0.0003 |
| 1.4163 | 0.8148 | 8300 | 1.4577 | 0.0074 | 0.0091 | 0.4264 | 0.0091 |
| 1.482 | 0.8247 | 8400 | 1.4774 | 0.0074 | 0.0021 | 0.4272 | 0.0021 |
| 1.6308 | 0.8345 | 8500 | 1.6580 | 0.0074 | 0.0006 | 0.3726 | 0.0006 |
| 1.4722 | 0.8443 | 8600 | 1.4757 | 0.0074 | 0.0010 | 0.4197 | 0.0010 |
| 1.4501 | 0.8541 | 8700 | 1.4416 | 0.0074 | 0.0144 | 0.4391 | 0.0144 |
| 1.4409 | 0.8639 | 8800 | 1.4411 | 0.0074 | 0.0063 | 0.4397 | 0.0063 |
| 1.4815 | 0.8737 | 8900 | 1.4653 | 0.0074 | 0.0032 | 0.4224 | 0.0032 |
| 1.4301 | 0.8836 | 9000 | 1.4732 | 0.0074 | 0.0028 | 0.4290 | 0.0028 |
| 1.5293 | 0.8934 | 9100 | 1.5193 | 0.0074 | 0.0044 | 0.4259 | 0.0044 |
| 1.579 | 0.9032 | 9200 | 1.5869 | 0.0074 | 0.0033 | 0.3773 | 0.0033 |
| 1.5038 | 0.9130 | 9300 | 1.4978 | 0.0074 | 0.0037 | 0.4155 | 0.0037 |
| 1.5571 | 0.9228 | 9400 | 1.5047 | 0.0074 | 0.0017 | 0.4102 | 0.0017 |
| 1.4909 | 0.9327 | 9500 | 1.4641 | 0.0074 | 0.0067 | 0.4309 | 0.0067 |
| 1.4799 | 0.9425 | 9600 | 1.4688 | 0.0074 | 0.0128 | 0.4202 | 0.0128 |
| 1.5231 | 0.9523 | 9700 | 1.4726 | 0.0074 | 0.0068 | 0.4145 | 0.0068 |
| 1.4619 | 0.9621 | 9800 | 1.4949 | 0.0074 | 0.0 | 0.4256 | 0.0 |
| 1.5021 | 0.9719 | 9900 | 1.4950 | 0.0074 | 0.0026 | 0.4132 | 0.0026 |
| 1.4817 | 0.9817 | 10000 | 1.4763 | 0.0074 | 0.0017 | 0.4258 | 0.0017 |
| 1.5196 | 0.9916 | 10100 | 1.4806 | 0.0074 | 0.0037 | 0.4283 | 0.0037 |
| 1.5065 | 1.0014 | 10200 | 1.5551 | 0.0074 | 0.0010 | 0.4033 | 0.0010 |
| 1.7228 | 1.0112 | 10300 | 1.6458 | 0.0074 | 0.0010 | 0.4028 | 0.0010 |
| 1.6053 | 1.0210 | 10400 | 1.5763 | 0.0074 | 0.0027 | 0.3990 | 0.0027 |
| 1.6125 | 1.0308 | 10500 | 1.5800 | 0.0074 | 0.0018 | 0.3836 | 0.0018 |
| 1.7443 | 1.0406 | 10600 | 1.7851 | 0.0074 | 0.0 | 0.3415 | 0.0 |
| 1.638 | 1.0505 | 10700 | 1.6648 | 0.0074 | 0.0 | 0.3549 | 0.0 |
| 1.6893 | 1.0603 | 10800 | 1.6618 | 0.0074 | 0.0002 | 0.3560 | 0.0002 |
| 1.627 | 1.0701 | 10900 | 1.6422 | 0.0074 | 0.0048 | 0.3506 | 0.0048 |
| 1.6322 | 1.0799 | 11000 | 1.6717 | 0.0074 | 0.0006 | 0.3517 | 0.0006 |
| 1.8335 | 1.0897 | 11100 | 1.9045 | 0.0074 | 0.0003 | 0.3218 | 0.0003 |
| 1.6412 | 1.0995 | 11200 | 1.6336 | 0.0074 | 0.0003 | 0.3531 | 0.0003 |
| 1.661 | 1.1094 | 11300 | 1.6760 | 0.0074 | 0.0014 | 0.3511 | 0.0014 |
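Validation loss in the table bottoms out at 1.1748 around step 1600 and trends upward afterwards. A minimal sketch for selecting the lowest-loss checkpoint from such a log (the `(step, val_loss)` pairs below are copied from the rows around the minimum):

```python
# (step, validation loss) pairs copied from the rows around the minimum above.
log = [
    (1200, 1.2143),
    (1400, 1.1958),
    (1500, 1.1922),
    (1600, 1.1748),
    (1700, 1.1907),
]

# Pick the checkpoint with the lowest validation loss.
best_step, best_loss = min(log, key=lambda pair: pair[1])
print(best_step, best_loss)  # → 1600 1.1748
```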
### Framework versions
- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
### Model tree for donoway/djhe6jbg_20250703_005116

- Base model: meta-llama/Llama-3.2-1B