# ARC-Easy_Llama-3.2-1B-8984x13s

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unknown dataset (presumably ARC-Easy, per the model name). It achieves the following results on the evaluation set:
- Loss: 3.0916
- Model Preparation Time: 0.0059
- Mdl: 2542.3237
- Accumulated Loss: 1762.2045
- Correct Preds: 378.0
- Total Preds: 570.0
- Accuracy: 0.6632
- Correct Gen Preds: 378.0
- Gen Accuracy: 0.6632
- Correct Gen Preds 32: 131.0
- Correct Preds 32: 131.0
- Total Labels 32: 158.0
- Accuracy 32: 0.8291
- Gen Accuracy 32: 0.8291
- Correct Gen Preds 33: 96.0
- Correct Preds 33: 96.0
- Total Labels 33: 152.0
- Accuracy 33: 0.6316
- Gen Accuracy 33: 0.6316
- Correct Gen Preds 34: 85.0
- Correct Preds 34: 85.0
- Total Labels 34: 142.0
- Accuracy 34: 0.5986
- Gen Accuracy 34: 0.5986
- Correct Gen Preds 35: 66.0
- Correct Preds 35: 66.0
- Total Labels 35: 118.0
- Accuracy 35: 0.5593
- Gen Accuracy 35: 0.5593
- Correct Gen Preds 36: 0.0
- Correct Preds 36: 0.0
- Total Labels 36: 0.0
- Accuracy 36: 0.0
- Gen Accuracy 36: 0.0
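The headline numbers above appear to be internally consistent: the reported loss looks like the accumulated cross-entropy (in nats) averaged over the 570 evaluation examples, and the MDL figure looks like that same accumulated loss converted to bits. A quick sanity check (the relationships are inferred from the numbers, not documented in the card):

```python
import math

accumulated_loss = 1762.2045  # summed cross-entropy over the eval set, in nats
total_preds = 570
correct_preds = 378

loss = accumulated_loss / total_preds  # mean cross-entropy per example
mdl = accumulated_loss / math.log(2)   # nats -> bits
accuracy = correct_preds / total_preds

print(round(loss, 4))      # 3.0916, matching the reported Loss
print(round(accuracy, 4))  # 0.6632, matching the reported Accuracy
# mdl is ~2542.32 bits, matching the reported Mdl
```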
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 64
- eval_batch_size: 112
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.001
- num_epochs: 100
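These settings map onto the `transformers` `TrainingArguments` roughly as sketched below. This is a reconstruction, not the actual training script: the `output_dir` is a placeholder, and the dataset loading, any callbacks, and the custom per-label metrics are not recorded in this card.

```python
from transformers import TrainingArguments

# Sketch only: output_dir is a placeholder, not taken from the card.
args = TrainingArguments(
    output_dir="ARC-Easy_Llama-3.2-1B-8984x13s",
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=112,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```

Note that with a `constant` scheduler the warmup ratio is typically a no-op; it would only take effect with `constant_with_warmup`.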
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Mdl | Accumulated Loss | Correct Preds | Total Preds | Accuracy | Correct Gen Preds | Gen Accuracy | Correct Gen Preds 32 | Correct Preds 32 | Total Labels 32 | Accuracy 32 | Gen Accuracy 32 | Correct Gen Preds 33 | Correct Preds 33 | Total Labels 33 | Accuracy 33 | Gen Accuracy 33 | Correct Gen Preds 34 | Correct Preds 34 | Total Labels 34 | Accuracy 34 | Gen Accuracy 34 | Correct Gen Preds 35 | Correct Preds 35 | Total Labels 35 | Accuracy 35 | Gen Accuracy 35 | Correct Gen Preds 36 | Correct Preds 36 | Total Labels 36 | Accuracy 36 | Gen Accuracy 36 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 1.5354 | 0.0059 | 1262.6022 | 875.1692 | 172.0 | 570.0 | 0.3018 | 170.0 | 0.2982 | 154.0 | 154.0 | 158.0 | 0.9747 | 0.9747 | 0.0 | 0.0 | 152.0 | 0.0 | 0.0 | 15.0 | 17.0 | 142.0 | 0.1197 | 0.1056 | 1.0 | 1.0 | 118.0 | 0.0085 | 0.0085 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1.435 | 1.0 | 1 | 2.1188 | 0.0059 | 1742.3509 | 1207.7056 | 183.0 | 570.0 | 0.3211 | 183.0 | 0.3211 | 0.0 | 0.0 | 158.0 | 0.0 | 0.0 | 151.0 | 151.0 | 152.0 | 0.9934 | 0.9934 | 32.0 | 32.0 | 142.0 | 0.2254 | 0.2254 | 0.0 | 0.0 | 118.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2.005 | 2.0 | 2 | 1.3228 | 0.0059 | 1087.7696 | 753.9844 | 166.0 | 570.0 | 0.2912 | 166.0 | 0.2912 | 157.0 | 157.0 | 158.0 | 0.9937 | 0.9937 | 7.0 | 7.0 | 152.0 | 0.0461 | 0.0461 | 2.0 | 2.0 | 142.0 | 0.0141 | 0.0141 | 0.0 | 0.0 | 118.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.9558 | 3.0 | 3 | 2.0441 | 0.0059 | 1680.9188 | 1165.1241 | 219.0 | 570.0 | 0.3842 | 219.0 | 0.3842 | 154.0 | 154.0 | 158.0 | 0.9747 | 0.9747 | 3.0 | 3.0 | 152.0 | 0.0197 | 0.0197 | 44.0 | 44.0 | 142.0 | 0.3099 | 0.3099 | 18.0 | 18.0 | 118.0 | 0.1525 | 0.1525 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.4507 | 4.0 | 4 | 1.4535 | 0.0059 | 1195.2711 | 828.4988 | 345.0 | 570.0 | 0.6053 | 344.0 | 0.6035 | 132.0 | 133.0 | 158.0 | 0.8418 | 0.8354 | 77.0 | 77.0 | 152.0 | 0.5066 | 0.5066 | 86.0 | 86.0 | 142.0 | 0.6056 | 0.6056 | 49.0 | 49.0 | 118.0 | 0.4153 | 0.4153 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0879 | 5.0 | 5 | 1.6615 | 0.0059 | 1366.3466 | 947.0793 | 377.0 | 570.0 | 0.6614 | 374.0 | 0.6561 | 115.0 | 117.0 | 158.0 | 0.7405 | 0.7278 | 110.0 | 110.0 | 152.0 | 0.7237 | 0.7237 | 92.0 | 93.0 | 142.0 | 0.6549 | 0.6479 | 57.0 | 57.0 | 118.0 | 0.4831 | 0.4831 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0036 | 6.0 | 6 | 3.0916 | 0.0059 | 2542.3237 | 1762.2045 | 378.0 | 570.0 | 0.6632 | 378.0 | 0.6632 | 131.0 | 131.0 | 158.0 | 0.8291 | 0.8291 | 96.0 | 96.0 | 152.0 | 0.6316 | 0.6316 | 85.0 | 85.0 | 142.0 | 0.5986 | 0.5986 | 66.0 | 66.0 | 118.0 | 0.5593 | 0.5593 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 7.0 | 7 | 4.4610 | 0.0059 | 3668.4395 | 2542.7685 | 369.0 | 570.0 | 0.6474 | 369.0 | 0.6474 | 136.0 | 136.0 | 158.0 | 0.8608 | 0.8608 | 83.0 | 83.0 | 152.0 | 0.5461 | 0.5461 | 85.0 | 85.0 | 142.0 | 0.5986 | 0.5986 | 65.0 | 65.0 | 118.0 | 0.5508 | 0.5508 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 8.0 | 8 | 5.2835 | 0.0059 | 4344.8373 | 3011.6117 | 364.0 | 570.0 | 0.6386 | 364.0 | 0.6386 | 137.0 | 137.0 | 158.0 | 0.8671 | 0.8671 | 81.0 | 81.0 | 152.0 | 0.5329 | 0.5329 | 81.0 | 81.0 | 142.0 | 0.5704 | 0.5704 | 65.0 | 65.0 | 118.0 | 0.5508 | 0.5508 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 9.0 | 9 | 5.7743 | 0.0059 | 4748.4422 | 3291.3693 | 360.0 | 570.0 | 0.6316 | 360.0 | 0.6316 | 138.0 | 138.0 | 158.0 | 0.8734 | 0.8734 | 79.0 | 79.0 | 152.0 | 0.5197 | 0.5197 | 80.0 | 80.0 | 142.0 | 0.5634 | 0.5634 | 63.0 | 63.0 | 118.0 | 0.5339 | 0.5339 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 10.0 | 10 | 6.0608 | 0.0059 | 4984.0139 | 3454.6552 | 355.0 | 570.0 | 0.6228 | 355.0 | 0.6228 | 139.0 | 139.0 | 158.0 | 0.8797 | 0.8797 | 76.0 | 76.0 | 152.0 | 0.5 | 0.5 | 79.0 | 79.0 | 142.0 | 0.5563 | 0.5563 | 61.0 | 61.0 | 118.0 | 0.5169 | 0.5169 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 11.0 | 11 | 6.2689 | 0.0059 | 5155.1789 | 3573.2977 | 353.0 | 570.0 | 0.6193 | 353.0 | 0.6193 | 137.0 | 137.0 | 158.0 | 0.8671 | 0.8671 | 77.0 | 77.0 | 152.0 | 0.5066 | 0.5066 | 79.0 | 79.0 | 142.0 | 0.5563 | 0.5563 | 60.0 | 60.0 | 118.0 | 0.5085 | 0.5085 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 12.0 | 12 | 6.3825 | 0.0059 | 5248.5200 | 3637.9969 | 352.0 | 570.0 | 0.6175 | 352.0 | 0.6175 | 138.0 | 138.0 | 158.0 | 0.8734 | 0.8734 | 75.0 | 75.0 | 152.0 | 0.4934 | 0.4934 | 78.0 | 78.0 | 142.0 | 0.5493 | 0.5493 | 61.0 | 61.0 | 118.0 | 0.5169 | 0.5169 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 13.0 | 13 | 6.4953 | 0.0059 | 5341.3420 | 3702.3361 | 349.0 | 570.0 | 0.6123 | 349.0 | 0.6123 | 139.0 | 139.0 | 158.0 | 0.8797 | 0.8797 | 74.0 | 74.0 | 152.0 | 0.4868 | 0.4868 | 78.0 | 78.0 | 142.0 | 0.5493 | 0.5493 | 58.0 | 58.0 | 118.0 | 0.4915 | 0.4915 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 14.0 | 14 | 6.5606 | 0.0059 | 5395.0053 | 3739.5327 | 352.0 | 570.0 | 0.6175 | 352.0 | 0.6175 | 141.0 | 141.0 | 158.0 | 0.8924 | 0.8924 | 75.0 | 75.0 | 152.0 | 0.4934 | 0.4934 | 76.0 | 76.0 | 142.0 | 0.5352 | 0.5352 | 60.0 | 60.0 | 118.0 | 0.5085 | 0.5085 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 15.0 | 15 | 6.6424 | 0.0059 | 5462.2665 | 3786.1546 | 345.0 | 570.0 | 0.6053 | 345.0 | 0.6053 | 139.0 | 139.0 | 158.0 | 0.8797 | 0.8797 | 73.0 | 73.0 | 152.0 | 0.4803 | 0.4803 | 75.0 | 75.0 | 142.0 | 0.5282 | 0.5282 | 58.0 | 58.0 | 118.0 | 0.4915 | 0.4915 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0 | 16.0 | 16 | 6.7109 | 0.0059 | 5518.6330 | 3825.2249 | 344.0 | 570.0 | 0.6035 | 344.0 | 0.6035 | 140.0 | 140.0 | 158.0 | 0.8861 | 0.8861 | 71.0 | 71.0 | 152.0 | 0.4671 | 0.4671 | 75.0 | 75.0 | 142.0 | 0.5282 | 0.5282 | 58.0 | 58.0 | 118.0 | 0.4915 | 0.4915 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
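The per-label columns in the table (suffixed 32 through 36) track accuracy broken down by gold-label token id; label 36 never occurs in this evaluation set, so its counts stay at zero. A breakdown of this shape can be computed with a small helper like the following (the function name and the example ids are illustrative, not from the training code):

```python
from collections import defaultdict

def per_label_accuracy(preds, labels):
    """Overall and per-label accuracy over parallel lists of token ids."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y in zip(preds, labels):
        total[y] += 1
        if p == y:
            correct[y] += 1
    overall = sum(correct.values()) / len(labels)
    per_label = {y: correct[y] / total[y] for y in total}
    return overall, per_label

# Toy example: three of four predictions match their gold label.
overall, per_label = per_label_accuracy([32, 33, 33, 35], [32, 33, 34, 35])
# overall == 0.75; per_label == {32: 1.0, 33: 1.0, 34: 0.0, 35: 1.0}
```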
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1