# tlocvsdyspneaTask-Llama-3.1-8B-Instruct-all
This model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.0692
- F1 Micro: 0.6479
- F1 Macro: 0.3623
- F1 Weighted: 0.7617
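Since this repository ships a PEFT adapter (see the framework versions below), a minimal loading sketch follows. It assumes a causal-LM adapter stacked on the base model; the task head and prompt format actually used for the tlocvsdyspneaTask labels are not documented, so treat this as an illustration rather than the confirmed inference recipe.

```python
# Minimal sketch: load the PEFT adapter on top of the base Llama-3.1-8B-Instruct model.
# Assumptions: causal-LM head, bfloat16 weights, and enough GPU memory for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B-Instruct"
adapter_id = "ferrazzipietro/tlocvsdyspneaTask-Llama-3.1-8B-Instruct-all"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```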
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows this list):
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999), epsilon=1e-07, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
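For convenience, the hyperparameters above can be expressed as a `transformers.TrainingArguments` object. This is only a sketch: the values listed above are grounded in the card, while `output_dir` and anything not listed (precision, logging, evaluation strategy, etc.) are assumptions.

```python
# Sketch of the reported hyperparameters as transformers TrainingArguments.
# Only the values from the list above are grounded; everything else is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="tlocvsdyspneaTask-Llama-3.1-8B-Instruct-all",  # assumed output path
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=8,  # 8 x 8 = 64, matching the reported total train batch size
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-7,
    seed=42,
)
```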
### Training results
| Training Loss | Epoch | Step | Validation Loss | F1 Micro | F1 Macro | F1 Weighted |
|---|---|---|---|---|---|---|
| 3.6424 | 0.1067 | 2 | 3.6506 | 0.0 | 0.0 | 0.0 |
| 3.6424 | 0.2133 | 4 | 3.3802 | 0.0 | 0.0 | 0.0 |
| 3.5625 | 0.32 | 6 | 3.0251 | 0.0 | 0.0 | 0.0 |
| 3.5625 | 0.4267 | 8 | 2.8197 | 0.0 | 0.0 | 0.0 |
| 2.9442 | 0.5333 | 10 | 2.6243 | 0.0214 | 0.0003 | 0.0406 |
| 2.9442 | 0.64 | 12 | 2.5062 | 0.3988 | 0.0630 | 0.4046 |
| 2.9442 | 0.7467 | 14 | 2.4347 | 0.4377 | 0.1422 | 0.4839 |
| 2.5527 | 0.8533 | 16 | 2.3839 | 0.5992 | 0.1997 | 0.6002 |
| 2.5527 | 0.96 | 18 | 2.3376 | 0.4144 | 0.0244 | 0.5029 |
| 2.3739 | 1.0533 | 20 | 2.3149 | 0.5389 | 0.1572 | 0.6990 |
| 2.3739 | 1.16 | 22 | 2.2759 | 0.6012 | 0.2032 | 0.6069 |
| 2.3739 | 1.2667 | 24 | 2.2436 | 0.6031 | 0.2816 | 0.6463 |
| 2.2679 | 1.3733 | 26 | 2.2140 | 0.5039 | 0.3382 | 0.6293 |
| 2.2679 | 1.48 | 28 | 2.1863 | 0.5681 | 0.3631 | 0.7245 |
| 2.1772 | 1.5867 | 30 | 2.1643 | 0.6089 | 0.3461 | 0.6164 |
| 2.1772 | 1.6933 | 32 | 2.1454 | 0.6070 | 0.3429 | 0.6127 |
| 2.1772 | 1.8 | 34 | 2.1285 | 0.6128 | 0.3183 | 0.6980 |
| 2.0808 | 1.9067 | 36 | 2.1136 | 0.5350 | 0.3511 | 0.6545 |
| 2.0808 | 2.0 | 38 | 2.1013 | 0.5953 | 0.3784 | 0.7423 |
| 2.1278 | 2.1067 | 40 | 2.0913 | 0.6187 | 0.2931 | 0.6656 |
| 2.1278 | 2.2133 | 42 | 2.0839 | 0.6148 | 0.2684 | 0.6302 |
| 2.1278 | 2.32 | 44 | 2.0783 | 0.6187 | 0.2833 | 0.6520 |
| 2.0394 | 2.4267 | 46 | 2.0741 | 0.6479 | 0.3312 | 0.7236 |
| 2.0394 | 2.5333 | 48 | 2.0715 | 0.6576 | 0.3595 | 0.7615 |
| 2.01 | 2.64 | 50 | 2.0700 | 0.6537 | 0.3629 | 0.7642 |
| 2.01 | 2.7467 | 52 | 2.0693 | 0.6440 | 0.3629 | 0.7611 |
| 2.01 | 2.8533 | 54 | 2.0692 | 0.6479 | 0.3623 | 0.7617 |
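The F1 Micro, F1 Macro, and F1 Weighted columns are the standard averaging modes of the F1 score. The sketch below shows how they are typically computed with scikit-learn; whether the task is multi-class or multi-label is not documented here, so the example uses placeholder multi-label indicator matrices rather than real model outputs.

```python
# Sketch: the three F1 averaging modes reported above, computed with scikit-learn.
# y_true / y_pred are placeholder multi-label indicator matrices, not real model outputs.
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1], [1, 1, 0]])

print("F1 micro:   ", f1_score(y_true, y_pred, average="micro"))     # global TP/FP/FN counts
print("F1 macro:   ", f1_score(y_true, y_pred, average="macro"))     # unweighted mean over labels
print("F1 weighted:", f1_score(y_true, y_pred, average="weighted"))  # mean weighted by label support
```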
### Framework versions
- PEFT 0.18.1
- Transformers 4.51.0
- PyTorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.0
## Base model
- meta-llama/Llama-3.1-8B-Instruct (itself a fine-tune of meta-llama/Llama-3.1-8B)