Mistral-7B-v0.1_cola_sparse_swiglu_ignore_0_1
This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.4277
- Accuracy: 0.8213
- Matthews Correlation: 0.5699
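For readers unfamiliar with the two classification metrics reported above, the following is a minimal sketch of how accuracy and the Matthews correlation coefficient can be computed for binary labels. The function name `accuracy_and_mcc` and the sample labels are hypothetical illustrations, not the evaluation code or data used for this model.

```python
from math import sqrt


def accuracy_and_mcc(y_true, y_pred):
    """Return (accuracy, Matthews correlation) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
acc, mcc = accuracy_and_mcc(y_true, y_pred)
print(acc, mcc)
```

Unlike accuracy, the Matthews correlation uses all four cells of the confusion matrix, so it stays informative when the two classes are imbalanced.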
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 2
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
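The total batch sizes listed above follow from the per-device sizes, the device count, and gradient accumulation. The sketch below reproduces that arithmetic and also illustrates a linear learning-rate decay. The helper `linear_lr`, the absence of warmup, and the value `total_steps = 2400` are assumptions for illustration, since the card does not state warmup or the total number of optimizer steps.

```python
# Values copied from the hyperparameter list above.
learning_rate = 1e-05
train_batch_size = 8          # per device
eval_batch_size = 16          # per device
num_devices = 4
gradient_accumulation_steps = 2

# Effective batch size per optimizer step.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)
# Evaluation does not accumulate gradients.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size, total_eval_batch_size)


def linear_lr(step, total_steps, base_lr=learning_rate):
    """Linear decay from base_lr to 0, assuming no warmup (hypothetical)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# total_steps is a hypothetical value, not taken from the card.
total_steps = 2400
print(linear_lr(0, total_steps), linear_lr(1200, total_steps))
```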
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Matthews Correlation |
|---|---|---|---|---|---|
| 1.6028 | 0.17 | 20 | 1.5539 | 0.5417 | 0.0764 |
| 0.9736 | 0.33 | 40 | 0.9708 | 0.6865 | 0.1852 |
| 0.7146 | 0.50 | 60 | 0.7850 | 0.7133 | 0.3141 |
| 0.6892 | 0.66 | 80 | 0.6674 | 0.7239 | 0.3498 |
| 0.6792 | 0.83 | 100 | 0.6401 | 0.7411 | 0.3977 |
| 0.6233 | 1.00 | 120 | 0.6104 | 0.7574 | 0.3784 |
| 0.4778 | 1.16 | 140 | 0.5641 | 0.7948 | 0.4874 |
| 0.4792 | 1.33 | 160 | 0.5961 | 0.7747 | 0.4284 |
| 0.5573 | 1.49 | 180 | 0.5210 | 0.8035 | 0.5126 |
| 0.4464 | 1.66 | 200 | 0.5716 | 0.7872 | 0.5601 |
| 0.4541 | 1.83 | 220 | 0.5130 | 0.8015 | 0.5046 |
| 0.4989 | 1.99 | 240 | 0.4648 | 0.8150 | 0.5452 |
| 0.3891 | 2.16 | 260 | 0.4566 | 0.8207 | 0.5856 |
| 0.3360 | 2.32 | 280 | 0.4516 | 0.8226 | 0.5657 |
| 0.3854 | 2.49 | 300 | 0.4224 | 0.8322 | 0.6066 |
| 0.3917 | 2.66 | 320 | 0.4247 | 0.8380 | 0.6125 |
| 0.3779 | 2.82 | 340 | 0.4177 | 0.8303 | 0.5897 |
| 0.3462 | 2.99 | 360 | 0.4649 | 0.8207 | 0.5584 |
| 0.3448 | 3.15 | 380 | 0.4182 | 0.8293 | 0.5837 |
| 0.3894 | 3.32 | 400 | 0.4388 | 0.8303 | 0.5893 |
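A log like the one above is often used to compare checkpoints. As a minimal sketch, the snippet below takes the rows for steps 300 through 400 from the table and finds the step with the lowest validation loss and the step with the highest Matthews correlation. The variable names are hypothetical, and the card does not say which checkpoint produced the headline evaluation results.

```python
# (step, validation_loss, matthews_correlation), copied from the table
# above for steps 300 through 400 only.
rows = [
    (300, 0.4224, 0.6066),
    (320, 0.4247, 0.6125),
    (340, 0.4177, 0.5897),
    (360, 0.4649, 0.5584),
    (380, 0.4182, 0.5837),
    (400, 0.4388, 0.5893),
]

# Lower validation loss is better; higher Matthews correlation is better.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_mcc = max(rows, key=lambda r: r[2])

print("lowest validation loss at step", best_by_loss[0])
print("highest Matthews correlation at step", best_by_mcc[0])
```

The two criteria pick different steps here, so the choice of selection metric matters when deciding which checkpoint to keep.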
Framework versions
- PEFT 0.7.1
- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0