ossbert-onc-unlab-bs256-10epochs
This model is a fine-tuned version of ania3000/untrained-ossbert-e on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 3.3610
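A raw loss value is easier to interpret as a perplexity. The following is a minimal sketch, assuming the reported loss is a mean per-token cross-entropy in nats. The card does not state the training objective, so if the model was trained with masked language modeling, the figure only covers the masked positions.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 3.3610

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so exp(loss) gives the corresponding perplexity.
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")
```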
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 256
- eval_batch_size: 256
- seed: 42
- optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
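The hyperparameters above could be reproduced roughly as follows. This is a sketch, not the author's training script: it assumes the `transformers` `Trainer` API was used on a single device (so the card's `train_batch_size` maps to `per_device_train_batch_size`), and the `output_dir` value and the helper `build_training_arguments` are hypothetical.

```python
# Hypothetical mapping from the card's hyperparameters to
# transformers.TrainingArguments keyword arguments.
training_kwargs = {
    "output_dir": "ossbert-onc-unlab-bs256-10epochs",  # assumed output path
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 256,  # assumes a single device
    "per_device_eval_batch_size": 256,
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}


def build_training_arguments(kwargs=None):
    """Build TrainingArguments lazily so the dict is usable without transformers."""
    from transformers import TrainingArguments

    return TrainingArguments(**(kwargs or training_kwargs))
```

Keeping the values in a plain dictionary makes them easy to log or compare against the card before constructing the arguments object.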
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 6.0886 | 0.4513 | 250 | 5.3485 |
| 5.1723 | 0.9025 | 500 | 4.8374 |
| 4.776 | 1.3538 | 750 | 4.5325 |
| 4.533 | 1.8051 | 1000 | 4.3090 |
| 4.3356 | 2.2563 | 1250 | 4.1457 |
| 4.1966 | 2.7076 | 1500 | 4.0137 |
| 4.0777 | 3.1588 | 1750 | 3.8991 |
| 3.977 | 3.6101 | 2000 | 3.8095 |
| 3.8958 | 4.0614 | 2250 | 3.7400 |
| 3.8221 | 4.5126 | 2500 | 3.6869 |
| 3.7609 | 4.9639 | 2750 | 3.6213 |
| 3.7108 | 5.4152 | 3000 | 3.5751 |
| 3.6617 | 5.8664 | 3250 | 3.5395 |
| 3.6238 | 6.3177 | 3500 | 3.5071 |
| 3.5912 | 6.7690 | 3750 | 3.4713 |
| 3.5607 | 7.2202 | 4000 | 3.4429 |
| 3.54 | 7.6715 | 4250 | 3.4145 |
| 3.5138 | 8.1227 | 4500 | 3.4078 |
| 3.4864 | 8.5740 | 4750 | 3.3908 |
| 3.4848 | 9.0253 | 5000 | 3.3768 |
| 3.4597 | 9.4765 | 5250 | 3.3632 |
| 3.456 | 9.9278 | 5500 | 3.3610 |
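The Epoch and Step columns allow a rough estimate of the schedule length and the training-set size, neither of which the card states. The sketch below assumes a single device, no gradient accumulation, a kept final partial batch, and no warmup. The helper `linear_lr` is a hypothetical re-implementation of a linear decay schedule, not code from the training run.

```python
# Values copied from the last table row and the hyperparameter list.
last_step, last_epoch = 5500, 9.9278
batch_size, num_epochs, peak_lr = 256, 10, 5e-5

# Optimizer steps per epoch, inferred from the step/epoch ratio.
steps_per_epoch = round(last_step / last_epoch)
total_steps = steps_per_epoch * num_epochs

# Assumption: one device, no gradient accumulation, last partial batch kept.
max_examples = steps_per_epoch * batch_size
min_examples = (steps_per_epoch - 1) * batch_size + 1


def linear_lr(step, warmup_steps=0):
    """Linear warmup, then linear decay to zero (warmup_steps=0 is assumed)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return peak_lr * remaining


print(steps_per_epoch, total_steps, min_examples, max_examples)
print(linear_lr(last_step))
```

Under these assumptions the learning rate at the last logged step is already close to zero, which is consistent with the validation loss flattening over the final rows.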
Framework versions
- Transformers 4.57.1
- Pytorch 2.9.1+cu128
- Datasets 4.0.0
- Tokenizers 0.22.1
Model tree for AlexeySorokin/ossbert-onc-unlab-bs256-10epochs
Base model
ania3000/untrained-ossbert-e