ossbert-onc-unlab-bs256-10epochs

This model is a fine-tuned version of ania3000/untrained-ossbert-e; the training dataset is not specified. At the final logged checkpoint (step 5500, epoch 9.9278) it reaches a validation loss of 3.3610 on the evaluation set.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 256
eval_batch_size: 256
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 10
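
The hyperparameters above, combined with the logged step and epoch columns, are enough to sketch the learning-rate schedule. The example below is a minimal illustration, not the training code itself: it infers the number of steps per epoch from one logged (step, epoch) pair and evaluates a linear decay to zero. The function name `linear_lr` is hypothetical, and the zero-warmup setting is an assumption, since the card lists no warmup value.

```python
# Values copied from the hyperparameter list and the first row of the training log.
LEARNING_RATE = 5e-05
NUM_EPOCHS = 10
LOGGED_STEP = 250
LOGGED_EPOCH = 0.4513

# Steps per epoch inferred from one logged (step, epoch) pair.
STEPS_PER_EPOCH = round(LOGGED_STEP / LOGGED_EPOCH)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (assumed to be 0 steps) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("steps per epoch:", STEPS_PER_EPOCH)
    print("total steps:", TOTAL_STEPS)
    for step in (0, 2770, 5500):
        print(f"step {step}: lr = {linear_lr(step):.3e}")
```

The logged pairs are consistent with 554 optimizer steps per epoch, or 5540 steps over 10 epochs. If the run used a single device with no gradient accumulation (neither is stated in the card), a batch size of 256 would put the training set at no more than 554 × 256 = 141,824 examples.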

Training Loss	Epoch	Step	Validation Loss
6.0886	0.4513	250	5.3485
5.1723	0.9025	500	4.8374
4.776	1.3538	750	4.5325
4.533	1.8051	1000	4.3090
4.3356	2.2563	1250	4.1457
4.1966	2.7076	1500	4.0137
4.0777	3.1588	1750	3.8991
3.977	3.6101	2000	3.8095
3.8958	4.0614	2250	3.7400
3.8221	4.5126	2500	3.6869
3.7609	4.9639	2750	3.6213
3.7108	5.4152	3000	3.5751
3.6617	5.8664	3250	3.5395
3.6238	6.3177	3500	3.5071
3.5912	6.7690	3750	3.4713
3.5607	7.2202	4000	3.4429
3.54	7.6715	4250	3.4145
3.5138	8.1227	4500	3.4078
3.4864	8.5740	4750	3.3908
3.4848	9.0253	5000	3.3768
3.4597	9.4765	5250	3.3632
3.456	9.9278	5500	3.3610
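
To put the logged losses on a more interpretable scale, they can be converted to perplexity. The sketch below assumes the reported loss is a mean token-level cross-entropy in nats, which is typical for language-modeling heads but is not confirmed by this card. The helper name `perplexity` is illustrative.

```python
import math

# (step, validation loss) pairs copied from the first and last rows of the table.
FIRST = (250, 5.3485)
LAST = (5500, 3.3610)


def perplexity(mean_nll):
    """Perplexity for a mean negative log-likelihood measured in nats."""
    return math.exp(mean_nll)


if __name__ == "__main__":
    for step, loss in (FIRST, LAST):
        print(f"step {step}: loss {loss:.4f}, perplexity {perplexity(loss):.1f}")
```

Under that assumption, the final validation loss corresponds to a perplexity of about 29, down from roughly 210 at the first logged checkpoint.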

Model size: 0.1B params (Safetensors)
Tensor type: F32
