ossbert-morph-final

This model is a fine-tuned version of AlexeySorokin/ossbert-onc-unlab-from_multilingual-bs64-5epochs on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3347
  • Accuracy: 95.4303
  • Sentence accuracy: 59.6330

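Accuracy here is token-level, while sentence accuracy counts a sentence as correct only if every token in it is labeled correctly, which is why it is much lower. A minimal sketch of the two metrics (the function name and the exact metric definitions are assumptions, not taken from the trainer):

```python
def token_and_sentence_accuracy(predictions, references):
    """Token-level accuracy and sentence-level (exact match) accuracy, in percent.

    predictions / references: lists of sentences, each sentence a list of labels.
    """
    correct_tokens = total_tokens = correct_sentences = 0
    for pred, ref in zip(predictions, references):
        matches = sum(p == r for p, r in zip(pred, ref))
        correct_tokens += matches
        total_tokens += len(ref)
        if matches == len(ref):  # every token in the sentence is correct
            correct_sentences += 1
    return (100.0 * correct_tokens / total_tokens,
            100.0 * correct_sentences / len(references))

# One wrong token out of four: token accuracy 75.0, sentence accuracy 50.0.
token_acc, sent_acc = token_and_sentence_accuracy(
    [["A", "B"], ["A", "A"]],
    [["A", "B"], ["A", "B"]],
)
```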
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
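With lr_scheduler_type: linear, the learning rate decays linearly from 5e-05 to 0 over training. A small sketch of that schedule shape (the warmup handling is an assumption, since the card does not mention warmup; in practice the schedule comes from transformers' get_linear_schedule_with_warmup):

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Linear schedule: optional linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

With no warmup, the rate starts at 5e-05, is halved at the midpoint of training, and reaches 0 at the final step.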

Training results

| Training Loss | Epoch   | Step | Validation Loss | Accuracy | Sentence accuracy |
|:-------------:|:-------:|:----:|:---------------:|:--------:|:-----------------:|
| 1.0780        | 0.9158  | 500  | 0.4205          | 90.6601  | 40.9174           |
| 0.3523        | 1.8315  | 1000 | 0.3095          | 93.0652  | 49.3578           |
| 0.2309        | 2.7473  | 1500 | 0.2662          | 93.8669  | 53.2110           |
| 0.1617        | 3.6630  | 2000 | 0.2546          | 94.4548  | 53.9450           |
| 0.1228        | 4.5788  | 2500 | 0.2582          | 94.6686  | 55.4128           |
| 0.0914        | 5.4945  | 3000 | 0.2701          | 94.7622  | 55.5963           |
| 0.0714        | 6.4103  | 3500 | 0.2675          | 94.8290  | 57.2477           |
| 0.0538        | 7.3260  | 4000 | 0.2810          | 94.9893  | 58.1651           |
| 0.0381        | 8.2418  | 4500 | 0.2897          | 94.8824  | 56.5138           |
| 0.0306        | 9.1575  | 5000 | 0.3038          | 94.9626  | 56.6972           |
| 0.0223        | 10.0733 | 5500 | 0.2998          | 95.4570  | 59.6330           |
| 0.0164        | 10.9890 | 6000 | 0.2990          | 95.1897  | 57.7982           |
| 0.0116        | 11.9048 | 6500 | 0.3082          | 95.4035  | 59.6330           |
| 0.0089        | 12.8205 | 7000 | 0.3087          | 95.5505  | 60.5505           |
| 0.0071        | 13.7363 | 7500 | 0.3134          | 95.4035  | 59.4495           |
| 0.0058        | 14.6520 | 8000 | 0.3157          | 95.4570  | 60.0000           |
| 0.0046        | 15.5678 | 8500 | 0.3311          | 95.5104  | 61.1009           |
| 0.0038        | 16.4835 | 9000 | 0.3332          | 95.3634  | 59.2661           |
| 0.0027        | 17.3993 | 9500 | 0.3347          | 95.4303  | 59.6330           |
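Note that the final checkpoint (epoch 17.40, validation loss 0.3347) is not the best row in the log: the epoch-15.57 checkpoint reaches 61.1009 sentence accuracy. A small sketch of picking the best row from the log above (the selection criterion is an assumption; the card does not state which checkpoint was kept):

```python
# Evaluation log copied from the table above: (epoch, validation_loss, sentence_accuracy).
eval_log = [
    (0.9158, 0.4205, 40.9174), (1.8315, 0.3095, 49.3578),
    (2.7473, 0.2662, 53.2110), (3.6630, 0.2546, 53.9450),
    (4.5788, 0.2582, 55.4128), (5.4945, 0.2701, 55.5963),
    (6.4103, 0.2675, 57.2477), (7.3260, 0.2810, 58.1651),
    (8.2418, 0.2897, 56.5138), (9.1575, 0.3038, 56.6972),
    (10.0733, 0.2998, 59.6330), (10.9890, 0.2990, 57.7982),
    (11.9048, 0.3082, 59.6330), (12.8205, 0.3087, 60.5505),
    (13.7363, 0.3134, 59.4495), (14.6520, 0.3157, 60.0000),
    (15.5678, 0.3311, 61.1009), (16.4835, 0.3332, 59.2661),
    (17.3993, 0.3347, 59.6330),
]

# Two plausible criteria; with load_best_model_at_end the Trainer would use
# whichever single metric metric_for_best_model names.
best_by_loss = min(eval_log, key=lambda row: row[1])          # lowest validation loss
best_by_sentence_acc = max(eval_log, key=lambda row: row[2])  # highest sentence accuracy

print(best_by_loss)          # (3.663, 0.2546, 53.945)
print(best_by_sentence_acc)  # (15.5678, 0.3311, 61.1009)
```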

Framework versions

  • Transformers 4.57.3
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 0.2B params (tensor type F32, safetensors)
Model tree for ania3000/ossbert-morph-final