# roberta-base-anion.train.no.negation.true.irrelevant1e-06-64
This model is a fine-tuned version of roberta-base on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.4105
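The task, dataset, and label set are not documented in this card. Assuming the checkpoint carries a sequence-classification head (one plausible reading of the fine-tuning setup), a minimal loading sketch could look like the following; the repo id is a placeholder and the input text is illustrative only.

```python
# Minimal sketch, assuming a sequence-classification fine-tune of roberta-base.
# The repo id below is a placeholder for wherever this checkpoint is hosted.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "<user>/roberta-base-anion.train.no.negation.true.irrelevant1e-06-64"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

# Score a single example; the real input format depends on the (undocumented) dataset.
inputs = tokenizer("An example sentence to classify.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_id = logits.argmax(dim=-1).item()
print(predicted_id, model.config.id2label.get(predicted_id, "unknown label"))
```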
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a sketch expressing them as `TrainingArguments` follows the list):
- learning_rate: 1e-06
- train_batch_size: 256
- eval_batch_size: 1024
- seed: 42
- optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 30
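The Trainer and data-loading code are not part of this card; as a non-authoritative sketch, the hyperparameters above map onto `transformers.TrainingArguments` roughly as shown below. The output directory is a placeholder, and the per-device batch sizes assume a single device.

```python
# Sketch of the reported hyperparameters expressed as TrainingArguments.
# Assumes one device, so per_device_* equals the reported batch sizes;
# output_dir is a placeholder and the Trainer/dataset setup is not shown.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-base-anion-finetune",  # placeholder
    learning_rate=1e-06,
    per_device_train_batch_size=256,
    per_device_eval_batch_size=1024,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    eval_strategy="epoch",  # matches the per-epoch validation loss in the table below
)
```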
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.6734 | 1.0 | 358 | 0.5359 |
| 0.526 | 2.0 | 716 | 0.4950 |
| 0.4998 | 3.0 | 1074 | 0.4880 |
| 0.4835 | 4.0 | 1432 | 0.4690 |
| 0.4765 | 5.0 | 1790 | 0.4582 |
| 0.4641 | 6.0 | 2148 | 0.4507 |
| 0.455 | 7.0 | 2506 | 0.4445 |
| 0.4476 | 8.0 | 2864 | 0.4376 |
| 0.4428 | 9.0 | 3222 | 0.4328 |
| 0.4392 | 10.0 | 3580 | 0.4307 |
| 0.4363 | 11.0 | 3938 | 0.4264 |
| 0.4289 | 12.0 | 4296 | 0.4234 |
| 0.4284 | 13.0 | 4654 | 0.4237 |
| 0.4246 | 14.0 | 5012 | 0.4221 |
| 0.4221 | 15.0 | 5370 | 0.4190 |
| 0.418 | 16.0 | 5728 | 0.4177 |
| 0.4188 | 17.0 | 6086 | 0.4169 |
| 0.4169 | 18.0 | 6444 | 0.4149 |
| 0.4146 | 19.0 | 6802 | 0.4145 |
| 0.4157 | 20.0 | 7160 | 0.4136 |
| 0.4124 | 21.0 | 7518 | 0.4138 |
| 0.4096 | 22.0 | 7876 | 0.4129 |
| 0.4098 | 23.0 | 8234 | 0.4128 |
| 0.4099 | 24.0 | 8592 | 0.4121 |
| 0.4058 | 25.0 | 8950 | 0.4119 |
| 0.4074 | 26.0 | 9308 | 0.4118 |
| 0.4071 | 27.0 | 9666 | 0.4110 |
| 0.4085 | 28.0 | 10024 | 0.4107 |
| 0.4052 | 29.0 | 10382 | 0.4104 |
| 0.4063 | 30.0 | 10740 | 0.4105 |
### Framework versions
- Transformers 4.51.2
- Pytorch 2.6.0+cu124
- Datasets 3.5.0
- Tokenizers 0.21.1
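The versions above come straight from this card; a quick Python check to confirm a matching environment might look like this:

```python
# Quick environment check against the versions listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # 4.51.2 reported
print("PyTorch:", torch.__version__)              # 2.6.0+cu124 reported
print("Datasets:", datasets.__version__)          # 3.5.0 reported
print("Tokenizers:", tokenizers.__version__)      # 0.21.1 reported
```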