xlm-roberta-base

This model was trained from scratch on an unspecified dataset. At the end of training (epoch 49, step 490), it achieves the following results on the evaluation set:

  • Loss: 0.0754
  • Precision: 0.9493
  • Recall: 0.9493
  • F1: 0.9493
  • Accuracy: 0.9891
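
As a quick sanity check on how these figures relate: F1 is the harmonic mean of precision and recall, so identical precision and recall yield the same F1. The sketch below illustrates this with the reported values. The helper name `f1_score` is illustrative only and is not taken from the training code, and the zero-division convention (returning 0.0) is an assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero (assumed convention)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Final evaluation figures reported above: precision and recall are equal.
final_f1 = f1_score(0.9493, 0.9493)
print(round(final_f1, 4))
```

Because the reported precision and recall are themselves rounded to four decimals, recomputing F1 for other epochs may differ from the logged value in the last digit.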

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.3
  • num_epochs: 49
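
To make the schedule concrete, the following sketch traces the learning rate implied by a linear scheduler with a 0.3 warmup ratio. It is a simplified re-implementation for illustration, not the scheduler code used in training. Two assumptions are made: the total of 490 steps is inferred from the results table (10 steps per epoch over 49 epochs), and the warmup length is taken as the ratio of total steps rounded up.

```python
import math

BASE_LR = 1e-4       # learning_rate from the list above
WARMUP_RATIO = 0.3   # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 490    # assumption: 10 steps per epoch x 49 epochs

# Assumption: warmup length is the ratio of total steps, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 70, WARMUP_STEPS, 300, TOTAL_STEPS):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the warmup spans 147 steps, roughly the first 15 epochs, so the learning rate peaks at 0.0001 around epoch 15 and decays linearly to zero by the final step.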

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|---|
| No log | 1.0 | 10 | 2.1547 | 0.0187 | 0.0455 | 0.0265 | 0.5147 |
| No log | 2.0 | 20 | 1.1487 | 0.0 | 0.0 | 0.0 | 0.7473 |
| No log | 3.0 | 30 | 0.8744 | 0.0 | 0.0 | 0.0 | 0.7588 |
| No log | 4.0 | 40 | 0.5312 | 0.3256 | 0.3427 | 0.3339 | 0.8466 |
| No log | 5.0 | 50 | 0.3368 | 0.4750 | 0.4808 | 0.4778 | 0.9020 |
| No log | 6.0 | 60 | 0.1807 | 0.7834 | 0.8094 | 0.7962 | 0.9524 |
| No log | 7.0 | 70 | 0.1123 | 0.8493 | 0.8969 | 0.8724 | 0.9747 |
| No log | 8.0 | 80 | 0.0740 | 0.8783 | 0.9213 | 0.8993 | 0.9810 |
| No log | 9.0 | 90 | 0.0590 | 0.8971 | 0.9143 | 0.9056 | 0.9861 |
| No log | 10.0 | 100 | 0.0591 | 0.9210 | 0.9371 | 0.9289 | 0.9879 |
| No log | 11.0 | 110 | 0.0517 | 0.9099 | 0.9353 | 0.9224 | 0.9874 |
| No log | 12.0 | 120 | 0.0487 | 0.9201 | 0.9458 | 0.9328 | 0.9895 |
| No log | 13.0 | 130 | 0.0735 | 0.8835 | 0.9283 | 0.9054 | 0.9830 |
| No log | 14.0 | 140 | 0.0632 | 0.8986 | 0.9301 | 0.9141 | 0.9848 |
| No log | 15.0 | 150 | 0.0639 | 0.8764 | 0.9301 | 0.9025 | 0.9832 |
| No log | 16.0 | 160 | 0.0649 | 0.9210 | 0.9371 | 0.9289 | 0.9872 |
| No log | 17.0 | 170 | 0.0627 | 0.9024 | 0.9371 | 0.9194 | 0.9853 |
| No log | 18.0 | 180 | 0.0717 | 0.9177 | 0.9353 | 0.9264 | 0.9871 |
| No log | 19.0 | 190 | 0.0529 | 0.9348 | 0.9528 | 0.9437 | 0.9876 |
| No log | 20.0 | 200 | 0.0547 | 0.9330 | 0.9493 | 0.9411 | 0.9879 |
| No log | 21.0 | 210 | 0.0580 | 0.9426 | 0.9476 | 0.9451 | 0.9890 |
| No log | 22.0 | 220 | 0.0613 | 0.9715 | 0.9528 | 0.9620 | 0.9902 |
| No log | 23.0 | 230 | 0.0586 | 0.9511 | 0.9528 | 0.9520 | 0.9891 |
| No log | 24.0 | 240 | 0.0674 | 0.9543 | 0.9493 | 0.9518 | 0.9892 |
| No log | 25.0 | 250 | 0.0663 | 0.9362 | 0.9493 | 0.9427 | 0.9894 |
| No log | 26.0 | 260 | 0.0638 | 0.9445 | 0.9528 | 0.9487 | 0.9896 |
| No log | 27.0 | 270 | 0.0645 | 0.9482 | 0.9598 | 0.9540 | 0.9907 |
| No log | 28.0 | 280 | 0.0727 | 0.9510 | 0.9510 | 0.9510 | 0.9886 |
| No log | 29.0 | 290 | 0.0756 | 0.9462 | 0.9528 | 0.9495 | 0.9883 |
| No log | 30.0 | 300 | 0.0762 | 0.9378 | 0.9493 | 0.9435 | 0.9884 |
| No log | 31.0 | 310 | 0.0753 | 0.9443 | 0.9476 | 0.9459 | 0.9891 |
| No log | 32.0 | 320 | 0.0755 | 0.9459 | 0.9476 | 0.9467 | 0.9891 |
| No log | 33.0 | 330 | 0.0771 | 0.9443 | 0.9476 | 0.9459 | 0.9892 |
| No log | 34.0 | 340 | 0.0783 | 0.9459 | 0.9476 | 0.9467 | 0.9894 |
| No log | 35.0 | 350 | 0.0783 | 0.9476 | 0.9476 | 0.9476 | 0.9895 |
| No log | 36.0 | 360 | 0.0784 | 0.9476 | 0.9476 | 0.9476 | 0.9896 |
| No log | 37.0 | 370 | 0.0829 | 0.9420 | 0.9371 | 0.9395 | 0.9888 |
| No log | 38.0 | 380 | 0.0816 | 0.9457 | 0.9441 | 0.9449 | 0.9895 |
| No log | 39.0 | 390 | 0.0789 | 0.9443 | 0.9476 | 0.9459 | 0.9895 |
| No log | 40.0 | 400 | 0.0780 | 0.9461 | 0.9510 | 0.9486 | 0.9894 |
| No log | 41.0 | 410 | 0.0832 | 0.9473 | 0.9423 | 0.9448 | 0.9879 |
| No log | 42.0 | 420 | 0.0815 | 0.9457 | 0.9441 | 0.9449 | 0.9888 |
| No log | 43.0 | 430 | 0.0785 | 0.9493 | 0.9493 | 0.9493 | 0.9896 |
| No log | 44.0 | 440 | 0.0758 | 0.9494 | 0.9510 | 0.9502 | 0.9892 |
| No log | 45.0 | 450 | 0.0748 | 0.9460 | 0.9493 | 0.9476 | 0.9890 |
| No log | 46.0 | 460 | 0.0751 | 0.9493 | 0.9493 | 0.9493 | 0.9891 |
| No log | 47.0 | 470 | 0.0755 | 0.9476 | 0.9476 | 0.9476 | 0.9890 |
| No log | 48.0 | 480 | 0.0755 | 0.9493 | 0.9493 | 0.9493 | 0.9891 |
| No log | 49.0 | 490 | 0.0754 | 0.9493 | 0.9493 | 0.9493 | 0.9891 |
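
The results above show that the final epoch is not the best on every metric: validation loss is lowest at epoch 12, F1 peaks at epoch 22, and accuracy peaks at epoch 27. The sketch below shows one way to compare candidate checkpoints by metric. The rows are a hand-copied subset of the results, not the full log, and the tuple layout and variable names are illustrative choices.

```python
# Each row: (epoch, validation_loss, precision, recall, f1, accuracy).
# Hand-copied subset of the training results, not the full log.
ROWS = [
    (12, 0.0487, 0.9201, 0.9458, 0.9328, 0.9895),
    (22, 0.0613, 0.9715, 0.9528, 0.9620, 0.9902),
    (27, 0.0645, 0.9482, 0.9598, 0.9540, 0.9907),
    (49, 0.0754, 0.9493, 0.9493, 0.9493, 0.9891),
]

best_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss
best_f1 = max(ROWS, key=lambda row: row[4])    # highest F1
best_acc = max(ROWS, key=lambda row: row[5])   # highest accuracy

print("lowest validation loss: epoch", best_loss[0])
print("highest F1: epoch", best_f1[0])
print("highest accuracy: epoch", best_acc[0])
```

Which criterion to prefer depends on the downstream use; the rising validation loss after epoch 12 alongside broadly stable F1 suggests the later epochs add little.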

Framework versions

  • Transformers 4.57.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.1.1
  • Tokenizers 0.22.1
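
To check a local environment against these versions, a small sketch using only the standard library is given below. The distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI names, and `compare_versions` is a hypothetical helper. The installed PyTorch version string may or may not include the `+cu128` build suffix depending on how it was installed.

```python
from importlib import metadata

# Versions listed above; distribution names are assumed PyPI names.
EXPECTED = {
    "transformers": "4.57.0",
    "torch": "2.8.0+cu128",
    "datasets": "4.1.1",
    "tokenizers": "0.22.1",
}


def compare_versions(expected: dict) -> dict:
    """Map each package to (expected_version, installed_version or None)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions(EXPECTED).items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({status})")
```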

Model size: 0.3B params (Safetensors, tensor type F32)