deberta-v3-base_smcalflow-classifier_50

This model is a fine-tuned version of microsoft/deberta-v3-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0846
  • F1 Micro: 0.8924
  • F1 Macro: 0.3815
  • Exact Match: 0.3833

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.1
  • num_epochs: 50
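The learning-rate schedule implied by these settings can be sketched as below. This assumes `lr_scheduler_warmup_steps: 0.1` is actually a warmup *ratio* (10% of total steps), since 0.1 is not a whole step count, and uses total_steps = 656 steps/epoch × 50 epochs = 32800, matching the final step in the Training results table.

```python
def linear_schedule_lr(step, total_steps, peak_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at a given optimizer step under linear warmup/decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Then decay linearly from peak_lr back to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

print(linear_schedule_lr(3280, 32800))   # peak: 2e-05
print(linear_schedule_lr(32800, 32800))  # end of training: 0.0
```

Under this reading, the warmup phase covers the first 3280 steps (the first five epochs), after which the rate decays linearly to zero at step 32800.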

Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Micro | F1 Macro | Exact Match |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:--------:|:-----------:|
| 0.2620 | 1.0 | 656 | 0.2255 | 0.5700 | 0.0268 | 0.0 |
| 0.0876 | 2.0 | 1312 | 0.1057 | 0.5700 | 0.0268 | 0.0 |
| 0.0738 | 3.0 | 1968 | 0.0907 | 0.6249 | 0.0397 | 0.0 |
| 0.0548 | 4.0 | 2624 | 0.0759 | 0.7461 | 0.0722 | 0.0042 |
| 0.0402 | 5.0 | 3280 | 0.0658 | 0.7969 | 0.1136 | 0.0569 |
| 0.0314 | 6.0 | 3936 | 0.0674 | 0.8131 | 0.1392 | 0.0917 |
| 0.0274 | 7.0 | 4592 | 0.0663 | 0.8340 | 0.1615 | 0.1194 |
| 0.0230 | 8.0 | 5248 | 0.0660 | 0.8425 | 0.1865 | 0.1431 |
| 0.0192 | 9.0 | 5904 | 0.0732 | 0.8420 | 0.2003 | 0.1708 |
| 0.0181 | 10.0 | 6560 | 0.0667 | 0.8523 | 0.2148 | 0.1986 |
| 0.0163 | 11.0 | 7216 | 0.0706 | 0.8616 | 0.2200 | 0.2264 |
| 0.0155 | 12.0 | 7872 | 0.0696 | 0.8657 | 0.2406 | 0.2431 |
| 0.0138 | 13.0 | 8528 | 0.0779 | 0.8614 | 0.2512 | 0.2583 |
| 0.0130 | 14.0 | 9184 | 0.0691 | 0.8697 | 0.2571 | 0.2750 |
| 0.0112 | 15.0 | 9840 | 0.0693 | 0.8798 | 0.2689 | 0.2944 |
| 0.0107 | 16.0 | 10496 | 0.0748 | 0.8731 | 0.2831 | 0.2944 |
| 0.0102 | 17.0 | 11152 | 0.0717 | 0.8786 | 0.2864 | 0.3167 |
| 0.0091 | 18.0 | 11808 | 0.0707 | 0.8847 | 0.2928 | 0.3139 |
| 0.0089 | 19.0 | 12464 | 0.0722 | 0.8814 | 0.2916 | 0.3278 |
| 0.0085 | 20.0 | 13120 | 0.0765 | 0.8804 | 0.2954 | 0.3306 |
| 0.0081 | 21.0 | 13776 | 0.0700 | 0.8868 | 0.3119 | 0.3417 |
| 0.0069 | 22.0 | 14432 | 0.0725 | 0.8833 | 0.3007 | 0.3375 |
| 0.0073 | 23.0 | 15088 | 0.0751 | 0.8829 | 0.3250 | 0.3528 |
| 0.0069 | 24.0 | 15744 | 0.0769 | 0.8828 | 0.3200 | 0.3500 |
| 0.0060 | 25.0 | 16400 | 0.0742 | 0.8852 | 0.3234 | 0.3514 |
| 0.0064 | 26.0 | 17056 | 0.0742 | 0.8870 | 0.3264 | 0.3597 |
| 0.0057 | 27.0 | 17712 | 0.0772 | 0.8895 | 0.3370 | 0.3639 |
| 0.0055 | 28.0 | 18368 | 0.0761 | 0.8882 | 0.3374 | 0.3639 |
| 0.0054 | 29.0 | 19024 | 0.0808 | 0.8848 | 0.3358 | 0.3681 |
| 0.0048 | 30.0 | 19680 | 0.0824 | 0.8857 | 0.3470 | 0.3694 |
| 0.0047 | 31.0 | 20336 | 0.0832 | 0.8836 | 0.3453 | 0.3708 |
| 0.0049 | 32.0 | 20992 | 0.0851 | 0.8864 | 0.3410 | 0.3722 |
| 0.0043 | 33.0 | 21648 | 0.0841 | 0.8864 | 0.3611 | 0.3736 |
| 0.0049 | 34.0 | 22304 | 0.0832 | 0.8871 | 0.3574 | 0.3708 |
| 0.0041 | 35.0 | 22960 | 0.0791 | 0.8871 | 0.3658 | 0.3778 |
| 0.0038 | 36.0 | 23616 | 0.0835 | 0.8841 | 0.3623 | 0.3722 |
| 0.0037 | 37.0 | 24272 | 0.0877 | 0.8900 | 0.3622 | 0.3819 |
| 0.0038 | 38.0 | 24928 | 0.0839 | 0.8916 | 0.3636 | 0.3764 |
| 0.0035 | 39.0 | 25584 | 0.0843 | 0.8902 | 0.3779 | 0.3764 |
| 0.0036 | 40.0 | 26240 | 0.0833 | 0.8890 | 0.3694 | 0.3750 |
| 0.0031 | 41.0 | 26896 | 0.0837 | 0.8891 | 0.3751 | 0.3833 |
| 0.0030 | 42.0 | 27552 | 0.0870 | 0.8888 | 0.3839 | 0.3833 |
| 0.0030 | 43.0 | 28208 | 0.0860 | 0.8890 | 0.3792 | 0.3764 |
| 0.0030 | 44.0 | 28864 | 0.0844 | 0.8917 | 0.3809 | 0.3875 |
| 0.0030 | 45.0 | 29520 | 0.0846 | 0.8919 | 0.3811 | 0.3806 |
| 0.0029 | 46.0 | 30176 | 0.0851 | 0.8908 | 0.3749 | 0.3819 |
| 0.0028 | 47.0 | 30832 | 0.0857 | 0.8912 | 0.3774 | 0.3819 |
| 0.0026 | 48.0 | 31488 | 0.0859 | 0.8911 | 0.3783 | 0.3792 |
| 0.0025 | 49.0 | 32144 | 0.0856 | 0.8909 | 0.3803 | 0.3806 |
| 0.0025 | 50.0 | 32800 | 0.0846 | 0.8924 | 0.3815 | 0.3833 |

Framework versions

  • Transformers 5.2.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.5.0
  • Tokenizers 0.22.2