deberta-v3-base_smcalflow-classifier_50

This model is a fine-tuned version of microsoft/deberta-v3-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0846
  • F1 Micro: 0.8924
  • F1 Macro: 0.3815
  • Exact Match: 0.3833

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.1
  • num_epochs: 50
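The learning-rate schedule implied by these settings can be sketched as below. This assumes `lr_scheduler_warmup_steps: 0.1` is actually a warmup *ratio* (10% of total steps), since 0.1 is not a whole step count, and uses total_steps = 656 steps/epoch × 50 epochs = 32800, matching the final step in the Training results table.

```python
def linear_schedule_lr(step, total_steps, peak_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at a given optimizer step under linear warmup/decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Then decay linearly from peak_lr back to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

print(linear_schedule_lr(3280, 32800))   # peak: 2e-05
print(linear_schedule_lr(32800, 32800))  # end of training: 0.0
```

Under this reading, the warmup phase covers the first 3280 steps (the first five epochs), after which the rate decays linearly to zero at step 32800.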

Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Micro | F1 Macro | Exact Match |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:--------:|:-----------:|
| 0.2620 | 1.0 | 656 | 0.2255 | 0.5700 | 0.0268 | 0.0 |
| 0.0876 | 2.0 | 1312 | 0.1057 | 0.5700 | 0.0268 | 0.0 |
| 0.0738 | 3.0 | 1968 | 0.0907 | 0.6249 | 0.0397 | 0.0 |
| 0.0548 | 4.0 | 2624 | 0.0759 | 0.7461 | 0.0722 | 0.0042 |
| 0.0402 | 5.0 | 3280 | 0.0658 | 0.7969 | 0.1136 | 0.0569 |
| 0.0314 | 6.0 | 3936 | 0.0674 | 0.8131 | 0.1392 | 0.0917 |
| 0.0274 | 7.0 | 4592 | 0.0663 | 0.8340 | 0.1615 | 0.1194 |
| 0.0230 | 8.0 | 5248 | 0.0660 | 0.8425 | 0.1865 | 0.1431 |
| 0.0192 | 9.0 | 5904 | 0.0732 | 0.8420 | 0.2003 | 0.1708 |
| 0.0181 | 10.0 | 6560 | 0.0667 | 0.8523 | 0.2148 | 0.1986 |
| 0.0163 | 11.0 | 7216 | 0.0706 | 0.8616 | 0.2200 | 0.2264 |
| 0.0155 | 12.0 | 7872 | 0.0696 | 0.8657 | 0.2406 | 0.2431 |
| 0.0138 | 13.0 | 8528 | 0.0779 | 0.8614 | 0.2512 | 0.2583 |
| 0.0130 | 14.0 | 9184 | 0.0691 | 0.8697 | 0.2571 | 0.2750 |
| 0.0112 | 15.0 | 9840 | 0.0693 | 0.8798 | 0.2689 | 0.2944 |
| 0.0107 | 16.0 | 10496 | 0.0748 | 0.8731 | 0.2831 | 0.2944 |
| 0.0102 | 17.0 | 11152 | 0.0717 | 0.8786 | 0.2864 | 0.3167 |
| 0.0091 | 18.0 | 11808 | 0.0707 | 0.8847 | 0.2928 | 0.3139 |
| 0.0089 | 19.0 | 12464 | 0.0722 | 0.8814 | 0.2916 | 0.3278 |
| 0.0085 | 20.0 | 13120 | 0.0765 | 0.8804 | 0.2954 | 0.3306 |
| 0.0081 | 21.0 | 13776 | 0.0700 | 0.8868 | 0.3119 | 0.3417 |
| 0.0069 | 22.0 | 14432 | 0.0725 | 0.8833 | 0.3007 | 0.3375 |
| 0.0073 | 23.0 | 15088 | 0.0751 | 0.8829 | 0.3250 | 0.3528 |
| 0.0069 | 24.0 | 15744 | 0.0769 | 0.8828 | 0.3200 | 0.3500 |
| 0.0060 | 25.0 | 16400 | 0.0742 | 0.8852 | 0.3234 | 0.3514 |
| 0.0064 | 26.0 | 17056 | 0.0742 | 0.8870 | 0.3264 | 0.3597 |
| 0.0057 | 27.0 | 17712 | 0.0772 | 0.8895 | 0.3370 | 0.3639 |
| 0.0055 | 28.0 | 18368 | 0.0761 | 0.8882 | 0.3374 | 0.3639 |
| 0.0054 | 29.0 | 19024 | 0.0808 | 0.8848 | 0.3358 | 0.3681 |
| 0.0048 | 30.0 | 19680 | 0.0824 | 0.8857 | 0.3470 | 0.3694 |
| 0.0047 | 31.0 | 20336 | 0.0832 | 0.8836 | 0.3453 | 0.3708 |
| 0.0049 | 32.0 | 20992 | 0.0851 | 0.8864 | 0.3410 | 0.3722 |
| 0.0043 | 33.0 | 21648 | 0.0841 | 0.8864 | 0.3611 | 0.3736 |
| 0.0049 | 34.0 | 22304 | 0.0832 | 0.8871 | 0.3574 | 0.3708 |
| 0.0041 | 35.0 | 22960 | 0.0791 | 0.8871 | 0.3658 | 0.3778 |
| 0.0038 | 36.0 | 23616 | 0.0835 | 0.8841 | 0.3623 | 0.3722 |
| 0.0037 | 37.0 | 24272 | 0.0877 | 0.8900 | 0.3622 | 0.3819 |
| 0.0038 | 38.0 | 24928 | 0.0839 | 0.8916 | 0.3636 | 0.3764 |
| 0.0035 | 39.0 | 25584 | 0.0843 | 0.8902 | 0.3779 | 0.3764 |
| 0.0036 | 40.0 | 26240 | 0.0833 | 0.8890 | 0.3694 | 0.3750 |
| 0.0031 | 41.0 | 26896 | 0.0837 | 0.8891 | 0.3751 | 0.3833 |
| 0.0030 | 42.0 | 27552 | 0.0870 | 0.8888 | 0.3839 | 0.3833 |
| 0.0030 | 43.0 | 28208 | 0.0860 | 0.8890 | 0.3792 | 0.3764 |
| 0.0030 | 44.0 | 28864 | 0.0844 | 0.8917 | 0.3809 | 0.3875 |
| 0.0030 | 45.0 | 29520 | 0.0846 | 0.8919 | 0.3811 | 0.3806 |
| 0.0029 | 46.0 | 30176 | 0.0851 | 0.8908 | 0.3749 | 0.3819 |
| 0.0028 | 47.0 | 30832 | 0.0857 | 0.8912 | 0.3774 | 0.3819 |
| 0.0026 | 48.0 | 31488 | 0.0859 | 0.8911 | 0.3783 | 0.3792 |
| 0.0025 | 49.0 | 32144 | 0.0856 | 0.8909 | 0.3803 | 0.3806 |
| 0.0025 | 50.0 | 32800 | 0.0846 | 0.8924 | 0.3815 | 0.3833 |

Framework versions

  • Transformers 5.2.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.5.0
  • Tokenizers 0.22.2