calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0695

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
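Putting the hyperparameters above together (learning_rate 0.001, a linear scheduler, and 40 epochs at 5 optimizer steps per epoch, i.e. 200 total steps per the results table), the learning-rate schedule can be sketched in plain Python. This is a minimal illustration, not code from the training run; the function name and the zero-warmup assumption are mine:

```python
def linear_lr(step, base_lr=1e-3, total_steps=200, warmup_steps=0):
    """Linear LR schedule: optional warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at warmup end to 0 at total_steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# 40 epochs x 5 optimizer steps per epoch = 200 total steps
print(linear_lr(0))    # full base LR (0.001) at the start
print(linear_lr(100))  # halved at the midpoint
print(linear_lr(200))  # decayed to 0.0 at the end
```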

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.0686 | 1.0 | 5 | 2.3473 |
| 2.1497 | 2.0 | 10 | 1.8466 |
| 1.7093 | 3.0 | 15 | 1.4770 |
| 1.3516 | 4.0 | 20 | 1.1862 |
| 1.1256 | 5.0 | 25 | 1.0369 |
| 0.9968 | 6.0 | 30 | 0.9148 |
| 0.8740 | 7.0 | 35 | 0.7848 |
| 0.7769 | 8.0 | 40 | 0.7072 |
| 0.7493 | 9.0 | 45 | 0.6786 |
| 0.6867 | 10.0 | 50 | 0.6278 |
| 0.6252 | 11.0 | 55 | 0.5787 |
| 0.5738 | 12.0 | 60 | 0.5314 |
| 0.5339 | 13.0 | 65 | 0.4908 |
| 0.5060 | 14.0 | 70 | 0.4669 |
| 0.4701 | 15.0 | 75 | 0.4442 |
| 0.4516 | 16.0 | 80 | 0.4094 |
| 0.4198 | 17.0 | 85 | 0.3800 |
| 0.3867 | 18.0 | 90 | 0.3529 |
| 0.3603 | 19.0 | 95 | 0.3184 |
| 0.3273 | 20.0 | 100 | 0.2823 |
| 0.2989 | 21.0 | 105 | 0.2565 |
| 0.2673 | 22.0 | 110 | 0.2198 |
| 0.2490 | 23.0 | 115 | 0.2012 |
| 0.2227 | 24.0 | 120 | 0.1754 |
| 0.1998 | 25.0 | 125 | 0.1565 |
| 0.1802 | 26.0 | 130 | 0.1433 |
| 0.1650 | 27.0 | 135 | 0.1301 |
| 0.1544 | 28.0 | 140 | 0.1162 |
| 0.1430 | 29.0 | 145 | 0.1074 |
| 0.1336 | 30.0 | 150 | 0.0990 |
| 0.1267 | 31.0 | 155 | 0.0942 |
| 0.1185 | 32.0 | 160 | 0.0890 |
| 0.1142 | 33.0 | 165 | 0.0854 |
| 0.1076 | 34.0 | 170 | 0.0804 |
| 0.1030 | 35.0 | 175 | 0.0762 |
| 0.1002 | 36.0 | 180 | 0.0765 |
| 0.0973 | 37.0 | 185 | 0.0717 |
| 0.0969 | 38.0 | 190 | 0.0705 |
| 0.0951 | 39.0 | 195 | 0.0697 |
| 0.0933 | 40.0 | 200 | 0.0695 |
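As a quick summary of the results above, the overall convergence can be checked in a few lines of Python. The loss values are copied directly from the table; the reduction figure simply follows from them:

```python
# Validation loss at selected epochs, copied from the results table.
val_loss = {1: 2.3473, 10: 0.6278, 20: 0.2823, 30: 0.0990, 40: 0.0695}

# Relative reduction in validation loss from the first to the last epoch.
improvement = 1 - val_loss[40] / val_loss[1]
print(f"Validation loss fell by {improvement:.1%} over 40 epochs")  # ~97%
```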

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 7.8M parameters (F32, Safetensors)