calculator_model_test

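This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset (the card records the dataset as None). It reaches a validation loss of 0.6737 on the evaluation set after 40 epochs (step 200).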

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
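As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists no warmup setting) and takes the total of 200 steps from the results table (40 epochs at 5 steps each). The function name `linear_lr` and the constants are illustrative, not part of the training script.

```python
BASE_LR = 1e-3      # learning_rate from the card
TOTAL_STEPS = 200   # 40 epochs x 5 optimizer steps per epoch (from the results table)
WARMUP_STEPS = 0    # assumption: the card does not list a warmup setting


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate after `step` optimizer steps: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 50, 100, 150, 200):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 0.001, is halved by step 100 (epoch 20), and reaches zero at step 200. If the actual run used warmup, the early part of the curve would differ.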

Training Loss	Epoch	Step	Validation Loss
3.4736	1.0	5	2.8944
2.5336	2.0	10	2.0730
1.9570	3.0	15	1.7682
1.7493	4.0	20	1.6709
1.6370	5.0	25	1.5898
1.5553	6.0	30	1.5281
1.4848	7.0	35	1.4670
1.4499	8.0	40	1.4851
1.4249	9.0	45	1.3972
1.3834	10.0	50	1.3704
1.4422	11.0	55	1.3943
1.3136	12.0	60	1.2726
1.2540	13.0	65	1.2549
1.1964	14.0	70	1.2056
1.1786	15.0	75	1.1397
1.1210	16.0	80	1.1439
1.0828	17.0	85	1.0829
1.0435	18.0	90	1.0128
1.0068	19.0	95	0.9808
0.9622	20.0	100	0.9519
0.9592	21.0	105	0.9508
0.9224	22.0	110	0.9093
0.8814	23.0	115	0.8723
0.8571	24.0	120	0.8600
0.8514	25.0	125	0.8251
0.8304	26.0	130	0.8099
0.8099	27.0	135	0.7852
0.7926	28.0	140	0.7715
0.7839	29.0	145	0.7859
0.7683	30.0	150	0.7560
0.7481	31.0	155	0.7389
0.7361	32.0	160	0.7216
0.7343	33.0	165	0.7362
0.7220	34.0	170	0.7088
0.7105	35.0	175	0.7068
0.7003	36.0	180	0.6933
0.6920	37.0	185	0.6849
0.6885	38.0	190	0.6823
0.6859	39.0	195	0.6745
0.6811	40.0	200	0.6737
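The validation loss falls in most epochs but not all. A small sketch for checking this against the table: it lists the epochs where validation loss rose relative to the previous epoch and finds the epoch with the lowest loss. The values are copied from the table above; the helper names `regressions` and `best_epoch` are illustrative.

```python
# Validation loss per epoch (epochs 1..40), copied from the table above.
VAL_LOSS = [
    2.8944, 2.0730, 1.7682, 1.6709, 1.5898, 1.5281, 1.4670, 1.4851,
    1.3972, 1.3704, 1.3943, 1.2726, 1.2549, 1.2056, 1.1397, 1.1439,
    1.0829, 1.0128, 0.9808, 0.9519, 0.9508, 0.9093, 0.8723, 0.8600,
    0.8251, 0.8099, 0.7852, 0.7715, 0.7859, 0.7560, 0.7389, 0.7216,
    0.7362, 0.7088, 0.7068, 0.6933, 0.6849, 0.6823, 0.6745, 0.6737,
]


def regressions(losses):
    """Return the 1-based epochs whose validation loss is higher than the previous epoch's."""
    return [i + 1 for i in range(1, len(losses)) if losses[i] > losses[i - 1]]


def best_epoch(losses):
    """Return (epoch, loss) for the lowest validation loss, with epochs numbered from 1."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


if __name__ == "__main__":
    print(regressions(VAL_LOSS))  # [8, 11, 16, 29, 33]
    print(best_epoch(VAL_LOSS))   # (40, 0.6737)
```

The loss rose briefly at epochs 8, 11, 16, 29, and 33, and the lowest value is at the final epoch, so the table gives no sign that the loss had stopped improving by epoch 40.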

Safetensors

Model size: 7.8M params
Tensor type: F32
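For a sense of scale, the weight storage can be estimated from the two figures above: F32 uses 4 bytes per parameter. The sketch below is approximate, since 7.8M is a rounded count and the Safetensors file also carries a small header.

```python
PARAMS = 7.8e6        # rounded parameter count from the card
BYTES_PER_PARAM = 4   # F32 stores each parameter in 4 bytes

size_bytes = PARAMS * BYTES_PER_PARAM
size_mb = size_bytes / 1e6      # decimal megabytes
size_mib = size_bytes / 2**20   # binary mebibytes

print(f"{size_mb:.1f} MB, {size_mib:.1f} MiB")  # 31.2 MB, 29.8 MiB
```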
