vit-base-patch16-224-celeba-smiling

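This model is a fine-tuned version of google/vit-base-patch16-224. The training dataset is not recorded in this card. The summary of evaluation results is missing here; the training results table at the end of this card lists the validation loss and accuracy at each evaluation step, ending at a validation loss of 0.3184 and an accuracy of 0.9290.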

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 128
eval_batch_size: 64
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
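lr_scheduler_warmup_steps: 0.1 (a fractional value, so presumably a warmup ratio of 10% of the training steps rather than a step count)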
num_epochs: 10
mixed_precision_training: Native AMP
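
As a rough illustration of what a cosine schedule with warmup implies for the learning rate, the sketch below recomputes it in plain Python. It assumes that the fractional warmup value 0.1 means 10% of the run and that training lasted about 12,720 optimizer steps (10 epochs at roughly 1,272 steps per epoch, inferred from the step and epoch columns of the results table). Neither figure is stated explicitly in this card, and the function name `lr_at_step` is hypothetical. The formula follows the common linear-warmup plus half-cosine decay shape and is not taken from the actual training code.

```python
import math

BASE_LR = 5e-05                   # learning_rate from the list above
TOTAL_STEPS = 12_720              # assumption: 10 epochs x ~1272 steps per epoch
WARMUP_STEPS = TOTAL_STEPS // 10  # assumption: 0.1 is read as a ratio of total steps


def lr_at_step(step: int) -> float:
    """Learning rate after `step` optimizer updates (linear warmup, cosine decay)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to BASE_LR over the warmup phase.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0.
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 500, WARMUP_STEPS, 6_000, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {lr_at_step(step):.3e}")
```

Under these assumptions the learning rate peaks at 5e-05 once warmup ends and decays to zero by the last step, so the late epochs in the results table were trained with a very small learning rate.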

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.1881	0.3931	500	0.1765	0.9284
0.1848	0.7862	1000	0.1680	0.9285
0.1738	1.1792	1500	0.1732	0.9266
0.1729	1.5723	2000	0.1605	0.9338
0.1619	1.9654	2500	0.1623	0.9283
0.1574	2.3585	3000	0.1589	0.9334
0.1500	2.7516	3500	0.1635	0.9340
0.1338	3.1447	4000	0.1616	0.9334
0.1350	3.5377	4500	0.1910	0.9228
0.1370	3.9308	5000	0.1664	0.9280
0.1093	4.3239	5500	0.1803	0.9321
0.1076	4.7170	6000	0.1908	0.9312
0.0677	5.1101	6500	0.2124	0.9309
0.0732	5.5031	7000	0.2236	0.9263
0.0666	5.8962	7500	0.2175	0.9280
0.0430	6.2893	8000	0.2474	0.9268
0.0393	6.6824	8500	0.2578	0.9256
0.0266	7.0755	9000	0.2742	0.9280
0.0243	7.4686	9500	0.2939	0.9290
0.0234	7.8616	10000	0.2994	0.9282
0.0140	8.2547	10500	0.3040	0.9287
0.0137	8.6478	11000	0.3067	0.9288
0.0144	9.0409	11500	0.3179	0.9294
0.0128	9.4340	12000	0.3179	0.9292
0.0110	9.8270	12500	0.3184	0.9290
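
To make the trend in the table easier to read, the following sketch picks out the checkpoints with the lowest validation loss and the highest accuracy. The rows are copied from the table above; the variable names are illustrative only. Reading the numbers as overfitting after roughly epoch 3 is an interpretation, not something the training run itself reports.

```python
# (step, training_loss, validation_loss, accuracy), copied from the table above
ROWS = [
    (500, 0.1881, 0.1765, 0.9284),
    (1000, 0.1848, 0.1680, 0.9285),
    (1500, 0.1738, 0.1732, 0.9266),
    (2000, 0.1729, 0.1605, 0.9338),
    (2500, 0.1619, 0.1623, 0.9283),
    (3000, 0.1574, 0.1589, 0.9334),
    (3500, 0.1500, 0.1635, 0.9340),
    (4000, 0.1338, 0.1616, 0.9334),
    (4500, 0.1350, 0.1910, 0.9228),
    (5000, 0.1370, 0.1664, 0.9280),
    (5500, 0.1093, 0.1803, 0.9321),
    (6000, 0.1076, 0.1908, 0.9312),
    (6500, 0.0677, 0.2124, 0.9309),
    (7000, 0.0732, 0.2236, 0.9263),
    (7500, 0.0666, 0.2175, 0.9280),
    (8000, 0.0430, 0.2474, 0.9268),
    (8500, 0.0393, 0.2578, 0.9256),
    (9000, 0.0266, 0.2742, 0.9280),
    (9500, 0.0243, 0.2939, 0.9290),
    (10000, 0.0234, 0.2994, 0.9282),
    (10500, 0.0140, 0.3040, 0.9287),
    (11000, 0.0137, 0.3067, 0.9288),
    (11500, 0.0144, 0.3179, 0.9294),
    (12000, 0.0128, 0.3179, 0.9292),
    (12500, 0.0110, 0.3184, 0.9290),
]

# Checkpoint with the lowest validation loss.
best_loss = min(ROWS, key=lambda row: row[2])
# Checkpoint with the highest accuracy.
best_acc = max(ROWS, key=lambda row: row[3])
# Last evaluated checkpoint.
final = ROWS[-1]

if __name__ == "__main__":
    print(f"lowest validation loss: {best_loss[2]} at step {best_loss[0]}")
    print(f"highest accuracy:       {best_acc[3]} at step {best_acc[0]}")
    print(f"final checkpoint:       loss {final[2]}, accuracy {final[3]}")
```

On these rows, the lowest validation loss (0.1589) occurs at step 3000 and the highest accuracy (0.9340) at step 3500. The final checkpoint has a validation loss of 0.3184 and an accuracy of 0.9290, while the training loss keeps falling to 0.0110. Accuracy stays within about one percentage point throughout, so an earlier checkpoint may be preferable if well-calibrated probabilities matter.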