ssc-bas-mms-model-mix-adapt-max-lowlr

This model is a fine-tuned version of facebook/mms-1b-all. The training dataset is not specified. Final evaluation-set results are not listed; the per-checkpoint validation metrics appear in the training log below.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 8
eval_batch_size: 6
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 30
mixed_precision_training: Native AMP
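
The relationship between these values can be sketched in a few lines of plain Python. This is an illustrative sketch, not the training script: it assumes a single training device (consistent with 8 x 2 = 16) and roughly 237 optimizer steps per epoch, a figure estimated from the training log below (step 7000 falls at epoch 29.5370). The helper names `effective_batch_size` and `linear_lr` are hypothetical, and the schedule is the usual linear warmup followed by linear decay to zero.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 0.0003
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 100
NUM_EPOCHS = 30

# Assumptions, not stated in the card: one training device, and about
# 237 optimizer steps per epoch (estimated from the training log).
NUM_DEVICES = 1
STEPS_PER_EPOCH = 237
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def effective_batch_size(per_device: int, accumulation: int, devices: int) -> int:
    """Number of samples contributing to one optimizer step."""
    return per_device * accumulation * devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_batch = effective_batch_size(
    TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES
)
print("total_train_batch_size:", total_batch)
for s in (0, 50, 100, 3500, TOTAL_STEPS):
    print(f"step {s:>5}: lr = {linear_lr(s):.6f}")
```

Under these assumptions the learning rate peaks at step 100 and then shrinks steadily for the rest of the run, so late checkpoints are trained with a much smaller step size than early ones.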

Training Loss	Epoch	Step	Validation Loss	Cer	Wer
0.4007	0.8457	200	0.1491	0.1468	0.4598
0.2849	1.6892	400	0.1274	0.1415	0.4395
0.2780	2.5328	600	0.1249	0.1424	0.4395
0.2546	3.3763	800	0.1205	0.1401	0.4344
0.2516	4.2199	1000	0.1211	0.1421	0.4386
0.2338	5.0634	1200	0.1153	0.1394	0.4341
0.2492	5.9091	1400	0.1135	0.1386	0.4313
0.2269	6.7526	1600	0.1150	0.1397	0.4335
0.2033	7.5962	1800	0.1130	0.1386	0.4262
0.2114	8.4397	2000	0.1140	0.1393	0.4274
0.2022	9.2833	2200	0.1091	0.1378	0.4235
0.1901	10.1268	2400	0.1111	0.1369	0.4201
0.1983	10.9725	2600	0.1112	0.1377	0.4211
0.1829	11.8161	2800	0.1093	0.1377	0.4229
0.1831	12.6596	3000	0.1100	0.1365	0.4183
0.1726	13.5032	3200	0.1110	0.1367	0.4174
0.1721	14.3467	3400	0.1127	0.1371	0.4198
0.1686	15.1903	3600	0.1136	0.1361	0.4150
0.1768	16.0338	3800	0.1137	0.1374	0.4201
0.1573	16.8795	4000	0.1112	0.1363	0.4153
0.1631	17.7230	4200	0.1094	0.1355	0.4126
0.1507	18.5666	4400	0.1129	0.1363	0.4186
0.1537	19.4101	4600	0.1115	0.1369	0.4177
0.1579	20.2537	4800	0.1126	0.1360	0.4141
0.1430	21.0973	5000	0.1128	0.1371	0.4189
0.1446	21.9429	5200	0.1147	0.1369	0.4189
0.1369	22.7865	5400	0.1141	0.1378	0.4229
0.1438	23.6300	5600	0.1131	0.1363	0.4168
0.1402	24.4736	5800	0.1159	0.1369	0.4192
0.1328	25.3171	6000	0.1143	0.1357	0.4159
0.1318	26.1607	6200	0.1149	0.1356	0.4150
0.1409	27.0042	6400	0.1148	0.1359	0.4165
0.1321	27.8499	6600	0.1150	0.1359	0.4150
0.1295	28.6934	6800	0.1148	0.1362	0.4174
0.1311	29.5370	7000	0.1145	0.1361	0.4189
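
Because the last logged checkpoint is not necessarily the best one, it can help to scan the log for the minimum of each metric. The sketch below does that in plain Python; the step, validation loss, CER, and WER columns are copied from the table above, while the `EvalRow` type and `parse_rows` helper are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple

# Columns: step, validation loss, CER, WER (copied from the table above).
LOG = """
200 0.1491 0.1468 0.4598
400 0.1274 0.1415 0.4395
600 0.1249 0.1424 0.4395
800 0.1205 0.1401 0.4344
1000 0.1211 0.1421 0.4386
1200 0.1153 0.1394 0.4341
1400 0.1135 0.1386 0.4313
1600 0.1150 0.1397 0.4335
1800 0.1130 0.1386 0.4262
2000 0.1140 0.1393 0.4274
2200 0.1091 0.1378 0.4235
2400 0.1111 0.1369 0.4201
2600 0.1112 0.1377 0.4211
2800 0.1093 0.1377 0.4229
3000 0.1100 0.1365 0.4183
3200 0.1110 0.1367 0.4174
3400 0.1127 0.1371 0.4198
3600 0.1136 0.1361 0.4150
3800 0.1137 0.1374 0.4201
4000 0.1112 0.1363 0.4153
4200 0.1094 0.1355 0.4126
4400 0.1129 0.1363 0.4186
4600 0.1115 0.1369 0.4177
4800 0.1126 0.1360 0.4141
5000 0.1128 0.1371 0.4189
5200 0.1147 0.1369 0.4189
5400 0.1141 0.1378 0.4229
5600 0.1131 0.1363 0.4168
5800 0.1159 0.1369 0.4192
6000 0.1143 0.1357 0.4159
6200 0.1149 0.1356 0.4150
6400 0.1148 0.1359 0.4165
6600 0.1150 0.1359 0.4150
6800 0.1148 0.1362 0.4174
7000 0.1145 0.1361 0.4189
"""


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    cer: float
    wer: float


def parse_rows(text: str) -> List[EvalRow]:
    """Turn whitespace-separated log lines into typed rows."""
    rows = []
    for line in text.strip().splitlines():
        step, val_loss, cer, wer = line.split()
        rows.append(EvalRow(int(step), float(val_loss), float(cer), float(wer)))
    return rows


rows = parse_rows(LOG)
best_wer = min(rows, key=lambda r: r.wer)
best_cer = min(rows, key=lambda r: r.cer)
best_loss = min(rows, key=lambda r: r.val_loss)

print("lowest WER:            ", best_wer)
print("lowest CER:            ", best_cer)
print("lowest validation loss:", best_loss)
print("last logged checkpoint:", rows[-1])
```

Whether the published weights correspond to the last step or to one of these minima is not stated in the card, so the comparison is only a guide to how the metrics evolved during training.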

Safetensors

Model size: 1.0B params
Tensor type: F32
