
dense_eng_hom_100m_mult_reseg_ep20_goldfish

This model is a fine-tuned version of an unspecified base model on the arrow dataset. It achieves the following results on the evaluation set:

  • Loss: 4.8870
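
Assuming this loss is the usual token-level cross-entropy in nats (the default for Transformers causal language models; the card does not state this explicitly), it corresponds to a perplexity of roughly exp(4.8870) ≈ 132.6:

```python
import math

eval_loss = 4.8870  # reported evaluation loss
# Valid only if the loss is cross-entropy measured in nats.
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 132.6
```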

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-06; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1352
  • training_steps: 13525
  • mixed_precision_training: Native AMP
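
As a minimal sketch, these settings map onto the Transformers `TrainingArguments` roughly as follows; the `output_dir` is a placeholder, and the model, tokenizer, and dataset wiring are not part of the original card:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above.
args = TrainingArguments(
    output_dir="dense_eng_hom_100m_mult_reseg_ep20_goldfish",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=8,   # train_batch_size: 8
    per_device_eval_batch_size=32,   # eval_batch_size: 32
    seed=42,
    gradient_accumulation_steps=4,   # total train batch size: 8 * 4 = 32
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    warmup_steps=1352,
    max_steps=13525,
    fp16=True,                       # Native AMP mixed precision
)
```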

Training results

| Training Loss | Epoch   | Step  | Validation Loss |
|:-------------:|:-------:|:-----:|:---------------:|
| 7.1732        | 0.7394  | 500   | 6.2638          |
| 5.6194        | 1.4776  | 1000  | 5.4843          |
| 5.227         | 2.2159  | 1500  | 5.0583          |
| 4.841         | 2.9553  | 2000  | 4.7868          |
| 4.6153        | 3.6935  | 2500  | 4.6040          |
| 4.3649        | 4.4318  | 3000  | 4.4778          |
| 4.3025        | 5.1701  | 3500  | 4.3898          |
| 4.1416        | 5.9094  | 4000  | 4.3192          |
| 3.9741        | 6.6477  | 4500  | 4.2908          |
| 3.8149        | 7.3860  | 5000  | 4.2862          |
| 3.8348        | 8.1242  | 5500  | 4.2873          |
| 3.6977        | 8.8636  | 6000  | 4.2723          |
| 3.5328        | 9.6018  | 6500  | 4.3113          |
| 3.3815        | 10.3401 | 7000  | 4.3560          |
| 3.4349        | 11.0784 | 7500  | 4.3939          |
| 3.307         | 11.8177 | 8000  | 4.4125          |
| 3.1486        | 12.5560 | 8500  | 4.4781          |
| 3.0233        | 13.2943 | 9000  | 4.5435          |
| 3.0794        | 14.0325 | 9500  | 4.5834          |
| 2.9665        | 14.7719 | 10000 | 4.6196          |
| 2.8357        | 15.5102 | 10500 | 4.6898          |
| 2.7675        | 16.2484 | 11000 | 4.7474          |
| 2.7861        | 16.9878 | 11500 | 4.7618          |
| 2.7079        | 17.7261 | 12000 | 4.8156          |
| 2.6257        | 18.4643 | 12500 | 4.8569          |
| 2.6031        | 19.2026 | 13000 | 4.8812          |
| 2.5892        | 19.9420 | 13500 | 4.8869          |
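
Validation loss bottoms out at step 6000 (4.2723) and climbs steadily afterwards while training loss keeps falling, which suggests overfitting in the second half of training; the reported 4.8870 appears to come from the final checkpoint rather than the best one. If the best checkpoint is preferred, a minimal sketch of the relevant Trainer settings follows; the `output_dir`, step intervals, and patience value are assumptions, not from the card:

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Sketch: keep the lowest-eval-loss checkpoint instead of the last one.
args = TrainingArguments(
    output_dir="out",                  # placeholder
    eval_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=500,
    load_best_model_at_end=True,       # restore best checkpoint after training
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
early_stop = EarlyStoppingCallback(early_stopping_patience=3)
# Pass `callbacks=[early_stop]` to the Trainer alongside these arguments.
```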

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.1
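
A minimal loading sketch, assuming the checkpoint is a causal language model; the repo id below is inferred from the card title and should be replaced with the full `namespace/name` id or a local path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder id inferred from the card title; adjust as needed.
repo_id = "dense_eng_hom_100m_mult_reseg_ep20_goldfish"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```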
Model weights

  • Format: Safetensors
  • Model size: 0.2B params
  • Tensor type: F32