gpt2-multilingual-20-arabic-repair_3epochs

This model is a fine-tuned version of CausalNLP/gpt2-multilingual-20-arabic-repair_3epochs on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.2618
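
Since the reported loss is the mean token-level cross-entropy of a causal LM, it corresponds to a held-out perplexity of exp(3.2618) ≈ 26.1. A minimal sanity-check sketch (this derivation is not stated on the card itself):

```python
import math

# Evaluation loss reported on the card (mean cross-entropy, in nats)
eval_loss = 3.2618

# For a causal language model, perplexity is exp(mean cross-entropy loss)
perplexity = math.exp(eval_loss)
print(round(perplexity, 1))  # ≈ 26.1
```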

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: adamw_torch_fused with betas=(0.9, 0.95) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
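
These hyperparameters map directly onto a Hugging Face `TrainingArguments` configuration. A minimal sketch under that assumption (the `output_dir` is a placeholder, not taken from the card; the effective train batch size of 128 follows from 32 per device × 4 accumulation steps):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-multilingual-20-arabic-repair_3epochs",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,   # 32 x 4 = 128 effective train batch size
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=3,
)
```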

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 3.2301        | 0.0799 | 500   | 3.2896          |
| 3.226         | 0.1597 | 1000  | 3.2922          |
| 3.2259        | 0.2396 | 1500  | 3.2917          |
| 3.2351        | 0.3194 | 2000  | 3.2903          |
| 3.2705        | 0.3993 | 2500  | 3.2887          |
| 3.2262        | 0.4791 | 3000  | 3.2870          |
| 3.1945        | 0.5590 | 3500  | 3.2856          |
| 3.2247        | 0.6388 | 4000  | 3.2839          |
| 3.2335        | 0.7187 | 4500  | 3.2825          |
| 3.2268        | 0.7985 | 5000  | 3.2805          |
| 3.2022        | 0.8784 | 5500  | 3.2789          |
| 3.2204        | 0.9582 | 6000  | 3.2767          |
| 3.2528        | 1.0380 | 6500  | 3.2762          |
| 3.2054        | 1.1179 | 7000  | 3.2746          |
| 3.2314        | 1.1977 | 7500  | 3.2733          |
| 3.2201        | 1.2776 | 8000  | 3.2719          |
| 3.2078        | 1.3574 | 8500  | 3.2712          |
| 3.2028        | 1.4373 | 9000  | 3.2694          |
| 3.2164        | 1.5171 | 9500  | 3.2678          |
| 3.1794        | 1.5970 | 10000 | 3.2670          |
| 3.2168        | 1.6768 | 10500 | 3.2659          |
| 3.2993        | 1.7567 | 11000 | 3.2651          |
| 3.2207        | 1.8365 | 11500 | 3.2641          |
| 3.1918        | 1.9164 | 12000 | 3.2634          |
| 3.2324        | 1.9962 | 12500 | 3.2628          |
| 3.2103        | 2.0760 | 13000 | 3.2629          |
| 3.1576        | 2.1559 | 13500 | 3.2626          |
| 3.2315        | 2.2357 | 14000 | 3.2623          |
| 3.1858        | 2.3156 | 14500 | 3.2622          |
| 3.1828        | 2.3954 | 15000 | 3.2621          |
| 3.2102        | 2.4753 | 15500 | 3.2619          |
| 3.2049        | 2.5551 | 16000 | 3.2619          |
| 3.1641        | 2.6350 | 16500 | 3.2618          |
| 3.2541        | 2.7148 | 17000 | 3.2618          |
| 3.1553        | 2.7947 | 17500 | 3.2618          |
| 3.242         | 2.8746 | 18000 | 3.2618          |
| 3.1727        | 2.9544 | 18500 | 3.2618          |

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.4.2
  • Tokenizers 0.22.2

Model details

  • Model size: 0.2B params (Safetensors)
  • Tensor type: BF16