gpt2-multilingual-20-arabic-repair_3epochs

This model is a fine-tuned version of CausalNLP/gpt2-multilingual-20-arabic-repair_3epochs on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.2618
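
Since the reported loss is the mean token-level cross-entropy of a causal LM, it corresponds to a held-out perplexity of exp(3.2618) ≈ 26.1. A minimal sanity-check sketch (this derivation is not stated on the card itself):

```python
import math

# Evaluation loss reported on the card (mean cross-entropy, in nats)
eval_loss = 3.2618

# For a causal language model, perplexity is exp(mean cross-entropy loss)
perplexity = math.exp(eval_loss)
print(round(perplexity, 1))  # ≈ 26.1
```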

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: adamw_torch_fused with betas=(0.9, 0.95) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
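
These hyperparameters map directly onto a Hugging Face `TrainingArguments` configuration. A minimal sketch under that assumption (the `output_dir` is a placeholder, not taken from the card; the effective train batch size of 128 follows from 32 per device × 4 accumulation steps):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-multilingual-20-arabic-repair_3epochs",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,   # 32 x 4 = 128 effective train batch size
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.95,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=3,
)
```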

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 3.2301        | 0.0799 | 500   | 3.2896          |
| 3.226         | 0.1597 | 1000  | 3.2922          |
| 3.2259        | 0.2396 | 1500  | 3.2917          |
| 3.2351        | 0.3194 | 2000  | 3.2903          |
| 3.2705        | 0.3993 | 2500  | 3.2887          |
| 3.2262        | 0.4791 | 3000  | 3.2870          |
| 3.1945        | 0.5590 | 3500  | 3.2856          |
| 3.2247        | 0.6388 | 4000  | 3.2839          |
| 3.2335        | 0.7187 | 4500  | 3.2825          |
| 3.2268        | 0.7985 | 5000  | 3.2805          |
| 3.2022        | 0.8784 | 5500  | 3.2789          |
| 3.2204        | 0.9582 | 6000  | 3.2767          |
| 3.2528        | 1.0380 | 6500  | 3.2762          |
| 3.2054        | 1.1179 | 7000  | 3.2746          |
| 3.2314        | 1.1977 | 7500  | 3.2733          |
| 3.2201        | 1.2776 | 8000  | 3.2719          |
| 3.2078        | 1.3574 | 8500  | 3.2712          |
| 3.2028        | 1.4373 | 9000  | 3.2694          |
| 3.2164        | 1.5171 | 9500  | 3.2678          |
| 3.1794        | 1.5970 | 10000 | 3.2670          |
| 3.2168        | 1.6768 | 10500 | 3.2659          |
| 3.2993        | 1.7567 | 11000 | 3.2651          |
| 3.2207        | 1.8365 | 11500 | 3.2641          |
| 3.1918        | 1.9164 | 12000 | 3.2634          |
| 3.2324        | 1.9962 | 12500 | 3.2628          |
| 3.2103        | 2.0760 | 13000 | 3.2629          |
| 3.1576        | 2.1559 | 13500 | 3.2626          |
| 3.2315        | 2.2357 | 14000 | 3.2623          |
| 3.1858        | 2.3156 | 14500 | 3.2622          |
| 3.1828        | 2.3954 | 15000 | 3.2621          |
| 3.2102        | 2.4753 | 15500 | 3.2619          |
| 3.2049        | 2.5551 | 16000 | 3.2619          |
| 3.1641        | 2.6350 | 16500 | 3.2618          |
| 3.2541        | 2.7148 | 17000 | 3.2618          |
| 3.1553        | 2.7947 | 17500 | 3.2618          |
| 3.242         | 2.8746 | 18000 | 3.2618          |
| 3.1727        | 2.9544 | 18500 | 3.2618          |

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.4.2
  • Tokenizers 0.22.2

Model details

  • Model size: 0.2B params (Safetensors)
  • Tensor type: BF16