indot5-bloom-specialized-v2

This model is a fine-tuned version of hawalurahman/idt5-base-qaqg-v1.12-SQuAD-id on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.3419
  • Rouge1: 46.6077
  • Rouge2: 29.0152
  • Rougel: 45.797
  • Bleu: 21.0863

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 15
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1

Training results

Training Loss Epoch Step Validation Loss Rouge1 Rouge2 Rougel Bleu
No log 1.0 132 8.4473 20.9903 4.9013 20.4194 1.8331
No log 2.0 264 5.3636 29.8455 17.232 29.4428 9.3255
No log 3.0 396 4.3023 23.4656 14.3004 23.2456 4.0109
14.5449 4.0 528 3.9479 29.5155 18.9845 29.1627 12.8862
14.5449 5.0 660 3.7958 34.1475 20.532 33.4502 13.494
14.5449 6.0 792 3.6605 38.7089 23.7666 37.878 14.02
14.5449 7.0 924 3.5575 40.4795 24.2275 39.5735 17.7935
8.5044 8.0 1056 3.4889 41.9855 25.0628 40.9038 18.7032
8.5044 9.0 1188 3.4552 43.7492 26.1005 42.5527 19.4792
8.5044 10.0 1320 3.4128 44.3831 27.2557 43.3474 20.1699
8.5044 11.0 1452 3.3807 44.6715 27.0346 43.6125 20.3167
7.5391 12.0 1584 3.3643 45.3 27.9369 44.5341 20.5851
7.5391 13.0 1716 3.3506 45.705 28.2265 44.9036 20.6195
7.5391 14.0 1848 3.3440 46.5623 29.0465 45.7099 21.229
7.5391 15.0 1980 3.3419 46.6077 29.0152 45.797 21.0863

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.8.3
  • Tokenizers 0.22.2

πŸ“Š Quantitative Benchmarking (Evaluation Results)

The model was evaluated using a test set of 100 samples (50 for C1 and 50 for C2) from the Indo-Bloom Corpus.

Metric Score Baseline (Awalurahman et al., 2024) Status
BLEU-1 29.01 41.02 -
BLEU-4 15.0 14.25 SOTA πŸš€
ROUGE-L 40.99 54.39 Competitive

Notes: - The score of 15.0 in BLEU-4 outperforms the previous national SOTA record of 14.25.

  • This result confirms that pedagogical control (Bloom Taxonomy) improves the structural coherence of generated questions.
Downloads last month
369
Safetensors
Model size
0.2B params
Tensor type
F32
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for Firmansyah-Ibrahim/indot5-bloom-specialized-v2

Base model

muchad/idt5-base
Finetuned
(2)
this model