# indot5-bloom-specialized-v2
This model is a fine-tuned version of hawalurahman/idt5-base-qaqg-v1.12-SQuAD-id on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 3.3419
- Rouge1: 46.6077
- Rouge2: 29.0152
- Rougel: 45.797
- Bleu: 21.0863
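
Since the card does not yet document usage, here is a minimal inference sketch. It assumes the checkpoint loads as a standard T5-style seq2seq model via `AutoModelForSeq2SeqLM`; the exact input format used during fine-tuning (task prefix, Bloom-level tag, etc.) is not documented here, so the `context` string is only a placeholder.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "Firmansyah-Ibrahim/indot5-bloom-specialized-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input: the prompt format used during fine-tuning
# (task prefix, Bloom-level tag, etc.) is not documented in this card.
context = "Ibu kota Indonesia adalah Jakarta."
inputs = tokenizer(context, return_tensors="pt", truncation=True, max_length=512)
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```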
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 15
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
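
For reproducibility, these settings map onto a `Seq2SeqTrainingArguments` configuration roughly as sketched below; `output_dir`, `eval_strategy`, and `predict_with_generate` are assumptions not stated in the card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch reconstructing the hyperparameters listed above. output_dir,
# eval_strategy, and predict_with_generate are assumptions, not from the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="indot5-bloom-specialized-v2",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",     # AdamW, betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=15,
    fp16=True,                     # Native AMP mixed precision
    label_smoothing_factor=0.1,
    eval_strategy="epoch",         # the results table reports per-epoch eval
    predict_with_generate=True,    # required for ROUGE/BLEU at eval time
)
```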
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Bleu |
|---|---|---|---|---|---|---|---|
| No log | 1.0 | 132 | 8.4473 | 20.9903 | 4.9013 | 20.4194 | 1.8331 |
| No log | 2.0 | 264 | 5.3636 | 29.8455 | 17.232 | 29.4428 | 9.3255 |
| No log | 3.0 | 396 | 4.3023 | 23.4656 | 14.3004 | 23.2456 | 4.0109 |
| 14.5449 | 4.0 | 528 | 3.9479 | 29.5155 | 18.9845 | 29.1627 | 12.8862 |
| 14.5449 | 5.0 | 660 | 3.7958 | 34.1475 | 20.532 | 33.4502 | 13.494 |
| 14.5449 | 6.0 | 792 | 3.6605 | 38.7089 | 23.7666 | 37.878 | 14.02 |
| 14.5449 | 7.0 | 924 | 3.5575 | 40.4795 | 24.2275 | 39.5735 | 17.7935 |
| 8.5044 | 8.0 | 1056 | 3.4889 | 41.9855 | 25.0628 | 40.9038 | 18.7032 |
| 8.5044 | 9.0 | 1188 | 3.4552 | 43.7492 | 26.1005 | 42.5527 | 19.4792 |
| 8.5044 | 10.0 | 1320 | 3.4128 | 44.3831 | 27.2557 | 43.3474 | 20.1699 |
| 8.5044 | 11.0 | 1452 | 3.3807 | 44.6715 | 27.0346 | 43.6125 | 20.3167 |
| 7.5391 | 12.0 | 1584 | 3.3643 | 45.3 | 27.9369 | 44.5341 | 20.5851 |
| 7.5391 | 13.0 | 1716 | 3.3506 | 45.705 | 28.2265 | 44.9036 | 20.6195 |
| 7.5391 | 14.0 | 1848 | 3.3440 | 46.5623 | 29.0465 | 45.7099 | 21.229 |
| 7.5391 | 15.0 | 1980 | 3.3419 | 46.6077 | 29.0152 | 45.797 | 21.0863 |
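
The per-epoch Rouge1/Rouge2/Rougel/Bleu columns above can be produced by a `compute_metrics` hook passed to `Seq2SeqTrainer`. The sketch below uses the Hugging Face `evaluate` library and assumes generation-based evaluation; the exact metric implementation behind this table is not documented, so treat it as illustrative.

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Firmansyah-Ibrahim/indot5-bloom-specialized-v2")
rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_metrics(eval_preds):
    """Decode generated ids and score them; shaped for Seq2SeqTrainer."""
    preds, labels = eval_preds
    # Labels use -100 for padding; swap it back before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    r = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    b = bleu.compute(predictions=decoded_preds,
                     references=[[ref] for ref in decoded_labels])
    return {
        "rouge1": r["rouge1"] * 100,
        "rouge2": r["rouge2"] * 100,
        "rougeL": r["rougeL"] * 100,
        "bleu": b["bleu"] * 100,
    }
```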
### Framework versions
- Transformers 5.0.0
- Pytorch 2.10.0+cu128
- Datasets 4.8.3
- Tokenizers 0.22.2
## Quantitative Benchmarking (Evaluation Results)
The model was evaluated on a test set of 100 samples from the Indo-Bloom Corpus (50 for Bloom level C1 and 50 for level C2).
| Metric | Score | Baseline (Awalurahman et al., 2024) | Status |
|---|---|---|---|
| BLEU-1 | 29.01 | 41.02 | - |
| BLEU-4 | 15.0 | 14.25 | SOTA |
| ROUGE-L | 40.99 | 54.39 | Competitive |
Notes:
- The BLEU-4 score of 15.0 outperforms the previous national SOTA of 14.25 reported by Awalurahman et al. (2024).
- This result suggests that pedagogical control via Bloom's Taxonomy improves the structural coherence of the generated questions.
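
For reference, BLEU-1 and BLEU-4 differ only in the maximum n-gram order used for precision. A minimal scoring sketch with the `evaluate` library follows; the placeholder lists stand in for the 100 generated/reference question pairs from the C1/C2 test split, and the actual scoring script used for this benchmark is not documented here.

```python
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

# Placeholders: substitute the 100 generated/reference question pairs
# from the C1/C2 test split.
predictions = ["Apa ibu kota Indonesia?"]
references = [["Apa ibu kota Indonesia?"]]

bleu1 = bleu.compute(predictions=predictions, references=references, max_order=1)["bleu"]
bleu4 = bleu.compute(predictions=predictions, references=references, max_order=4)["bleu"]
rouge_l = rouge.compute(predictions=predictions,
                        references=[r[0] for r in references])["rougeL"]
print(f"BLEU-1: {bleu1 * 100:.2f}  BLEU-4: {bleu4 * 100:.2f}  ROUGE-L: {rouge_l * 100:.2f}")
```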
## Model tree

- Base model: muchad/idt5-base
- Fine-tuned from: hawalurahman/idt5-base-qaqg-v1.12-SQuAD-id