TinyStoriesV2_Llama-3.2-1B-q38bgt8z

This model is a fine-tuned version of meta-llama/Llama-3.2-1B. The training dataset is not documented here, though the model name suggests TinyStoriesV2. It achieves the following results on the evaluation set:

  • Loss: 2.3228
  • Model Preparation Time: 0.0027
  • Token Accuracy: 0.5027
  • Token Error Rate: 0.4973
  • Perplexity: inf
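
The perplexity of inf reported above is likely a numerical artifact of how the metric was accumulated: for a causal language model, perplexity is the exponential of the mean cross-entropy loss, so the reported loss implies a finite value of roughly 10.2. A quick sanity check using only the numbers listed above:

```python
import math

# Perplexity for a causal LM is exp(mean cross-entropy loss).
eval_loss = 2.3228
print(f"perplexity ~= {math.exp(eval_loss):.2f}")  # ~10.20, not inf

# Token error rate is simply the complement of token accuracy.
token_accuracy = 0.5027
token_error_rate = 1.0 - token_accuracy  # 0.4973, matching the value above
```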

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100
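
For reference, a minimal sketch of these hyperparameters expressed as transformers `TrainingArguments` (the output directory and logging/eval cadence are assumptions; anything not listed above keeps the library defaults for Transformers 4.51.3):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="TinyStoriesV2_Llama-3.2-1B-q38bgt8z",  # assumed name
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```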

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Token Accuracy | Token Error Rate | Perplexity |
|:-------------:|:-----:|:----:|:---------------:|:----------------------:|:--------------:|:----------------:|:----------:|
| No log        | 0     | 0    | 12.1341         | 0.0027                 | 0.0000         | 1.0000           | inf        |
| 3.8313        | 0.16  | 100  | 4.1267          | 0.0027                 | 0.2895         | 0.7105           | inf        |
| 3.4817        | 0.32  | 200  | 3.5024          | 0.0027                 | 0.3494         | 0.6506           | inf        |
| 3.0293        | 0.48  | 300  | 3.1821          | 0.0027                 | 0.3893         | 0.6107           | inf        |
| 2.9999        | 0.64  | 400  | 2.9671          | 0.0027                 | 0.4170         | 0.5830           | inf        |
| 2.8975        | 0.8   | 500  | 2.8111          | 0.0027                 | 0.4360         | 0.5640           | inf        |
| 2.6878        | 0.96  | 600  | 2.7002          | 0.0027                 | 0.4485         | 0.5515           | inf        |
| 2.2631        | 1.12  | 700  | 2.6067          | 0.0027                 | 0.4634         | 0.5366           | inf        |
| 2.3587        | 1.28  | 800  | 2.5508          | 0.0027                 | 0.4698         | 0.5302           | inf        |
| 2.2857        | 1.44  | 900  | 2.4960          | 0.0027                 | 0.4779         | 0.5221           | inf        |
| 2.4221        | 1.6   | 1000 | 2.4482          | 0.0027                 | 0.4835         | 0.5165           | inf        |
| 2.1476        | 1.76  | 1100 | 2.3961          | 0.0027                 | 0.4905         | 0.5095           | inf        |
| 2.3249        | 1.92  | 1200 | 2.3586          | 0.0027                 | 0.4943         | 0.5057           | inf        |
| 1.5277        | 2.08  | 1300 | 2.3577          | 0.0027                 | 0.5012         | 0.4988           | inf        |
| 1.5914        | 2.24  | 1400 | 2.3577          | 0.0027                 | 0.5001         | 0.4999           | inf        |
| 1.5934        | 2.4   | 1500 | 2.3522          | 0.0027                 | 0.5002         | 0.4998           | inf        |
| 1.6612        | 2.56  | 1600 | 2.3411          | 0.0027                 | 0.5013         | 0.4987           | inf        |
| 1.3713        | 2.72  | 1700 | 2.3314          | 0.0027                 | 0.5028         | 0.4972           | inf        |
| 1.6183        | 2.88  | 1800 | 2.3228          | 0.0027                 | 0.5027         | 0.4973           | inf        |
| 0.9434        | 3.04  | 1900 | 2.3908          | 0.0027                 | 0.5025         | 0.4975           | inf        |
| 0.9399        | 3.2   | 2000 | 2.4527          | 0.0027                 | 0.5003         | 0.4997           | inf        |
| 0.7391        | 3.36  | 2100 | 2.4972          | 0.0027                 | 0.4971         | 0.5029           | inf        |
| 0.8745        | 3.52  | 2200 | 2.5165          | 0.0027                 | 0.4957         | 0.5043           | inf        |
| 0.7738        | 3.68  | 2300 | 2.5191          | 0.0027                 | 0.4958         | 0.5042           | inf        |
| 0.8281        | 3.84  | 2400 | 2.5401          | 0.0027                 | 0.4942         | 0.5058           | inf        |
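
The best validation loss (2.3228 at epoch 2.88, step 1800) matches the headline evaluation results; from epoch 3 onward, validation loss rises while training loss keeps falling, which suggests the model begins to overfit. A minimal local usage sketch, assuming the checkpoint is publicly available on the Hub under the repo id shown in the model tree (the prompt is purely illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "donoway/TinyStoriesV2_Llama-3.2-1B-q38bgt8z"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# Generate a short continuation in the TinyStories register.
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```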

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1