TinyStoriesV2_Llama-3.2-1B-cqilj5xm

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset (the model name suggests TinyStoriesV2). It achieves the following results on the evaluation set:

  • Loss: 2.3244
  • Model Preparation Time: 0.0017
  • Token Accuracy: 0.5021
  • Token Error Rate: 0.4979
  • Perplexity: inf (see the note after this list)

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • optimizer: adamw_torch (AdamW via PyTorch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Token Accuracy | Token Error Rate | Perplexity |
|--------------:|------:|-----:|----------------:|-----------------------:|---------------:|-----------------:|-----------:|
| No log        | 0     | 0    | 12.1341         | 0.0017                 | 0.0000         | 1.0000           | inf        |
| 3.8329        | 0.16  | 100  | 4.1258          | 0.0017                 | 0.2893         | 0.7107           | inf        |
| 3.4807        | 0.32  | 200  | 3.5020          | 0.0017                 | 0.3494         | 0.6506           | inf        |
| 3.0304        | 0.48  | 300  | 3.1815          | 0.0017                 | 0.3897         | 0.6103           | inf        |
| 2.9981        | 0.64  | 400  | 2.9675          | 0.0017                 | 0.4173         | 0.5827           | inf        |
| 2.8975        | 0.8   | 500  | 2.8119          | 0.0017                 | 0.4357         | 0.5643           | inf        |
| 2.691         | 0.96  | 600  | 2.6998          | 0.0017                 | 0.4486         | 0.5514           | inf        |
| 2.2577        | 1.12  | 700  | 2.6060          | 0.0017                 | 0.4634         | 0.5366           | inf        |
| 2.3614        | 1.28  | 800  | 2.5507          | 0.0017                 | 0.4697         | 0.5303           | inf        |
| 2.2865        | 1.44  | 900  | 2.4960          | 0.0017                 | 0.4776         | 0.5224           | inf        |
| 2.4233        | 1.6   | 1000 | 2.4489          | 0.0017                 | 0.4835         | 0.5165           | inf        |
| 2.1458        | 1.76  | 1100 | 2.3967          | 0.0017                 | 0.4906         | 0.5094           | inf        |
| 2.3283        | 1.92  | 1200 | 2.3582          | 0.0017                 | 0.4943         | 0.5057           | inf        |
| 1.5291        | 2.08  | 1300 | 2.3555          | 0.0017                 | 0.5009         | 0.4991           | inf        |
| 1.5903        | 2.24  | 1400 | 2.3568          | 0.0017                 | 0.4996         | 0.5004           | inf        |
| 1.5963        | 2.4   | 1500 | 2.3527          | 0.0017                 | 0.5003         | 0.4997           | inf        |
| 1.6494        | 2.56  | 1600 | 2.3405          | 0.0017                 | 0.5015         | 0.4985           | inf        |
| 1.3717        | 2.72  | 1700 | 2.3306          | 0.0017                 | 0.5023         | 0.4977           | inf        |
| 1.619         | 2.88  | 1800 | 2.3244          | 0.0017                 | 0.5021         | 0.4979           | inf        |
| 0.9459        | 3.04  | 1900 | 2.3887          | 0.0017                 | 0.5036         | 0.4964           | inf        |
| 0.9456        | 3.2   | 2000 | 2.4541          | 0.0017                 | 0.4995         | 0.5005           | inf        |
| 0.734         | 3.36  | 2100 | 2.4907          | 0.0017                 | 0.4967         | 0.5033           | inf        |
| 0.8771        | 3.52  | 2200 | 2.5155          | 0.0017                 | 0.4954         | 0.5046           | inf        |
| 0.7737        | 3.68  | 2300 | 2.5191          | 0.0017                 | 0.4958         | 0.5042           | inf        |
| 0.8325        | 3.84  | 2400 | 2.5384          | 0.0017                 | 0.4938         | 0.5062           | inf        |

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1