TinyStoriesV2_Llama-3.2-1B-7whtiyy8

This model is a fine-tuned version of meta-llama/Llama-3.2-1B on an unspecified dataset (presumably TinyStoriesV2, given the model name). It achieves the following results on the evaluation set:

  • Loss: 2.3259
  • Model Preparation Time: 0.006 s
  • Token Accuracy: 0.5029
  • Token Error Rate: 0.4971
  • Perplexity: inf (as logged; see the note below)
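
The logged perplexity of inf is inconsistent with the finite evaluation loss: for a causal language model, perplexity is exp(loss), so the headline loss implies a perplexity of roughly 10.2. The inf is presumably a numerical overflow or logging artifact in the evaluation script. A quick check:

```python
import math

eval_loss = 2.3259  # headline evaluation loss reported above
print(math.exp(eval_loss))  # ≈ 10.24, the loss-implied perplexity
```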

Model description

A 1-billion-parameter causal language model fine-tuned from meta-llama/Llama-3.2-1B and published as donoway/TinyStoriesV2_Llama-3.2-1B-7whtiyy8; the weights are stored as BF16 safetensors. No further description has been provided.

Intended uses & limitations

More information needed
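
No usage notes were provided, but the checkpoint is a standard causal LM, so it should load with the usual transformers API. A minimal, untested sketch (the repo id comes from this page; the prompt and generation settings are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "donoway/TinyStoriesV2_Llama-3.2-1B-7whtiyy8"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The weights are stored in BF16, so load in that dtype.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# TinyStories-style models are typically prompted with a story opening.
prompt = "Once upon a time, there was a little"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```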

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.001
  • num_epochs: 100
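
For reference, these settings map onto transformers TrainingArguments roughly as below. This is a hedged reconstruction, not the author's published script (model and dataset wiring are omitted). Note that although num_epochs was set to 100, the log below stops near epoch 3.84, suggesting the run was halted early:

```python
from transformers import TrainingArguments

# Approximate reconstruction of the reported hyperparameters (assumption:
# a standard Trainer run; the actual training script is not published).
args = TrainingArguments(
    output_dir="TinyStoriesV2_Llama-3.2-1B-7whtiyy8",  # illustrative
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.001,
    num_train_epochs=100,
)
```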

Training results

| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time (s) | Token Accuracy | Token Error Rate | Perplexity |
|--------------:|------:|-----:|----------------:|---------------------------:|---------------:|-----------------:|-----------:|
| No log        | 0     | 0    | 12.1341         | 0.006                      | 0.0000         | 1.0000           | inf        |
| 3.8331        | 0.16  | 100  | 4.1262          | 0.006                      | 0.2894         | 0.7106           | inf        |
| 3.4808        | 0.32  | 200  | 3.5019          | 0.006                      | 0.3497         | 0.6503           | inf        |
| 3.0294        | 0.48  | 300  | 3.1814          | 0.006                      | 0.3898         | 0.6102           | inf        |
| 2.9986        | 0.64  | 400  | 2.9675          | 0.006                      | 0.4171         | 0.5829           | inf        |
| 2.8980        | 0.80  | 500  | 2.8123          | 0.006                      | 0.4359         | 0.5641           | inf        |
| 2.6878        | 0.96  | 600  | 2.6999          | 0.006                      | 0.4483         | 0.5517           | inf        |
| 2.2593        | 1.12  | 700  | 2.6065          | 0.006                      | 0.4630         | 0.5370           | inf        |
| 2.3620        | 1.28  | 800  | 2.5501          | 0.006                      | 0.4697         | 0.5303           | inf        |
| 2.2866        | 1.44  | 900  | 2.4962          | 0.006                      | 0.4781         | 0.5219           | inf        |
| 2.4233        | 1.60  | 1000 | 2.4508          | 0.006                      | 0.4830         | 0.5170           | inf        |
| 2.1492        | 1.76  | 1100 | 2.3965          | 0.006                      | 0.4908         | 0.5092           | inf        |
| 2.3272        | 1.92  | 1200 | 2.3582          | 0.006                      | 0.4940         | 0.5060           | inf        |
| 1.5257        | 2.08  | 1300 | 2.3568          | 0.006                      | 0.5007         | 0.4993           | inf        |
| 1.5882        | 2.24  | 1400 | 2.3598          | 0.006                      | 0.4999         | 0.5001           | inf        |
| 1.5947        | 2.40  | 1500 | 2.3523          | 0.006                      | 0.5006         | 0.4994           | inf        |
| 1.6552        | 2.56  | 1600 | 2.3391          | 0.006                      | 0.5016         | 0.4984           | inf        |
| 1.3803        | 2.72  | 1700 | 2.3301          | 0.006                      | 0.5025         | 0.4975           | inf        |
| 1.6149        | 2.88  | 1800 | 2.3259          | 0.006                      | 0.5029         | 0.4971           | inf        |
| 0.9457        | 3.04  | 1900 | 2.3909          | 0.006                      | 0.5030         | 0.4970           | inf        |
| 0.9481        | 3.20  | 2000 | 2.4530          | 0.006                      | 0.4993         | 0.5007           | inf        |
| 0.7384        | 3.36  | 2100 | 2.4962          | 0.006                      | 0.4976         | 0.5024           | inf        |
| 0.8767        | 3.52  | 2200 | 2.5152          | 0.006                      | 0.4955         | 0.5045           | inf        |
| 0.7595        | 3.68  | 2300 | 2.5172          | 0.006                      | 0.4961         | 0.5039           | inf        |
| 0.8285        | 3.84  | 2400 | 2.5394          | 0.006                      | 0.4944         | 0.5056           | inf        |
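
Validation loss bottoms out at step 1800 (epoch 2.88, loss 2.3259, matching the headline metrics above) and rises afterwards while training loss keeps falling, the usual signature of overfitting. With transformers, the standard remedy is evaluation-driven checkpointing plus early stopping; a hypothetical configuration sketch (values are illustrative, not from the original run):

```python
from transformers import EarlyStoppingCallback, TrainingArguments

args = TrainingArguments(
    output_dir="out",                   # illustrative
    eval_strategy="steps",              # evaluate on a fixed step cadence
    eval_steps=100,                     # matches the 100-step cadence above
    save_strategy="steps",
    save_steps=100,
    load_best_model_at_end=True,        # restore the lowest-eval_loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Passed to Trainer(..., callbacks=[stopper]); halts training after three
# consecutive evaluations without improvement in eval_loss.
stopper = EarlyStoppingCallback(early_stopping_patience=3)
```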

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1