diff --git "a/training.log" "b/training.log"
new file mode 100644
--- /dev/null
+++ "b/training.log"
@@ -0,0 +1,1847 @@
+(venv) leo@leo-mint:~/smallm/Supra-50M$ python3 train.py
+[*] Loading libraries...
+[*] Loading tokenizer...
+[*] Preparing 20,000,000,000 tokens (streaming, memmap-backed)...
+[=] Reusing existing token file: tokens.bin
+[+] Dataset ready: 19,531,250 chunks of 1024 tokens
+[*] Setting up model...
+[*] Model parameters: 51,786,240
+[*] Defining training arguments...
+[transformers] warmup_ratio is deprecated and will be removed in v5.2. Use `warmup_steps` instead.
+[*] Starting training...
+ 0%| | 0/152588 [00:00