(venv) leo@leo-mint:~/smallm/Supra-50M$ python3 train.py
[*] Loading libraries...
[*] Loading tokenizer...
[*] Preparing 20,000,000,000 tokens (streaming, memmap-backed)...
[=] Reusing existing token file: tokens.bin
[+] Dataset ready: 19,531,250 chunks of 1024 tokens
[*] Setting up model...
[*] Model parameters: 51,786,240
[*] Defining training arguments...
[transformers] warmup_ratio is deprecated and will be removed in v5.2. Use `warmup_steps` instead.
[*] Starting training...
0%| | 0/152588 [00:00
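
The numbers in the log above can be sanity-checked with a little arithmetic. The sketch below reproduces the chunk count from the token count and sequence length, and shows that the 152588-step total would be consistent with one pass over the data at an effective batch size of 128 sequences per optimizer step. That batch size is an inference from the step count, not something the log states, and the warmup ratio of 0.01 is purely a hypothetical value used to illustrate converting a ratio into the `warmup_steps` count that the deprecation warning asks for.

```python
import math

# Values taken directly from the log.
TOTAL_TOKENS = 20_000_000_000
SEQ_LEN = 1024
LOGGED_STEPS = 152_588

# ASSUMPTION: effective batch size (per-device batch * grad accumulation * devices).
# It is not printed in the log; 128 is simply the value that matches the step count
# for a single epoch.
EFFECTIVE_BATCH = 128

# Number of fixed-length chunks the dataset is split into.
chunks = TOTAL_TOKENS // SEQ_LEN

# Optimizer steps for one pass over the data (last partial batch counts as a step).
steps = math.ceil(chunks / EFFECTIVE_BATCH)

# Tokens consumed per optimizer step under the assumed batch size.
tokens_per_step = EFFECTIVE_BATCH * SEQ_LEN


def warmup_steps_from_ratio(ratio: float, total_steps: int) -> int:
    """Convert a warmup ratio into an absolute step count (rounded up)."""
    return math.ceil(ratio * total_steps)


# HYPOTHETICAL ratio; the value actually used by train.py is not shown in the log.
example_warmup = warmup_steps_from_ratio(0.01, steps)

print(f"chunks:          {chunks:,}")           # 19,531,250
print(f"steps per epoch: {steps:,}")            # 152,588
print(f"tokens per step: {tokens_per_step:,}")  # 131,072
print(f"warmup steps:    {example_warmup:,}")   # 1,526
```

If the real effective batch size or epoch count differs, the same step total could arise from another combination, so treat this as a consistency check rather than a readout of the actual configuration.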