[*] Loading libraries...
[*] Loading tokenizer...
[*] Preparing 5,000,000,000 tokens (streaming, memmap-backed)...
[=] Reusing existing token file: ./tokens.bin
[+] Dataset ready: 4,882,812 chunks of 1024 tokens
[*] Setting up model...
[*] Model parameters: 7,867,584
[*] Defining training arguments...
[transformers] warmup_ratio is deprecated and will be removed in v5.2. Use `warmup_steps` instead.
[*] Starting training...
0%| | 0/9538 [00:00