world-model / train.log
[14:35:51] Device: cuda
[14:35:51]
=== Training pong ([32, 64, 128]) ===
[14:35:51] 2,018,278 parameters
[14:35:51] Phase 1: 10 epochs single-step
[14:35:51] 8568 sequences
[14:36:00] P1 pong Epoch 1/10 | loss=0.14558
[14:36:08] P1 pong Epoch 2/10 | loss=0.10721
[14:36:17] P1 pong Epoch 3/10 | loss=0.09795
[14:36:25] P1 pong Epoch 4/10 | loss=0.08996
[14:36:33] P1 pong Epoch 5/10 | loss=0.08384
[14:36:41] P1 pong Epoch 6/10 | loss=0.07755
[14:36:49] P1 pong Epoch 7/10 | loss=0.06995
[14:36:57] P1 pong Epoch 8/10 | loss=0.06272
[14:37:05] P1 pong Epoch 9/10 | loss=0.05640
[14:37:13] P1 pong Epoch 10/10 | loss=0.05177
[14:37:13] Phase 2: 25 epochs graduated AR
[14:37:37] P2 pong Epoch 1/25 (steps=2) | loss=0.09787 lr=0.000500
[14:37:59] P2 pong Epoch 2/25 (steps=2) | loss=0.08854 lr=0.000500
[14:38:21] P2 pong Epoch 3/25 (steps=2) | loss=0.08343 lr=0.000500
[14:39:15] P2 pong Epoch 4/25 (steps=4) | loss=0.13928 lr=0.000500
[14:40:08] P2 pong Epoch 5/25 (steps=4) | loss=0.12631 lr=0.000500
[14:41:04] P2 pong Epoch 6/25 (steps=4) | loss=0.11644 lr=0.000500
[14:43:21] P2 pong Epoch 7/25 (steps=8) | loss=0.18012 lr=0.000500
[14:45:38] P2 pong Epoch 8/25 (steps=8) | loss=0.17484 lr=0.000500
[14:47:57] P2 pong Epoch 9/25 (steps=8) | loss=0.16717 lr=0.000500
[14:50:15] P2 pong Epoch 10/25 (steps=8) | loss=0.15650 lr=0.000500
[14:52:31] P2 pong Epoch 11/25 (steps=8) | loss=0.14624 lr=0.000500
[14:54:46] P2 pong Epoch 12/25 (steps=8) | loss=0.13932 lr=0.000500
[14:57:01] P2 pong Epoch 13/25 (steps=8) | loss=0.12899 lr=0.000493
[14:59:17] P2 pong Epoch 14/25 (steps=8) | loss=0.11960 lr=0.000471
[15:01:35] P2 pong Epoch 15/25 (steps=8) | loss=0.10872 lr=0.000437
[15:03:52] P2 pong Epoch 16/25 (steps=8) | loss=0.09965 lr=0.000392
[15:06:07] P2 pong Epoch 17/25 (steps=8) | loss=0.08785 lr=0.000339
[15:08:27] P2 pong Epoch 18/25 (steps=8) | loss=0.07890 lr=0.000280
[15:10:44] P2 pong Epoch 19/25 (steps=8) | loss=0.06718 lr=0.000220
[15:13:01] P2 pong Epoch 20/25 (steps=8) | loss=0.06123 lr=0.000161
[15:15:20] P2 pong Epoch 21/25 (steps=8) | loss=0.05374 lr=0.000108
[15:17:40] P2 pong Epoch 22/25 (steps=8) | loss=0.04863 lr=0.000063
[15:19:57] P2 pong Epoch 23/25 (steps=8) | loss=0.04435 lr=0.000029
[15:22:13] P2 pong Epoch 24/25 (steps=8) | loss=0.04174 lr=0.000010
[15:24:31] P2 pong Epoch 25/25 (steps=8) | loss=0.04022 lr=0.000010
[15:24:31] pong training complete.
[15:24:31]
=== Training sonic ([40, 80, 160]) ===
[15:24:31] 3,150,686 parameters
[15:24:31] Phase 1: 10 epochs single-step
[15:24:34] 32256 sequences
[15:25:03] P1 sonic Epoch 1/10 | loss=0.08400
[15:25:34] P1 sonic Epoch 2/10 | loss=0.06966
[15:26:03] P1 sonic Epoch 3/10 | loss=0.06589
[15:26:34] P1 sonic Epoch 4/10 | loss=0.06327
[15:27:03] P1 sonic Epoch 5/10 | loss=0.06111
[15:27:33] P1 sonic Epoch 6/10 | loss=0.05881
[15:28:03] P1 sonic Epoch 7/10 | loss=0.05682
[15:28:33] P1 sonic Epoch 8/10 | loss=0.05514
[15:29:02] P1 sonic Epoch 9/10 | loss=0.05358
[15:29:32] P1 sonic Epoch 10/10 | loss=0.05256
[15:29:32] Phase 2: 25 epochs graduated AR
[15:30:57] P2 sonic Epoch 1/25 (steps=2) | loss=0.07446 lr=0.000500
[15:32:15] P2 sonic Epoch 2/25 (steps=2) | loss=0.07291 lr=0.000500
[15:33:41] P2 sonic Epoch 3/25 (steps=2) | loss=0.07128 lr=0.000500
[15:37:15] P2 sonic Epoch 4/25 (steps=4) | loss=0.10220 lr=0.000500
[15:40:50] P2 sonic Epoch 5/25 (steps=4) | loss=0.09976 lr=0.000500
[15:44:24] P2 sonic Epoch 6/25 (steps=4) | loss=0.09779 lr=0.000500
[15:53:05] P2 sonic Epoch 7/25 (steps=8) | loss=0.14037 lr=0.000500
[16:01:41] P2 sonic Epoch 8/25 (steps=8) | loss=0.13753 lr=0.000500
[16:10:26] P2 sonic Epoch 9/25 (steps=8) | loss=0.13476 lr=0.000500
[16:19:08] P2 sonic Epoch 10/25 (steps=8) | loss=0.13232 lr=0.000500
[16:28:05] P2 sonic Epoch 11/25 (steps=8) | loss=0.13010 lr=0.000500
[16:37:18] P2 sonic Epoch 12/25 (steps=8) | loss=0.12790 lr=0.000500
[16:46:19] P2 sonic Epoch 13/25 (steps=8) | loss=0.12592 lr=0.000493
[16:55:21] P2 sonic Epoch 14/25 (steps=8) | loss=0.12408 lr=0.000471
[17:04:34] P2 sonic Epoch 15/25 (steps=8) | loss=0.12210 lr=0.000437
[17:13:54] P2 sonic Epoch 16/25 (steps=8) | loss=0.11900 lr=0.000392
[17:23:04] P2 sonic Epoch 17/25 (steps=8) | loss=0.11596 lr=0.000339
[17:32:08] P2 sonic Epoch 18/25 (steps=8) | loss=0.11287 lr=0.000280
[17:41:13] P2 sonic Epoch 19/25 (steps=8) | loss=0.10939 lr=0.000220
[17:50:18] P2 sonic Epoch 20/25 (steps=8) | loss=0.10548 lr=0.000161
[17:59:23] P2 sonic Epoch 21/25 (steps=8) | loss=0.10183 lr=0.000108
[18:08:26] P2 sonic Epoch 22/25 (steps=8) | loss=0.09841 lr=0.000063
[18:17:35] P2 sonic Epoch 23/25 (steps=8) | loss=0.09526 lr=0.000029
[18:26:41] P2 sonic Epoch 24/25 (steps=8) | loss=0.09337 lr=0.000010
[18:35:42] P2 sonic Epoch 25/25 (steps=8) | loss=0.09193 lr=0.000010
[18:35:42] sonic training complete.
[18:35:42]
=== Training pole_position ([24, 48, 96]) ===
[18:35:42] 1,137,006 parameters
[18:35:42] Phase 1: 10 epochs single-step
[18:35:42] 4284 sequences
[18:35:46] P1 pole_position Epoch 1/10 | loss=0.05831
[18:35:50] P1 pole_position Epoch 2/10 | loss=0.03691
[18:35:54] P1 pole_position Epoch 3/10 | loss=0.03064
[18:35:57] P1 pole_position Epoch 4/10 | loss=0.02707
[18:36:00] P1 pole_position Epoch 5/10 | loss=0.02428
[18:36:04] P1 pole_position Epoch 6/10 | loss=0.02271
[18:36:07] P1 pole_position Epoch 7/10 | loss=0.02128
[18:36:11] P1 pole_position Epoch 8/10 | loss=0.02013
[18:36:15] P1 pole_position Epoch 9/10 | loss=0.01936
[18:36:19] P1 pole_position Epoch 10/10 | loss=0.01879
[18:36:19] Phase 2: 25 epochs graduated AR
[18:36:31] P2 pole_position Epoch 1/25 (steps=2) | loss=0.02742 lr=0.000500
[18:36:42] P2 pole_position Epoch 2/25 (steps=2) | loss=0.02621 lr=0.000500
[18:36:54] P2 pole_position Epoch 3/25 (steps=2) | loss=0.02502 lr=0.000500
[18:37:22] P2 pole_position Epoch 4/25 (steps=4) | loss=0.03779 lr=0.000500
[18:37:51] P2 pole_position Epoch 5/25 (steps=4) | loss=0.03543 lr=0.000500
[18:38:19] P2 pole_position Epoch 6/25 (steps=4) | loss=0.03421 lr=0.000500
[18:39:31] P2 pole_position Epoch 7/25 (steps=8) | loss=0.05263 lr=0.000500
[18:40:42] P2 pole_position Epoch 8/25 (steps=8) | loss=0.05159 lr=0.000500
[18:41:53] P2 pole_position Epoch 9/25 (steps=8) | loss=0.04987 lr=0.000500
[18:43:05] P2 pole_position Epoch 10/25 (steps=8) | loss=0.04848 lr=0.000500
[18:44:17] P2 pole_position Epoch 11/25 (steps=8) | loss=0.04744 lr=0.000500
[18:45:30] P2 pole_position Epoch 12/25 (steps=8) | loss=0.04603 lr=0.000500
[18:46:42] P2 pole_position Epoch 13/25 (steps=8) | loss=0.04495 lr=0.000493
[18:47:54] P2 pole_position Epoch 14/25 (steps=8) | loss=0.04383 lr=0.000471
[18:49:05] P2 pole_position Epoch 15/25 (steps=8) | loss=0.04233 lr=0.000437
[18:50:18] P2 pole_position Epoch 16/25 (steps=8) | loss=0.04089 lr=0.000392
[18:51:30] P2 pole_position Epoch 17/25 (steps=8) | loss=0.03911 lr=0.000339
[18:52:43] P2 pole_position Epoch 18/25 (steps=8) | loss=0.03667 lr=0.000280
[18:53:55] P2 pole_position Epoch 19/25 (steps=8) | loss=0.03494 lr=0.000220
[18:55:06] P2 pole_position Epoch 20/25 (steps=8) | loss=0.03271 lr=0.000161
[18:56:18] P2 pole_position Epoch 21/25 (steps=8) | loss=0.03049 lr=0.000108
[18:57:31] P2 pole_position Epoch 22/25 (steps=8) | loss=0.02831 lr=0.000063
[18:58:44] P2 pole_position Epoch 23/25 (steps=8) | loss=0.02653 lr=0.000029
[18:59:58] P2 pole_position Epoch 24/25 (steps=8) | loss=0.02527 lr=0.000010
[19:01:11] P2 pole_position Epoch 25/25 (steps=8) | loss=0.02460 lr=0.000010
[19:01:11] pole_position training complete.
[19:01:11] Evaluating...
[19:02:25] Val SSIM=0.8626 | {'pong': 0.862, 'sonic': 0.7822, 'pole_position': 0.9435}
[19:02:25] Experiment dir: 12.7 MB
[19:02:25] Training complete.