| [14:35:51] Device: cuda |
| [14:35:51] |
| === Training pong ([32, 64, 128]) === |
| [14:35:51] 2,018,278 parameters |
| [14:35:51] Phase 1: 10 epochs single-step |
| [14:35:51] 8568 sequences |
| [14:36:00] P1 pong Epoch 1/10 | loss=0.14558 |
| [14:36:08] P1 pong Epoch 2/10 | loss=0.10721 |
| [14:36:17] P1 pong Epoch 3/10 | loss=0.09795 |
| [14:36:25] P1 pong Epoch 4/10 | loss=0.08996 |
| [14:36:33] P1 pong Epoch 5/10 | loss=0.08384 |
| [14:36:41] P1 pong Epoch 6/10 | loss=0.07755 |
| [14:36:49] P1 pong Epoch 7/10 | loss=0.06995 |
| [14:36:57] P1 pong Epoch 8/10 | loss=0.06272 |
| [14:37:05] P1 pong Epoch 9/10 | loss=0.05640 |
| [14:37:13] P1 pong Epoch 10/10 | loss=0.05177 |
| [14:37:13] Phase 2: 25 epochs graduated AR |
| [14:37:37] P2 pong Epoch 1/25 (steps=2) | loss=0.09787 lr=0.000500 |
| [14:37:59] P2 pong Epoch 2/25 (steps=2) | loss=0.08854 lr=0.000500 |
| [14:38:21] P2 pong Epoch 3/25 (steps=2) | loss=0.08343 lr=0.000500 |
| [14:39:15] P2 pong Epoch 4/25 (steps=4) | loss=0.13928 lr=0.000500 |
| [14:40:08] P2 pong Epoch 5/25 (steps=4) | loss=0.12631 lr=0.000500 |
| [14:41:04] P2 pong Epoch 6/25 (steps=4) | loss=0.11644 lr=0.000500 |
| [14:43:21] P2 pong Epoch 7/25 (steps=8) | loss=0.18012 lr=0.000500 |
| [14:45:38] P2 pong Epoch 8/25 (steps=8) | loss=0.17484 lr=0.000500 |
| [14:47:57] P2 pong Epoch 9/25 (steps=8) | loss=0.16717 lr=0.000500 |
| [14:50:15] P2 pong Epoch 10/25 (steps=8) | loss=0.15650 lr=0.000500 |
| [14:52:31] P2 pong Epoch 11/25 (steps=8) | loss=0.14624 lr=0.000500 |
| [14:54:46] P2 pong Epoch 12/25 (steps=8) | loss=0.13932 lr=0.000500 |
| [14:57:01] P2 pong Epoch 13/25 (steps=8) | loss=0.12899 lr=0.000493 |
| [14:59:17] P2 pong Epoch 14/25 (steps=8) | loss=0.11960 lr=0.000471 |
| [15:01:35] P2 pong Epoch 15/25 (steps=8) | loss=0.10872 lr=0.000437 |
| [15:03:52] P2 pong Epoch 16/25 (steps=8) | loss=0.09965 lr=0.000392 |
| [15:06:07] P2 pong Epoch 17/25 (steps=8) | loss=0.08785 lr=0.000339 |
| [15:08:27] P2 pong Epoch 18/25 (steps=8) | loss=0.07890 lr=0.000280 |
| [15:10:44] P2 pong Epoch 19/25 (steps=8) | loss=0.06718 lr=0.000220 |
| [15:13:01] P2 pong Epoch 20/25 (steps=8) | loss=0.06123 lr=0.000161 |
| [15:15:20] P2 pong Epoch 21/25 (steps=8) | loss=0.05374 lr=0.000108 |
| [15:17:40] P2 pong Epoch 22/25 (steps=8) | loss=0.04863 lr=0.000063 |
| [15:19:57] P2 pong Epoch 23/25 (steps=8) | loss=0.04435 lr=0.000029 |
| [15:22:13] P2 pong Epoch 24/25 (steps=8) | loss=0.04174 lr=0.000010 |
| [15:24:31] P2 pong Epoch 25/25 (steps=8) | loss=0.04022 lr=0.000010 |
| [15:24:31] pong training complete. |
| [15:24:31] |
| === Training sonic ([40, 80, 160]) === |
| [15:24:31] 3,150,686 parameters |
| [15:24:31] Phase 1: 10 epochs single-step |
| [15:24:34] 32256 sequences |
| [15:25:03] P1 sonic Epoch 1/10 | loss=0.08400 |
| [15:25:34] P1 sonic Epoch 2/10 | loss=0.06966 |
| [15:26:03] P1 sonic Epoch 3/10 | loss=0.06589 |
| [15:26:34] P1 sonic Epoch 4/10 | loss=0.06327 |
| [15:27:03] P1 sonic Epoch 5/10 | loss=0.06111 |
| [15:27:33] P1 sonic Epoch 6/10 | loss=0.05881 |
| [15:28:03] P1 sonic Epoch 7/10 | loss=0.05682 |
| [15:28:33] P1 sonic Epoch 8/10 | loss=0.05514 |
| [15:29:02] P1 sonic Epoch 9/10 | loss=0.05358 |
| [15:29:32] P1 sonic Epoch 10/10 | loss=0.05256 |
| [15:29:32] Phase 2: 25 epochs graduated AR |
| [15:30:57] P2 sonic Epoch 1/25 (steps=2) | loss=0.07446 lr=0.000500 |
| [15:32:15] P2 sonic Epoch 2/25 (steps=2) | loss=0.07291 lr=0.000500 |
| [15:33:41] P2 sonic Epoch 3/25 (steps=2) | loss=0.07128 lr=0.000500 |
| [15:37:15] P2 sonic Epoch 4/25 (steps=4) | loss=0.10220 lr=0.000500 |
| [15:40:50] P2 sonic Epoch 5/25 (steps=4) | loss=0.09976 lr=0.000500 |
| [15:44:24] P2 sonic Epoch 6/25 (steps=4) | loss=0.09779 lr=0.000500 |
| [15:53:05] P2 sonic Epoch 7/25 (steps=8) | loss=0.14037 lr=0.000500 |
| [16:01:41] P2 sonic Epoch 8/25 (steps=8) | loss=0.13753 lr=0.000500 |
| [16:10:26] P2 sonic Epoch 9/25 (steps=8) | loss=0.13476 lr=0.000500 |
| [16:19:08] P2 sonic Epoch 10/25 (steps=8) | loss=0.13232 lr=0.000500 |
| [16:28:05] P2 sonic Epoch 11/25 (steps=8) | loss=0.13010 lr=0.000500 |
| [16:37:18] P2 sonic Epoch 12/25 (steps=8) | loss=0.12790 lr=0.000500 |
| [16:46:19] P2 sonic Epoch 13/25 (steps=8) | loss=0.12592 lr=0.000493 |
| [16:55:21] P2 sonic Epoch 14/25 (steps=8) | loss=0.12408 lr=0.000471 |
| [17:04:34] P2 sonic Epoch 15/25 (steps=8) | loss=0.12210 lr=0.000437 |
| [17:13:54] P2 sonic Epoch 16/25 (steps=8) | loss=0.11900 lr=0.000392 |
| [17:23:04] P2 sonic Epoch 17/25 (steps=8) | loss=0.11596 lr=0.000339 |
| [17:32:08] P2 sonic Epoch 18/25 (steps=8) | loss=0.11287 lr=0.000280 |
| [17:41:13] P2 sonic Epoch 19/25 (steps=8) | loss=0.10939 lr=0.000220 |
| [17:50:18] P2 sonic Epoch 20/25 (steps=8) | loss=0.10548 lr=0.000161 |
| [17:59:23] P2 sonic Epoch 21/25 (steps=8) | loss=0.10183 lr=0.000108 |
| [18:08:26] P2 sonic Epoch 22/25 (steps=8) | loss=0.09841 lr=0.000063 |
| [18:17:35] P2 sonic Epoch 23/25 (steps=8) | loss=0.09526 lr=0.000029 |
| [18:26:41] P2 sonic Epoch 24/25 (steps=8) | loss=0.09337 lr=0.000010 |
| [18:35:42] P2 sonic Epoch 25/25 (steps=8) | loss=0.09193 lr=0.000010 |
| [18:35:42] sonic training complete. |
| [18:35:42] |
| === Training pole_position ([24, 48, 96]) === |
| [18:35:42] 1,137,006 parameters |
| [18:35:42] Phase 1: 10 epochs single-step |
| [18:35:42] 4284 sequences |
| [18:35:46] P1 pole_position Epoch 1/10 | loss=0.05831 |
| [18:35:50] P1 pole_position Epoch 2/10 | loss=0.03691 |
| [18:35:54] P1 pole_position Epoch 3/10 | loss=0.03064 |
| [18:35:57] P1 pole_position Epoch 4/10 | loss=0.02707 |
| [18:36:00] P1 pole_position Epoch 5/10 | loss=0.02428 |
| [18:36:04] P1 pole_position Epoch 6/10 | loss=0.02271 |
| [18:36:07] P1 pole_position Epoch 7/10 | loss=0.02128 |
| [18:36:11] P1 pole_position Epoch 8/10 | loss=0.02013 |
| [18:36:15] P1 pole_position Epoch 9/10 | loss=0.01936 |
| [18:36:19] P1 pole_position Epoch 10/10 | loss=0.01879 |
| [18:36:19] Phase 2: 25 epochs graduated AR |
| [18:36:31] P2 pole_position Epoch 1/25 (steps=2) | loss=0.02742 lr=0.000500 |
| [18:36:42] P2 pole_position Epoch 2/25 (steps=2) | loss=0.02621 lr=0.000500 |
| [18:36:54] P2 pole_position Epoch 3/25 (steps=2) | loss=0.02502 lr=0.000500 |
| [18:37:22] P2 pole_position Epoch 4/25 (steps=4) | loss=0.03779 lr=0.000500 |
| [18:37:51] P2 pole_position Epoch 5/25 (steps=4) | loss=0.03543 lr=0.000500 |
| [18:38:19] P2 pole_position Epoch 6/25 (steps=4) | loss=0.03421 lr=0.000500 |
| [18:39:31] P2 pole_position Epoch 7/25 (steps=8) | loss=0.05263 lr=0.000500 |
| [18:40:42] P2 pole_position Epoch 8/25 (steps=8) | loss=0.05159 lr=0.000500 |
| [18:41:53] P2 pole_position Epoch 9/25 (steps=8) | loss=0.04987 lr=0.000500 |
| [18:43:05] P2 pole_position Epoch 10/25 (steps=8) | loss=0.04848 lr=0.000500 |
| [18:44:17] P2 pole_position Epoch 11/25 (steps=8) | loss=0.04744 lr=0.000500 |
| [18:45:30] P2 pole_position Epoch 12/25 (steps=8) | loss=0.04603 lr=0.000500 |
| [18:46:42] P2 pole_position Epoch 13/25 (steps=8) | loss=0.04495 lr=0.000493 |
| [18:47:54] P2 pole_position Epoch 14/25 (steps=8) | loss=0.04383 lr=0.000471 |
| [18:49:05] P2 pole_position Epoch 15/25 (steps=8) | loss=0.04233 lr=0.000437 |
| [18:50:18] P2 pole_position Epoch 16/25 (steps=8) | loss=0.04089 lr=0.000392 |
| [18:51:30] P2 pole_position Epoch 17/25 (steps=8) | loss=0.03911 lr=0.000339 |
| [18:52:43] P2 pole_position Epoch 18/25 (steps=8) | loss=0.03667 lr=0.000280 |
| [18:53:55] P2 pole_position Epoch 19/25 (steps=8) | loss=0.03494 lr=0.000220 |
| [18:55:06] P2 pole_position Epoch 20/25 (steps=8) | loss=0.03271 lr=0.000161 |
| [18:56:18] P2 pole_position Epoch 21/25 (steps=8) | loss=0.03049 lr=0.000108 |
| [18:57:31] P2 pole_position Epoch 22/25 (steps=8) | loss=0.02831 lr=0.000063 |
| [18:58:44] P2 pole_position Epoch 23/25 (steps=8) | loss=0.02653 lr=0.000029 |
| [18:59:58] P2 pole_position Epoch 24/25 (steps=8) | loss=0.02527 lr=0.000010 |
| [19:01:11] P2 pole_position Epoch 25/25 (steps=8) | loss=0.02460 lr=0.000010 |
| [19:01:11] pole_position training complete. |
| [19:01:11] Evaluating... |
| [19:02:25] Val SSIM=0.8626 | {'pong': 0.862, 'sonic': 0.7822, 'pole_position': 0.9435} |
| [19:02:25] Experiment dir: 12.7 MB |
| [19:02:25] Training complete. |