[14:35:51] Device: cuda
[14:35:51]
=== Training pong ([32, 64, 128]) ===
[14:35:51] 2,018,278 parameters
[14:35:51] Phase 1: 10 epochs single-step
[14:35:51] 8568 sequences
[14:36:00] P1 pong Epoch 1/10 | loss=0.14558
[14:36:08] P1 pong Epoch 2/10 | loss=0.10721
[14:36:17] P1 pong Epoch 3/10 | loss=0.09795
[14:36:25] P1 pong Epoch 4/10 | loss=0.08996
[14:36:33] P1 pong Epoch 5/10 | loss=0.08384
[14:36:41] P1 pong Epoch 6/10 | loss=0.07755
[14:36:49] P1 pong Epoch 7/10 | loss=0.06995
[14:36:57] P1 pong Epoch 8/10 | loss=0.06272
[14:37:05] P1 pong Epoch 9/10 | loss=0.05640
[14:37:13] P1 pong Epoch 10/10 | loss=0.05177
[14:37:13] Phase 2: 25 epochs graduated AR
[14:37:37] P2 pong Epoch 1/25 (steps=2) | loss=0.09787 lr=0.000500
[14:37:59] P2 pong Epoch 2/25 (steps=2) | loss=0.08854 lr=0.000500
[14:38:21] P2 pong Epoch 3/25 (steps=2) | loss=0.08343 lr=0.000500
[14:39:15] P2 pong Epoch 4/25 (steps=4) | loss=0.13928 lr=0.000500
[14:40:08] P2 pong Epoch 5/25 (steps=4) | loss=0.12631 lr=0.000500
[14:41:04] P2 pong Epoch 6/25 (steps=4) | loss=0.11644 lr=0.000500
[14:43:21] P2 pong Epoch 7/25 (steps=8) | loss=0.18012 lr=0.000500
[14:45:38] P2 pong Epoch 8/25 (steps=8) | loss=0.17484 lr=0.000500
[14:47:57] P2 pong Epoch 9/25 (steps=8) | loss=0.16717 lr=0.000500
[14:50:15] P2 pong Epoch 10/25 (steps=8) | loss=0.15650 lr=0.000500
[14:52:31] P2 pong Epoch 11/25 (steps=8) | loss=0.14624 lr=0.000500
[14:54:46] P2 pong Epoch 12/25 (steps=8) | loss=0.13932 lr=0.000500
[14:57:01] P2 pong Epoch 13/25 (steps=8) | loss=0.12899 lr=0.000493
[14:59:17] P2 pong Epoch 14/25 (steps=8) | loss=0.11960 lr=0.000471
[15:01:35] P2 pong Epoch 15/25 (steps=8) | loss=0.10872 lr=0.000437
[15:03:52] P2 pong Epoch 16/25 (steps=8) | loss=0.09965 lr=0.000392
[15:06:07] P2 pong Epoch 17/25 (steps=8) | loss=0.08785 lr=0.000339
[15:08:27] P2 pong Epoch 18/25 (steps=8) | loss=0.07890 lr=0.000280
[15:10:44] P2 pong Epoch 19/25 (steps=8) | loss=0.06718 lr=0.000220
[15:13:01] P2 pong Epoch 20/25 (steps=8) | loss=0.06123 lr=0.000161
[15:15:20] P2 pong Epoch 21/25 (steps=8) | loss=0.05374 lr=0.000108
[15:17:40] P2 pong Epoch 22/25 (steps=8) | loss=0.04863 lr=0.000063
[15:19:57] P2 pong Epoch 23/25 (steps=8) | loss=0.04435 lr=0.000029
[15:22:13] P2 pong Epoch 24/25 (steps=8) | loss=0.04174 lr=0.000010
[15:24:31] P2 pong Epoch 25/25 (steps=8) | loss=0.04022 lr=0.000010
[15:24:31] pong training complete.
[15:24:31]
=== Training sonic ([40, 80, 160]) ===
[15:24:31] 3,150,686 parameters
[15:24:31] Phase 1: 10 epochs single-step
[15:24:34] 32256 sequences
[15:25:03] P1 sonic Epoch 1/10 | loss=0.08400
[15:25:34] P1 sonic Epoch 2/10 | loss=0.06966
[15:26:03] P1 sonic Epoch 3/10 | loss=0.06589
[15:26:34] P1 sonic Epoch 4/10 | loss=0.06327
[15:27:03] P1 sonic Epoch 5/10 | loss=0.06111
[15:27:33] P1 sonic Epoch 6/10 | loss=0.05881
[15:28:03] P1 sonic Epoch 7/10 | loss=0.05682
[15:28:33] P1 sonic Epoch 8/10 | loss=0.05514
[15:29:02] P1 sonic Epoch 9/10 | loss=0.05358
[15:29:32] P1 sonic Epoch 10/10 | loss=0.05256
[15:29:32] Phase 2: 25 epochs graduated AR
[15:30:57] P2 sonic Epoch 1/25 (steps=2) | loss=0.07446 lr=0.000500
[15:32:15] P2 sonic Epoch 2/25 (steps=2) | loss=0.07291 lr=0.000500
[15:33:41] P2 sonic Epoch 3/25 (steps=2) | loss=0.07128 lr=0.000500
[15:37:15] P2 sonic Epoch 4/25 (steps=4) | loss=0.10220 lr=0.000500
[15:40:50] P2 sonic Epoch 5/25 (steps=4) | loss=0.09976 lr=0.000500
[15:44:24] P2 sonic Epoch 6/25 (steps=4) | loss=0.09779 lr=0.000500
[15:53:05] P2 sonic Epoch 7/25 (steps=8) | loss=0.14037 lr=0.000500
[16:01:41] P2 sonic Epoch 8/25 (steps=8) | loss=0.13753 lr=0.000500
[16:10:26] P2 sonic Epoch 9/25 (steps=8) | loss=0.13476 lr=0.000500
[16:19:08] P2 sonic Epoch 10/25 (steps=8) | loss=0.13232 lr=0.000500
[16:28:05] P2 sonic Epoch 11/25 (steps=8) | loss=0.13010 lr=0.000500
[16:37:18] P2 sonic Epoch 12/25 (steps=8) | loss=0.12790 lr=0.000500
[16:46:19] P2 sonic Epoch 13/25 (steps=8) | loss=0.12592 lr=0.000493
[16:55:21] P2 sonic Epoch 14/25 (steps=8) | loss=0.12408 lr=0.000471
[17:04:34] P2 sonic Epoch 15/25 (steps=8) | loss=0.12210 lr=0.000437
[17:13:54] P2 sonic Epoch 16/25 (steps=8) | loss=0.11900 lr=0.000392
[17:23:04] P2 sonic Epoch 17/25 (steps=8) | loss=0.11596 lr=0.000339
[17:32:08] P2 sonic Epoch 18/25 (steps=8) | loss=0.11287 lr=0.000280
[17:41:13] P2 sonic Epoch 19/25 (steps=8) | loss=0.10939 lr=0.000220
[17:50:18] P2 sonic Epoch 20/25 (steps=8) | loss=0.10548 lr=0.000161
[17:59:23] P2 sonic Epoch 21/25 (steps=8) | loss=0.10183 lr=0.000108
[18:08:26] P2 sonic Epoch 22/25 (steps=8) | loss=0.09841 lr=0.000063
[18:17:35] P2 sonic Epoch 23/25 (steps=8) | loss=0.09526 lr=0.000029
[18:26:41] P2 sonic Epoch 24/25 (steps=8) | loss=0.09337 lr=0.000010
[18:35:42] P2 sonic Epoch 25/25 (steps=8) | loss=0.09193 lr=0.000010
[18:35:42] sonic training complete.
[18:35:42]
=== Training pole_position ([24, 48, 96]) ===
[18:35:42] 1,137,006 parameters
[18:35:42] Phase 1: 10 epochs single-step
[18:35:42] 4284 sequences
[18:35:46] P1 pole_position Epoch 1/10 | loss=0.05831
[18:35:50] P1 pole_position Epoch 2/10 | loss=0.03691
[18:35:54] P1 pole_position Epoch 3/10 | loss=0.03064
[18:35:57] P1 pole_position Epoch 4/10 | loss=0.02707
[18:36:00] P1 pole_position Epoch 5/10 | loss=0.02428
[18:36:04] P1 pole_position Epoch 6/10 | loss=0.02271
[18:36:07] P1 pole_position Epoch 7/10 | loss=0.02128
[18:36:11] P1 pole_position Epoch 8/10 | loss=0.02013
[18:36:15] P1 pole_position Epoch 9/10 | loss=0.01936
[18:36:19] P1 pole_position Epoch 10/10 | loss=0.01879
[18:36:19] Phase 2: 25 epochs graduated AR
[18:36:31] P2 pole_position Epoch 1/25 (steps=2) | loss=0.02742 lr=0.000500
[18:36:42] P2 pole_position Epoch 2/25 (steps=2) | loss=0.02621 lr=0.000500
[18:36:54] P2 pole_position Epoch 3/25 (steps=2) | loss=0.02502 lr=0.000500
[18:37:22] P2 pole_position Epoch 4/25 (steps=4) | loss=0.03779 lr=0.000500
[18:37:51] P2 pole_position Epoch 5/25 (steps=4) | loss=0.03543 lr=0.000500
[18:38:19] P2 pole_position Epoch 6/25 (steps=4) | loss=0.03421 lr=0.000500
[18:39:31] P2 pole_position Epoch 7/25 (steps=8) | loss=0.05263 lr=0.000500
[18:40:42] P2 pole_position Epoch 8/25 (steps=8) | loss=0.05159 lr=0.000500
[18:41:53] P2 pole_position Epoch 9/25 (steps=8) | loss=0.04987 lr=0.000500
[18:43:05] P2 pole_position Epoch 10/25 (steps=8) | loss=0.04848 lr=0.000500
[18:44:17] P2 pole_position Epoch 11/25 (steps=8) | loss=0.04744 lr=0.000500
[18:45:30] P2 pole_position Epoch 12/25 (steps=8) | loss=0.04603 lr=0.000500
[18:46:42] P2 pole_position Epoch 13/25 (steps=8) | loss=0.04495 lr=0.000493
[18:47:54] P2 pole_position Epoch 14/25 (steps=8) | loss=0.04383 lr=0.000471
[18:49:05] P2 pole_position Epoch 15/25 (steps=8) | loss=0.04233 lr=0.000437
[18:50:18] P2 pole_position Epoch 16/25 (steps=8) | loss=0.04089 lr=0.000392
[18:51:30] P2 pole_position Epoch 17/25 (steps=8) | loss=0.03911 lr=0.000339
[18:52:43] P2 pole_position Epoch 18/25 (steps=8) | loss=0.03667 lr=0.000280
[18:53:55] P2 pole_position Epoch 19/25 (steps=8) | loss=0.03494 lr=0.000220
[18:55:06] P2 pole_position Epoch 20/25 (steps=8) | loss=0.03271 lr=0.000161
[18:56:18] P2 pole_position Epoch 21/25 (steps=8) | loss=0.03049 lr=0.000108
[18:57:31] P2 pole_position Epoch 22/25 (steps=8) | loss=0.02831 lr=0.000063
[18:58:44] P2 pole_position Epoch 23/25 (steps=8) | loss=0.02653 lr=0.000029
[18:59:58] P2 pole_position Epoch 24/25 (steps=8) | loss=0.02527 lr=0.000010
[19:01:11] P2 pole_position Epoch 25/25 (steps=8) | loss=0.02460 lr=0.000010
[19:01:11] pole_position training complete.
[19:01:11] Evaluating...
[19:02:25] Val SSIM=0.8626 | {'pong': 0.862, 'sonic': 0.7822, 'pole_position': 0.9435}
[19:02:25] Experiment dir: 12.7 MB
[19:02:25] Training complete.