appo-atari-assault / sf_log.txt
[2023-09-22 11:51:26,193][36967] Saving configuration to ./train_atari/Assault/config.json...
[2023-09-22 11:51:26,460][36967] Rollout worker 0 uses device cpu
[2023-09-22 11:51:26,460][36967] Rollout worker 1 uses device cpu
[2023-09-22 11:51:26,461][36967] Rollout worker 2 uses device cpu
[2023-09-22 11:51:26,462][36967] Rollout worker 3 uses device cpu
[2023-09-22 11:51:26,462][36967] Rollout worker 4 uses device cpu
[2023-09-22 11:51:26,463][36967] Rollout worker 5 uses device cpu
[2023-09-22 11:51:26,463][36967] Rollout worker 6 uses device cpu
[2023-09-22 11:51:26,464][36967] Rollout worker 7 uses device cpu
[2023-09-22 11:51:26,464][36967] In synchronous mode, we only accumulate one batch. Setting num_batches_to_accumulate to 1
[2023-09-22 11:51:26,512][36967] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 11:51:26,512][36967] InferenceWorker_p0-w0: min num requests: 1
[2023-09-22 11:51:26,515][36967] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-22 11:51:26,515][36967] InferenceWorker_p1-w0: min num requests: 1
[2023-09-22 11:51:26,539][36967] Starting all processes...
[2023-09-22 11:51:26,539][36967] Starting process learner_proc0
[2023-09-22 11:51:28,172][36967] Starting process learner_proc1
[2023-09-22 11:51:28,175][37819] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 11:51:28,175][37819] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2023-09-22 11:51:28,215][37819] Num visible devices: 1
[2023-09-22 11:51:28,273][37819] Starting seed is not provided
[2023-09-22 11:51:28,274][37819] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 11:51:28,274][37819] Initializing actor-critic model on device cuda:0
[2023-09-22 11:51:28,275][37819] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 11:51:28,276][37819] RunningMeanStd input shape: (1,)
[2023-09-22 11:51:28,296][37819] ConvEncoder: input_channels=4
[2023-09-22 11:51:28,458][37819] Conv encoder output size: 512
[2023-09-22 11:51:28,460][37819] Created Actor Critic model with architecture:
[2023-09-22 11:51:28,460][37819] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=7, bias=True)
  )
)
[2023-09-22 11:51:29,042][37819] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-22 11:51:29,043][37819] No checkpoints found
[2023-09-22 11:51:29,043][37819] Did not load from checkpoint, starting from scratch!
[2023-09-22 11:51:29,043][37819] Initialized policy 0 weights for model version 0
[2023-09-22 11:51:29,044][37819] LearnerWorker_p0 finished initialization!
[2023-09-22 11:51:29,045][37819] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 11:51:29,874][36967] Starting all processes...
[2023-09-22 11:51:29,878][37891] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-22 11:51:29,878][37891] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for learning process 1
[2023-09-22 11:51:29,881][36967] Starting process inference_proc0-0
[2023-09-22 11:51:29,881][36967] Starting process inference_proc1-0
[2023-09-22 11:51:29,882][36967] Starting process rollout_proc0
[2023-09-22 11:51:29,882][36967] Starting process rollout_proc1
[2023-09-22 11:51:29,895][37891] Num visible devices: 1
[2023-09-22 11:51:29,882][36967] Starting process rollout_proc2
[2023-09-22 11:51:29,883][36967] Starting process rollout_proc3
[2023-09-22 11:51:29,886][36967] Starting process rollout_proc4
[2023-09-22 11:51:29,887][36967] Starting process rollout_proc5
[2023-09-22 11:51:29,890][36967] Starting process rollout_proc6
[2023-09-22 11:51:29,891][36967] Starting process rollout_proc7
[2023-09-22 11:51:29,945][37891] Starting seed is not provided
[2023-09-22 11:51:29,945][37891] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-09-22 11:51:29,945][37891] Initializing actor-critic model on device cuda:0
[2023-09-22 11:51:29,946][37891] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 11:51:29,946][37891] RunningMeanStd input shape: (1,)
[2023-09-22 11:51:29,965][37891] ConvEncoder: input_channels=4
[2023-09-22 11:51:30,407][37891] Conv encoder output size: 512
[2023-09-22 11:51:30,409][37891] Created Actor Critic model with architecture:
[2023-09-22 11:51:30,409][37891] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): MultiInputEncoder(
    (encoders): ModuleDict(
      (obs): ConvEncoder(
        (enc): RecursiveScriptModule(
          original_name=ConvEncoderImpl
          (conv_head): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Conv2d)
            (1): RecursiveScriptModule(original_name=ReLU)
            (2): RecursiveScriptModule(original_name=Conv2d)
            (3): RecursiveScriptModule(original_name=ReLU)
            (4): RecursiveScriptModule(original_name=Conv2d)
            (5): RecursiveScriptModule(original_name=ReLU)
          )
          (mlp_layers): RecursiveScriptModule(
            original_name=Sequential
            (0): RecursiveScriptModule(original_name=Linear)
            (1): RecursiveScriptModule(original_name=ReLU)
          )
        )
      )
    )
  )
  (core): ModelCoreIdentity()
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=7, bias=True)
  )
)
[2023-09-22 11:51:30,997][37891] Using optimizer <class 'torch.optim.adam.Adam'>
[2023-09-22 11:51:30,998][37891] No checkpoints found
[2023-09-22 11:51:30,998][37891] Did not load from checkpoint, starting from scratch!
[2023-09-22 11:51:30,998][37891] Initialized policy 1 weights for model version 0
[2023-09-22 11:51:31,000][37891] LearnerWorker_p1 finished initialization!
[2023-09-22 11:51:31,000][37891] Using GPUs [0] for process 1 (actually maps to GPUs [1])
[2023-09-22 11:51:31,959][38168] Worker 6 uses CPU cores [24, 25, 26, 27]
[2023-09-22 11:51:31,960][38130] Worker 2 uses CPU cores [8, 9, 10, 11]
[2023-09-22 11:51:31,975][38129] Worker 1 uses CPU cores [4, 5, 6, 7]
[2023-09-22 11:51:31,979][38131] Worker 3 uses CPU cores [12, 13, 14, 15]
[2023-09-22 11:51:31,989][38127] Using GPUs [1] for process 1 (actually maps to GPUs [1])
[2023-09-22 11:51:31,989][38127] Set environment var CUDA_VISIBLE_DEVICES to '1' (GPU indices [1]) for inference process 1
[2023-09-22 11:51:32,007][38127] Num visible devices: 1
[2023-09-22 11:51:32,040][38167] Worker 5 uses CPU cores [20, 21, 22, 23]
[2023-09-22 11:51:32,067][38166] Worker 7 uses CPU cores [28, 29, 30, 31]
[2023-09-22 11:51:32,082][38132] Worker 4 uses CPU cores [16, 17, 18, 19]
[2023-09-22 11:51:32,129][38128] Worker 0 uses CPU cores [0, 1, 2, 3]
[2023-09-22 11:51:32,167][38126] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2023-09-22 11:51:32,167][38126] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2023-09-22 11:51:32,186][38126] Num visible devices: 1
[2023-09-22 11:51:32,209][36967] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan, 1: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2023-09-22 11:51:32,639][38127] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 11:51:32,640][38127] RunningMeanStd input shape: (1,)
[2023-09-22 11:51:32,652][38127] ConvEncoder: input_channels=4
[2023-09-22 11:51:32,754][38127] Conv encoder output size: 512
[2023-09-22 11:51:32,760][36967] Inference worker 1-0 is ready!
[2023-09-22 11:51:32,766][38126] RunningMeanStd input shape: (4, 84, 84)
[2023-09-22 11:51:32,766][38126] RunningMeanStd input shape: (1,)
[2023-09-22 11:51:32,778][38126] ConvEncoder: input_channels=4
[2023-09-22 11:51:32,877][38126] Conv encoder output size: 512
[2023-09-22 11:51:32,883][36967] Inference worker 0-0 is ready!
[2023-09-22 11:51:32,883][36967] All inference workers are ready! Signal rollout workers to start!
[2023-09-22 11:51:33,330][38132] Decorrelating experience for 0 frames...
[2023-09-22 11:51:33,334][38129] Decorrelating experience for 0 frames...
[2023-09-22 11:51:33,342][38128] Decorrelating experience for 0 frames...
[2023-09-22 11:51:33,365][38167] Decorrelating experience for 0 frames...
[2023-09-22 11:51:33,454][38166] Decorrelating experience for 0 frames...
[2023-09-22 11:51:33,464][38168] Decorrelating experience for 0 frames...
[2023-09-22 11:51:33,466][38131] Decorrelating experience for 0 frames...
[2023-09-22 11:51:33,475][38130] Decorrelating experience for 0 frames...
[2023-09-22 11:51:37,209][36967] Fps is (10 sec: 1638.3, 60 sec: 1638.3, 300 sec: 1638.3). Total num frames: 8192. Throughput: 0: 204.8, 1: 204.8. Samples: 2048. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:51:37,210][36967] Avg episode reward: [(0, '4.200'), (1, '3.333')]
[2023-09-22 11:51:42,209][36967] Fps is (10 sec: 2457.6, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 24576. Throughput: 0: 390.7, 1: 382.5. Samples: 7732. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:51:42,210][36967] Avg episode reward: [(0, '2.893'), (1, '2.767')]
[2023-09-22 11:51:46,500][36967] Heartbeat connected on Batcher_0
[2023-09-22 11:51:46,502][36967] Heartbeat connected on LearnerWorker_p0
[2023-09-22 11:51:46,505][36967] Heartbeat connected on Batcher_1
[2023-09-22 11:51:46,508][36967] Heartbeat connected on LearnerWorker_p1
[2023-09-22 11:51:46,514][36967] Heartbeat connected on InferenceWorker_p0-w0
[2023-09-22 11:51:46,518][36967] Heartbeat connected on InferenceWorker_p1-w0
[2023-09-22 11:51:46,519][36967] Heartbeat connected on RolloutWorker_w0
[2023-09-22 11:51:46,522][36967] Heartbeat connected on RolloutWorker_w1
[2023-09-22 11:51:46,524][36967] Heartbeat connected on RolloutWorker_w2
[2023-09-22 11:51:46,527][36967] Heartbeat connected on RolloutWorker_w3
[2023-09-22 11:51:46,529][36967] Heartbeat connected on RolloutWorker_w4
[2023-09-22 11:51:46,533][36967] Heartbeat connected on RolloutWorker_w5
[2023-09-22 11:51:46,535][36967] Heartbeat connected on RolloutWorker_w6
[2023-09-22 11:51:46,538][36967] Heartbeat connected on RolloutWorker_w7
[2023-09-22 11:51:47,209][36967] Fps is (10 sec: 4915.2, 60 sec: 3822.9, 300 sec: 3822.9). Total num frames: 57344. Throughput: 0: 409.6, 1: 406.0. Samples: 12234. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 11:51:47,210][36967] Avg episode reward: [(0, '3.185'), (1, '3.073')]
[2023-09-22 11:51:50,228][38127] Updated weights for policy 1, policy_version 160 (0.0015)
[2023-09-22 11:51:50,229][38126] Updated weights for policy 0, policy_version 160 (0.0018)
[2023-09-22 11:51:52,209][36967] Fps is (10 sec: 6553.5, 60 sec: 4505.6, 300 sec: 4505.6). Total num frames: 90112. Throughput: 0: 543.9, 1: 536.3. Samples: 21606. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:51:52,210][36967] Avg episode reward: [(0, '3.250'), (1, '3.325')]
[2023-09-22 11:51:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 4915.2, 300 sec: 4915.2). Total num frames: 122880. Throughput: 0: 617.2, 1: 614.4. Samples: 30792. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 11:51:57,209][36967] Avg episode reward: [(0, '3.130'), (1, '3.400')]
[2023-09-22 11:52:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 5188.2, 300 sec: 5188.2). Total num frames: 155648. Throughput: 0: 594.4, 1: 591.8. Samples: 35585. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:52:02,210][36967] Avg episode reward: [(0, '3.260'), (1, '3.560')]
[2023-09-22 11:52:03,354][38127] Updated weights for policy 1, policy_version 320 (0.0013)
[2023-09-22 11:52:03,354][38126] Updated weights for policy 0, policy_version 320 (0.0014)
[2023-09-22 11:52:07,209][36967] Fps is (10 sec: 5734.3, 60 sec: 5149.2, 300 sec: 5149.2). Total num frames: 180224. Throughput: 0: 643.7, 1: 643.7. Samples: 45056. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 11:52:07,210][36967] Avg episode reward: [(0, '3.450'), (1, '3.510')]
[2023-09-22 11:52:12,209][36967] Fps is (10 sec: 5734.6, 60 sec: 5324.8, 300 sec: 5324.8). Total num frames: 212992. Throughput: 0: 676.3, 1: 673.5. Samples: 53992. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 11:52:12,209][36967] Avg episode reward: [(0, '3.550'), (1, '3.560')]
[2023-09-22 11:52:12,210][37819] Saving new best policy, reward=3.550!
[2023-09-22 11:52:12,210][37891] Saving new best policy, reward=3.560!
[2023-09-22 11:52:16,830][38126] Updated weights for policy 0, policy_version 480 (0.0019)
[2023-09-22 11:52:16,831][38127] Updated weights for policy 1, policy_version 480 (0.0020)
[2023-09-22 11:52:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 5461.3, 300 sec: 5461.3). Total num frames: 245760. Throughput: 0: 654.0, 1: 651.8. Samples: 58763. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 11:52:17,210][36967] Avg episode reward: [(0, '3.790'), (1, '3.560')]
[2023-09-22 11:52:17,211][37819] Saving new best policy, reward=3.790!
[2023-09-22 11:52:22,209][36967] Fps is (10 sec: 6553.5, 60 sec: 5570.5, 300 sec: 5570.5). Total num frames: 278528. Throughput: 0: 732.4, 1: 730.0. Samples: 67858. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 11:52:22,210][36967] Avg episode reward: [(0, '4.040'), (1, '3.850')]
[2023-09-22 11:52:22,215][37891] Saving new best policy, reward=3.850!
[2023-09-22 11:52:22,215][37819] Saving new best policy, reward=4.040!
[2023-09-22 11:52:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 5659.9, 300 sec: 5659.9). Total num frames: 311296. Throughput: 0: 776.2, 1: 776.3. Samples: 77597. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:52:27,210][36967] Avg episode reward: [(0, '3.940'), (1, '4.030')]
[2023-09-22 11:52:27,211][37891] Saving new best policy, reward=4.030!
[2023-09-22 11:52:29,811][38126] Updated weights for policy 0, policy_version 640 (0.0018)
[2023-09-22 11:52:29,811][38127] Updated weights for policy 1, policy_version 640 (0.0019)
[2023-09-22 11:52:32,209][36967] Fps is (10 sec: 5734.4, 60 sec: 5597.9, 300 sec: 5597.9). Total num frames: 335872. Throughput: 0: 775.1, 1: 774.9. Samples: 81984. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 11:52:32,210][36967] Avg episode reward: [(0, '4.000'), (1, '4.200')]
[2023-09-22 11:52:32,365][37891] Saving new best policy, reward=4.200!
[2023-09-22 11:52:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 5671.4). Total num frames: 368640. Throughput: 0: 778.4, 1: 779.5. Samples: 91709. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:52:37,210][36967] Avg episode reward: [(0, '4.250'), (1, '4.280')]
[2023-09-22 11:52:37,217][37891] Saving new best policy, reward=4.280!
[2023-09-22 11:52:37,217][37819] Saving new best policy, reward=4.250!
[2023-09-22 11:52:42,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 5734.4). Total num frames: 401408. Throughput: 0: 779.3, 1: 778.3. Samples: 100883. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:52:42,209][36967] Avg episode reward: [(0, '4.420'), (1, '4.240')]
[2023-09-22 11:52:42,210][37819] Saving new best policy, reward=4.420!
[2023-09-22 11:52:42,905][38127] Updated weights for policy 1, policy_version 800 (0.0018)
[2023-09-22 11:52:42,906][38126] Updated weights for policy 0, policy_version 800 (0.0015)
[2023-09-22 11:52:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 5789.0). Total num frames: 434176. Throughput: 0: 779.3, 1: 778.3. Samples: 105677. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:52:47,210][36967] Avg episode reward: [(0, '4.310'), (1, '3.980')]
[2023-09-22 11:52:52,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 5836.8). Total num frames: 466944. Throughput: 0: 779.2, 1: 776.8. Samples: 115077. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:52:52,210][36967] Avg episode reward: [(0, '4.200'), (1, '4.250')]
[2023-09-22 11:52:55,923][38126] Updated weights for policy 0, policy_version 960 (0.0019)
[2023-09-22 11:52:55,923][38127] Updated weights for policy 1, policy_version 960 (0.0017)
[2023-09-22 11:52:57,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 5830.8). Total num frames: 495616. Throughput: 0: 784.4, 1: 784.7. Samples: 124601. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:52:57,210][36967] Avg episode reward: [(0, '4.030'), (1, '4.420')]
[2023-09-22 11:52:57,211][37891] Saving new best policy, reward=4.420!
[2023-09-22 11:53:02,209][36967] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 5825.4). Total num frames: 524288. Throughput: 0: 779.6, 1: 781.8. Samples: 129025. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 11:53:02,209][36967] Avg episode reward: [(0, '3.850'), (1, '4.330')]
[2023-09-22 11:53:07,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 5863.7). Total num frames: 557056. Throughput: 0: 785.3, 1: 785.3. Samples: 138537. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:53:07,210][36967] Avg episode reward: [(0, '3.630'), (1, '4.440')]
[2023-09-22 11:53:07,222][37891] Saving new best policy, reward=4.440!
[2023-09-22 11:53:09,037][38126] Updated weights for policy 0, policy_version 1120 (0.0018)
[2023-09-22 11:53:09,037][38127] Updated weights for policy 1, policy_version 1120 (0.0018)
[2023-09-22 11:53:12,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 5898.2). Total num frames: 589824. Throughput: 0: 778.1, 1: 777.3. Samples: 147590. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 11:53:12,209][36967] Avg episode reward: [(0, '3.970'), (1, '4.400')]
[2023-09-22 11:53:17,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 5929.4). Total num frames: 622592. Throughput: 0: 782.0, 1: 781.0. Samples: 152318. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:53:17,210][36967] Avg episode reward: [(0, '3.970'), (1, '4.360')]
[2023-09-22 11:53:22,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 5883.3). Total num frames: 647168. Throughput: 0: 777.6, 1: 777.6. Samples: 161693. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:53:22,209][36967] Avg episode reward: [(0, '3.930'), (1, '4.570')]
[2023-09-22 11:53:22,217][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000001264_323584.pth...
[2023-09-22 11:53:22,249][37891] Saving new best policy, reward=4.570!
[2023-09-22 11:53:22,416][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000001280_327680.pth...
[2023-09-22 11:53:22,484][38126] Updated weights for policy 0, policy_version 1280 (0.0020)
[2023-09-22 11:53:22,485][38127] Updated weights for policy 1, policy_version 1280 (0.0019)
[2023-09-22 11:53:27,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5912.5). Total num frames: 679936. Throughput: 0: 776.4, 1: 775.8. Samples: 170734. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:53:27,209][36967] Avg episode reward: [(0, '4.270'), (1, '4.680')]
[2023-09-22 11:53:27,210][37891] Saving new best policy, reward=4.680!
[2023-09-22 11:53:32,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 5939.2). Total num frames: 712704. Throughput: 0: 776.5, 1: 777.1. Samples: 175588. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 11:53:32,210][36967] Avg episode reward: [(0, '4.290'), (1, '4.550')]
[2023-09-22 11:53:35,504][38127] Updated weights for policy 1, policy_version 1440 (0.0016)
[2023-09-22 11:53:35,504][38126] Updated weights for policy 0, policy_version 1440 (0.0017)
[2023-09-22 11:53:37,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 5963.8). Total num frames: 745472. Throughput: 0: 775.8, 1: 775.3. Samples: 184877. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:53:37,210][36967] Avg episode reward: [(0, '4.450'), (1, '5.120')]
[2023-09-22 11:53:37,222][37891] Saving new best policy, reward=5.120!
[2023-09-22 11:53:37,222][37819] Saving new best policy, reward=4.450!
[2023-09-22 11:53:42,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5923.4). Total num frames: 770048. Throughput: 0: 774.4, 1: 774.4. Samples: 194296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:53:42,210][36967] Avg episode reward: [(0, '4.820'), (1, '5.230')]
[2023-09-22 11:53:42,249][37891] Saving new best policy, reward=5.230!
[2023-09-22 11:53:42,253][37819] Saving new best policy, reward=4.820!
[2023-09-22 11:53:47,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 5946.8). Total num frames: 802816. Throughput: 0: 774.6, 1: 773.7. Samples: 198699. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:53:47,210][36967] Avg episode reward: [(0, '4.910'), (1, '5.540')]
[2023-09-22 11:53:47,391][37891] Saving new best policy, reward=5.540!
[2023-09-22 11:53:47,396][37819] Saving new best policy, reward=4.910!
[2023-09-22 11:53:48,742][38127] Updated weights for policy 1, policy_version 1600 (0.0016)
[2023-09-22 11:53:48,743][38126] Updated weights for policy 0, policy_version 1600 (0.0015)
[2023-09-22 11:53:52,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 5968.4). Total num frames: 835584. Throughput: 0: 772.8, 1: 771.4. Samples: 208024. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:53:52,210][36967] Avg episode reward: [(0, '4.740'), (1, '5.730')]
[2023-09-22 11:53:52,219][37891] Saving new best policy, reward=5.730!
[2023-09-22 11:53:57,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6212.3, 300 sec: 5988.6). Total num frames: 868352. Throughput: 0: 772.9, 1: 773.6. Samples: 217180. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:53:57,209][36967] Avg episode reward: [(0, '4.790'), (1, '6.050')]
[2023-09-22 11:53:57,210][37891] Saving new best policy, reward=6.050!
[2023-09-22 11:54:01,997][38126] Updated weights for policy 0, policy_version 1760 (0.0016)
[2023-09-22 11:54:01,997][38127] Updated weights for policy 1, policy_version 1760 (0.0015)
[2023-09-22 11:54:02,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6007.5). Total num frames: 901120. Throughput: 0: 775.1, 1: 774.4. Samples: 222045. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:54:02,209][36967] Avg episode reward: [(0, '4.770'), (1, '5.900')]
[2023-09-22 11:54:07,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5972.2). Total num frames: 925696. Throughput: 0: 773.7, 1: 775.9. Samples: 231424. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 11:54:07,210][36967] Avg episode reward: [(0, '4.830'), (1, '5.790')]
[2023-09-22 11:54:12,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5990.4). Total num frames: 958464. Throughput: 0: 775.5, 1: 776.4. Samples: 240571. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:54:12,210][36967] Avg episode reward: [(0, '5.000'), (1, '5.730')]
[2023-09-22 11:54:12,211][37819] Saving new best policy, reward=5.000!
[2023-09-22 11:54:15,227][38126] Updated weights for policy 0, policy_version 1920 (0.0016)
[2023-09-22 11:54:15,227][38127] Updated weights for policy 1, policy_version 1920 (0.0018)
[2023-09-22 11:54:17,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6007.5). Total num frames: 991232. Throughput: 0: 776.1, 1: 775.5. Samples: 245413. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:54:17,209][36967] Avg episode reward: [(0, '5.330'), (1, '5.940')]
[2023-09-22 11:54:17,210][37819] Saving new best policy, reward=5.330!
[2023-09-22 11:54:22,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6023.5). Total num frames: 1024000. Throughput: 0: 775.1, 1: 775.4. Samples: 254651. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 11:54:22,209][36967] Avg episode reward: [(0, '5.390'), (1, '6.350')]
[2023-09-22 11:54:22,217][37891] Saving new best policy, reward=6.350!
[2023-09-22 11:54:22,217][37819] Saving new best policy, reward=5.390!
[2023-09-22 11:54:27,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 5991.9). Total num frames: 1048576. Throughput: 0: 773.0, 1: 772.4. Samples: 263839. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:54:27,210][36967] Avg episode reward: [(0, '5.170'), (1, '6.650')]
[2023-09-22 11:54:27,255][37891] Saving new best policy, reward=6.650!
[2023-09-22 11:54:28,535][38126] Updated weights for policy 0, policy_version 2080 (0.0017)
[2023-09-22 11:54:28,536][38127] Updated weights for policy 1, policy_version 2080 (0.0015)
[2023-09-22 11:54:32,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6007.5). Total num frames: 1081344. Throughput: 0: 773.2, 1: 773.7. Samples: 268308. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 11:54:32,209][36967] Avg episode reward: [(0, '5.590'), (1, '6.870')]
[2023-09-22 11:54:32,210][37891] Saving new best policy, reward=6.870!
[2023-09-22 11:54:32,388][37819] Saving new best policy, reward=5.590!
[2023-09-22 11:54:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6022.2). Total num frames: 1114112. Throughput: 0: 776.5, 1: 778.1. Samples: 277980. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 11:54:37,210][36967] Avg episode reward: [(0, '5.360'), (1, '7.190')]
[2023-09-22 11:54:37,220][37891] Saving new best policy, reward=7.190!
[2023-09-22 11:54:41,586][38126] Updated weights for policy 0, policy_version 2240 (0.0016)
[2023-09-22 11:54:41,586][38127] Updated weights for policy 1, policy_version 2240 (0.0017)
[2023-09-22 11:54:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6036.2). Total num frames: 1146880. Throughput: 0: 777.2, 1: 777.0. Samples: 287123. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 11:54:42,209][36967] Avg episode reward: [(0, '5.300'), (1, '7.200')]
[2023-09-22 11:54:42,210][37891] Saving new best policy, reward=7.200!
[2023-09-22 11:54:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6049.5). Total num frames: 1179648. Throughput: 0: 776.8, 1: 777.7. Samples: 291998. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 11:54:47,210][36967] Avg episode reward: [(0, '5.330'), (1, '7.350')]
[2023-09-22 11:54:47,211][37891] Saving new best policy, reward=7.350!
[2023-09-22 11:54:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6062.1). Total num frames: 1212416. Throughput: 0: 775.2, 1: 773.7. Samples: 301123. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 11:54:52,209][36967] Avg episode reward: [(0, '5.190'), (1, '7.750')]
[2023-09-22 11:54:52,216][37891] Saving new best policy, reward=7.750!
[2023-09-22 11:54:54,920][38127] Updated weights for policy 1, policy_version 2400 (0.0016)
[2023-09-22 11:54:54,920][38126] Updated weights for policy 0, policy_version 2400 (0.0016)
[2023-09-22 11:54:57,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6034.1). Total num frames: 1236992. Throughput: 0: 775.3, 1: 774.5. Samples: 310312. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 11:54:57,209][36967] Avg episode reward: [(0, '5.540'), (1, '7.780')]
[2023-09-22 11:54:57,210][37891] Saving new best policy, reward=7.780!
[2023-09-22 11:55:02,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6046.5). Total num frames: 1269760. Throughput: 0: 775.8, 1: 775.0. Samples: 315200. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 11:55:02,210][36967] Avg episode reward: [(0, '5.550'), (1, '7.510')]
[2023-09-22 11:55:07,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6058.3). Total num frames: 1302528. Throughput: 0: 775.6, 1: 775.7. Samples: 324460. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:55:07,210][36967] Avg episode reward: [(0, '5.600'), (1, '7.720')]
[2023-09-22 11:55:07,220][37819] Saving new best policy, reward=5.600!
[2023-09-22 11:55:07,967][38126] Updated weights for policy 0, policy_version 2560 (0.0015)
[2023-09-22 11:55:07,968][38127] Updated weights for policy 1, policy_version 2560 (0.0016)
[2023-09-22 11:55:12,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6069.5). Total num frames: 1335296. Throughput: 0: 776.2, 1: 779.1. Samples: 333829. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:55:12,210][36967] Avg episode reward: [(0, '6.050'), (1, '7.800')]
[2023-09-22 11:55:12,211][37819] Saving new best policy, reward=6.050!
[2023-09-22 11:55:12,211][37891] Saving new best policy, reward=7.800!
[2023-09-22 11:55:17,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6043.9). Total num frames: 1359872. Throughput: 0: 779.1, 1: 776.9. Samples: 338328. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 11:55:17,209][36967] Avg episode reward: [(0, '6.190'), (1, '7.690')]
[2023-09-22 11:55:17,236][37819] Saving new best policy, reward=6.190!
[2023-09-22 11:55:21,346][38127] Updated weights for policy 1, policy_version 2720 (0.0017)
[2023-09-22 11:55:21,347][38126] Updated weights for policy 0, policy_version 2720 (0.0017)
[2023-09-22 11:55:22,209][36967] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6055.0). Total num frames: 1392640. Throughput: 0: 775.9, 1: 774.8. Samples: 347761. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:55:22,209][36967] Avg episode reward: [(0, '6.270'), (1, '7.660')]
[2023-09-22 11:55:22,217][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000002720_696320.pth...
[2023-09-22 11:55:22,217][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000002720_696320.pth...
[2023-09-22 11:55:22,253][37819] Saving new best policy, reward=6.270!
[2023-09-22 11:55:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6065.6). Total num frames: 1425408. Throughput: 0: 772.6, 1: 771.9. Samples: 356626. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:55:27,209][36967] Avg episode reward: [(0, '6.760'), (1, '8.080')]
[2023-09-22 11:55:27,210][37819] Saving new best policy, reward=6.760!
[2023-09-22 11:55:27,210][37891] Saving new best policy, reward=8.080!
[2023-09-22 11:55:32,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6075.7). Total num frames: 1458176. Throughput: 0: 769.0, 1: 769.6. Samples: 361238. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:55:32,209][36967] Avg episode reward: [(0, '6.840'), (1, '7.460')]
[2023-09-22 11:55:32,210][37819] Saving new best policy, reward=6.840!
[2023-09-22 11:55:34,831][38126] Updated weights for policy 0, policy_version 2880 (0.0019)
[2023-09-22 11:55:34,831][38127] Updated weights for policy 1, policy_version 2880 (0.0018)
[2023-09-22 11:55:37,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6052.0). Total num frames: 1482752. Throughput: 0: 771.1, 1: 770.7. Samples: 370507. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:55:37,209][36967] Avg episode reward: [(0, '7.000'), (1, '7.690')]
[2023-09-22 11:55:37,218][37819] Saving new best policy, reward=7.000!
[2023-09-22 11:55:42,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6062.1). Total num frames: 1515520. Throughput: 0: 766.4, 1: 766.7. Samples: 379303. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 11:55:42,209][36967] Avg episode reward: [(0, '7.080'), (1, '7.670')]
[2023-09-22 11:55:42,210][37819] Saving new best policy, reward=7.080!
[2023-09-22 11:55:47,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6071.7). Total num frames: 1548288. Throughput: 0: 763.9, 1: 764.9. Samples: 383995. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 11:55:47,210][36967] Avg episode reward: [(0, '7.060'), (1, '7.940')]
[2023-09-22 11:55:48,277][38126] Updated weights for policy 0, policy_version 3040 (0.0016)
[2023-09-22 11:55:48,277][38127] Updated weights for policy 1, policy_version 3040 (0.0018)
[2023-09-22 11:55:52,209][36967] Fps is (10 sec: 5734.2, 60 sec: 6007.4, 300 sec: 6049.5). Total num frames: 1572864. Throughput: 0: 762.6, 1: 765.3. Samples: 393216. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 11:55:52,210][36967] Avg episode reward: [(0, '7.130'), (1, '7.970')]
[2023-09-22 11:55:52,273][37819] Saving new best policy, reward=7.130!
[2023-09-22 11:55:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6059.0). Total num frames: 1605632. Throughput: 0: 764.8, 1: 761.5. Samples: 402514. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 11:55:57,210][36967] Avg episode reward: [(0, '7.010'), (1, '7.840')]
[2023-09-22 11:56:01,500][38126] Updated weights for policy 0, policy_version 3200 (0.0013)
[2023-09-22 11:56:01,501][38127] Updated weights for policy 1, policy_version 3200 (0.0019)
[2023-09-22 11:56:02,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6068.1). Total num frames: 1638400. Throughput: 0: 766.2, 1: 766.8. Samples: 407317. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 11:56:02,210][36967] Avg episode reward: [(0, '7.100'), (1, '8.260')]
[2023-09-22 11:56:02,211][37891] Saving new best policy, reward=8.260!
[2023-09-22 11:56:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6077.0). Total num frames: 1671168. Throughput: 0: 763.7, 1: 764.4. Samples: 416527. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 11:56:07,210][36967] Avg episode reward: [(0, '7.420'), (1, '7.750')]
[2023-09-22 11:56:07,217][37819] Saving new best policy, reward=7.420!
[2023-09-22 11:56:12,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6085.5). Total num frames: 1703936. Throughput: 0: 769.3, 1: 770.5. Samples: 425915. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:56:12,209][36967] Avg episode reward: [(0, '7.010'), (1, '8.070')]
[2023-09-22 11:56:14,780][38127] Updated weights for policy 1, policy_version 3360 (0.0016)
[2023-09-22 11:56:14,780][38126] Updated weights for policy 0, policy_version 3360 (0.0016)
[2023-09-22 11:56:17,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6065.0). Total num frames: 1728512. Throughput: 0: 766.0, 1: 765.8. Samples: 430173. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 11:56:17,209][36967] Avg episode reward: [(0, '7.130'), (1, '8.130')]
[2023-09-22 11:56:22,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6073.4). Total num frames: 1761280. Throughput: 0: 766.3, 1: 765.5. Samples: 439437. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:56:22,209][36967] Avg episode reward: [(0, '7.130'), (1, '7.960')]
[2023-09-22 11:56:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6081.5). Total num frames: 1794048. Throughput: 0: 768.3, 1: 770.4. Samples: 448542. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:56:27,209][36967] Avg episode reward: [(0, '7.110'), (1, '8.300')]
[2023-09-22 11:56:27,210][37891] Saving new best policy, reward=8.300!
[2023-09-22 11:56:28,160][38127] Updated weights for policy 1, policy_version 3520 (0.0018)
[2023-09-22 11:56:28,160][38126] Updated weights for policy 0, policy_version 3520 (0.0016)
[2023-09-22 11:56:32,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 1826816. Throughput: 0: 769.4, 1: 769.2. Samples: 453231. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 11:56:32,210][36967] Avg episode reward: [(0, '7.300'), (1, '8.340')]
[2023-09-22 11:56:32,212][37891] Saving new best policy, reward=8.340!
[2023-09-22 11:56:37,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1851392. Throughput: 0: 772.6, 1: 769.8. Samples: 462624. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 11:56:37,210][36967] Avg episode reward: [(0, '7.690'), (1, '8.360')]
[2023-09-22 11:56:37,219][37819] Saving new best policy, reward=7.690!
[2023-09-22 11:56:37,220][37891] Saving new best policy, reward=8.360!
[2023-09-22 11:56:41,624][38126] Updated weights for policy 0, policy_version 3680 (0.0012)
[2023-09-22 11:56:41,624][38127] Updated weights for policy 1, policy_version 3680 (0.0013)
[2023-09-22 11:56:42,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1884160. Throughput: 0: 765.2, 1: 766.0. Samples: 471420. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:56:42,209][36967] Avg episode reward: [(0, '7.760'), (1, '8.600')]
[2023-09-22 11:56:42,210][37819] Saving new best policy, reward=7.760!
[2023-09-22 11:56:42,210][37891] Saving new best policy, reward=8.600!
[2023-09-22 11:56:47,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 1916928. Throughput: 0: 765.1, 1: 764.8. Samples: 476163. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 11:56:47,209][36967] Avg episode reward: [(0, '7.690'), (1, '8.370')]
[2023-09-22 11:56:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 1941504. Throughput: 0: 763.7, 1: 766.3. Samples: 485376. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 11:56:52,209][36967] Avg episode reward: [(0, '7.830'), (1, '8.590')]
[2023-09-22 11:56:52,379][37819] Saving new best policy, reward=7.830!
[2023-09-22 11:56:55,232][38127] Updated weights for policy 1, policy_version 3840 (0.0017)
[2023-09-22 11:56:55,238][38126] Updated weights for policy 0, policy_version 3840 (0.0016)
[2023-09-22 11:56:57,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 1974272. Throughput: 0: 756.7, 1: 755.3. Samples: 493955. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 11:56:57,210][36967] Avg episode reward: [(0, '7.830'), (1, '8.310')]
[2023-09-22 11:57:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2007040. Throughput: 0: 762.0, 1: 761.6. Samples: 498731. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:57:02,209][36967] Avg episode reward: [(0, '7.890'), (1, '8.490')]
[2023-09-22 11:57:02,210][37819] Saving new best policy, reward=7.890!
[2023-09-22 11:57:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 2039808. Throughput: 0: 759.5, 1: 762.1. Samples: 507909. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:57:07,209][36967] Avg episode reward: [(0, '7.960'), (1, '8.670')]
[2023-09-22 11:57:07,218][37819] Saving new best policy, reward=7.960!
[2023-09-22 11:57:07,218][37891] Saving new best policy, reward=8.670!
[2023-09-22 11:57:08,447][38127] Updated weights for policy 1, policy_version 4000 (0.0015)
[2023-09-22 11:57:08,447][38126] Updated weights for policy 0, policy_version 4000 (0.0016)
[2023-09-22 11:57:12,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 2064384. Throughput: 0: 763.4, 1: 761.2. Samples: 517146. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 11:57:12,209][36967] Avg episode reward: [(0, '8.140'), (1, '8.570')]
[2023-09-22 11:57:12,210][37819] Saving new best policy, reward=8.140!
[2023-09-22 11:57:17,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2097152. Throughput: 0: 761.4, 1: 761.8. Samples: 521773. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:57:17,210][36967] Avg episode reward: [(0, '8.290'), (1, '8.490')]
[2023-09-22 11:57:17,211][37819] Saving new best policy, reward=8.290!
[2023-09-22 11:57:22,038][38126] Updated weights for policy 0, policy_version 4160 (0.0016)
[2023-09-22 11:57:22,038][38127] Updated weights for policy 1, policy_version 4160 (0.0015)
[2023-09-22 11:57:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2129920. Throughput: 0: 757.0, 1: 757.1. Samples: 530759. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 11:57:22,209][36967] Avg episode reward: [(0, '8.190'), (1, '8.830')]
[2023-09-22 11:57:22,217][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000004160_1064960.pth...
[2023-09-22 11:57:22,217][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000004160_1064960.pth...
[2023-09-22 11:57:22,247][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000001280_327680.pth
[2023-09-22 11:57:22,254][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000001264_323584.pth
[2023-09-22 11:57:22,258][37891] Saving new best policy, reward=8.830!
[2023-09-22 11:57:27,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 2154496. Throughput: 0: 763.5, 1: 763.4. Samples: 540129. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:57:27,210][36967] Avg episode reward: [(0, '8.060'), (1, '8.370')]
[2023-09-22 11:57:32,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6007.5, 300 sec: 6164.8). Total num frames: 2187264. Throughput: 0: 761.1, 1: 763.4. Samples: 544768. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:57:32,210][36967] Avg episode reward: [(0, '8.150'), (1, '8.130')]
[2023-09-22 11:57:35,311][38126] Updated weights for policy 0, policy_version 4320 (0.0015)
[2023-09-22 11:57:35,311][38127] Updated weights for policy 1, policy_version 4320 (0.0015)
[2023-09-22 11:57:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2220032. Throughput: 0: 762.3, 1: 760.1. Samples: 553887. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 11:57:37,210][36967] Avg episode reward: [(0, '7.810'), (1, '8.390')]
[2023-09-22 11:57:42,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2252800. Throughput: 0: 767.9, 1: 770.1. Samples: 563164. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:57:42,209][36967] Avg episode reward: [(0, '8.170'), (1, '8.610')]
[2023-09-22 11:57:47,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6137.1). Total num frames: 2277376. Throughput: 0: 762.4, 1: 763.0. Samples: 567374. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:57:47,209][36967] Avg episode reward: [(0, '8.130'), (1, '8.580')]
[2023-09-22 11:57:49,035][38126] Updated weights for policy 0, policy_version 4480 (0.0017)
[2023-09-22 11:57:49,036][38127] Updated weights for policy 1, policy_version 4480 (0.0015)
[2023-09-22 11:57:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6150.9). Total num frames: 2310144. Throughput: 0: 761.8, 1: 759.5. Samples: 576366. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:57:52,209][36967] Avg episode reward: [(0, '8.130'), (1, '8.880')]
[2023-09-22 11:57:52,218][37891] Saving new best policy, reward=8.880!
[2023-09-22 11:57:57,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2342912. Throughput: 0: 760.6, 1: 763.4. Samples: 585724. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:57:57,210][36967] Avg episode reward: [(0, '8.070'), (1, '8.350')]
[2023-09-22 11:58:02,209][36967] Fps is (10 sec: 6143.9, 60 sec: 6075.7, 300 sec: 6150.9). Total num frames: 2371584. Throughput: 0: 760.2, 1: 759.8. Samples: 590173. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 11:58:02,210][36967] Avg episode reward: [(0, '8.180'), (1, '8.610')]
[2023-09-22 11:58:02,223][38126] Updated weights for policy 0, policy_version 4640 (0.0013)
[2023-09-22 11:58:02,223][38127] Updated weights for policy 1, policy_version 4640 (0.0016)
[2023-09-22 11:58:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6007.4, 300 sec: 6137.1). Total num frames: 2400256. Throughput: 0: 765.7, 1: 766.6. Samples: 599713. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:58:07,210][36967] Avg episode reward: [(0, '8.320'), (1, '8.690')]
[2023-09-22 11:58:07,220][37819] Saving new best policy, reward=8.320!
[2023-09-22 11:58:12,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2433024. Throughput: 0: 760.6, 1: 761.0. Samples: 608601. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 11:58:12,209][36967] Avg episode reward: [(0, '8.240'), (1, '8.960')]
[2023-09-22 11:58:12,210][37891] Saving new best policy, reward=8.960!
[2023-09-22 11:58:15,621][38126] Updated weights for policy 0, policy_version 4800 (0.0014)
[2023-09-22 11:58:15,622][38127] Updated weights for policy 1, policy_version 4800 (0.0016)
[2023-09-22 11:58:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2465792. Throughput: 0: 764.2, 1: 760.6. Samples: 613382. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:58:17,210][36967] Avg episode reward: [(0, '7.970'), (1, '8.950')]
[2023-09-22 11:58:22,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2498560. Throughput: 0: 762.5, 1: 764.5. Samples: 622600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:58:22,209][36967] Avg episode reward: [(0, '8.130'), (1, '8.910')]
[2023-09-22 11:58:27,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2523136. Throughput: 0: 765.3, 1: 763.2. Samples: 631946. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 11:58:27,209][36967] Avg episode reward: [(0, '8.150'), (1, '8.680')]
[2023-09-22 11:58:28,801][38127] Updated weights for policy 1, policy_version 4960 (0.0013)
[2023-09-22 11:58:28,802][38126] Updated weights for policy 0, policy_version 4960 (0.0015)
[2023-09-22 11:58:32,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2555904. Throughput: 0: 770.9, 1: 770.6. Samples: 636740. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 11:58:32,209][36967] Avg episode reward: [(0, '8.260'), (1, '8.630')]
[2023-09-22 11:58:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2588672. Throughput: 0: 773.8, 1: 773.8. Samples: 646007. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 11:58:37,209][36967] Avg episode reward: [(0, '8.310'), (1, '8.630')]
[2023-09-22 11:58:42,052][38126] Updated weights for policy 0, policy_version 5120 (0.0014)
[2023-09-22 11:58:42,053][38127] Updated weights for policy 1, policy_version 5120 (0.0017)
[2023-09-22 11:58:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2621440. Throughput: 0: 773.7, 1: 773.1. Samples: 655330. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:58:42,209][36967] Avg episode reward: [(0, '8.610'), (1, '8.180')]
[2023-09-22 11:58:42,210][37819] Saving new best policy, reward=8.610!
[2023-09-22 11:58:47,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2646016. Throughput: 0: 770.1, 1: 771.1. Samples: 659528. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 11:58:47,210][36967] Avg episode reward: [(0, '8.420'), (1, '8.150')]
[2023-09-22 11:58:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2678784. Throughput: 0: 765.2, 1: 764.8. Samples: 668560. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 11:58:52,209][36967] Avg episode reward: [(0, '8.490'), (1, '8.320')]
[2023-09-22 11:58:55,758][38126] Updated weights for policy 0, policy_version 5280 (0.0014)
[2023-09-22 11:58:55,760][38127] Updated weights for policy 1, policy_version 5280 (0.0016)
[2023-09-22 11:58:57,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2711552. Throughput: 0: 768.7, 1: 769.6. Samples: 677823. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:58:57,209][36967] Avg episode reward: [(0, '8.150'), (1, '8.280')]
[2023-09-22 11:59:02,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6075.7, 300 sec: 6137.1). Total num frames: 2736128. Throughput: 0: 764.6, 1: 765.6. Samples: 682238. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:59:02,210][36967] Avg episode reward: [(0, '8.230'), (1, '8.200')]
[2023-09-22 11:59:07,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2768896. Throughput: 0: 772.8, 1: 770.4. Samples: 692047. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:59:07,210][36967] Avg episode reward: [(0, '8.440'), (1, '8.090')]
[2023-09-22 11:59:08,735][38127] Updated weights for policy 1, policy_version 5440 (0.0016)
[2023-09-22 11:59:08,735][38126] Updated weights for policy 0, policy_version 5440 (0.0017)
[2023-09-22 11:59:12,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2801664. Throughput: 0: 768.5, 1: 768.4. Samples: 701106. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:59:12,209][36967] Avg episode reward: [(0, '8.460'), (1, '8.460')]
[2023-09-22 11:59:17,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2834432. Throughput: 0: 768.2, 1: 767.6. Samples: 705855. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:59:17,209][36967] Avg episode reward: [(0, '8.330'), (1, '8.570')]
[2023-09-22 11:59:22,076][38127] Updated weights for policy 1, policy_version 5600 (0.0016)
[2023-09-22 11:59:22,077][38126] Updated weights for policy 0, policy_version 5600 (0.0014)
[2023-09-22 11:59:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 2867200. Throughput: 0: 766.9, 1: 766.6. Samples: 715015. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:59:22,209][36967] Avg episode reward: [(0, '8.540'), (1, '8.390')]
[2023-09-22 11:59:22,220][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000005600_1433600.pth...
[2023-09-22 11:59:22,220][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000005600_1433600.pth...
[2023-09-22 11:59:22,255][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000002720_696320.pth
[2023-09-22 11:59:22,258][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000002720_696320.pth
[2023-09-22 11:59:27,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2891776. Throughput: 0: 771.3, 1: 769.4. Samples: 724660. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:59:27,210][36967] Avg episode reward: [(0, '8.640'), (1, '8.560')]
[2023-09-22 11:59:27,212][37819] Saving new best policy, reward=8.640!
[2023-09-22 11:59:32,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2924544. Throughput: 0: 774.4, 1: 773.8. Samples: 729197. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 11:59:32,209][36967] Avg episode reward: [(0, '8.480'), (1, '8.790')]
[2023-09-22 11:59:34,980][38126] Updated weights for policy 0, policy_version 5760 (0.0016)
[2023-09-22 11:59:34,981][38127] Updated weights for policy 1, policy_version 5760 (0.0017)
[2023-09-22 11:59:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2957312. Throughput: 0: 781.8, 1: 780.9. Samples: 738883. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:59:37,210][36967] Avg episode reward: [(0, '8.650'), (1, '8.530')]
[2023-09-22 11:59:37,218][37819] Saving new best policy, reward=8.650!
[2023-09-22 11:59:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 2990080. Throughput: 0: 782.8, 1: 781.8. Samples: 748228. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:59:42,209][36967] Avg episode reward: [(0, '8.790'), (1, '8.490')]
[2023-09-22 11:59:42,210][37819] Saving new best policy, reward=8.790!
[2023-09-22 11:59:47,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6137.1). Total num frames: 3022848. Throughput: 0: 786.1, 1: 786.7. Samples: 753017. Policy #0 lag: (min: 8.0, avg: 8.0, max: 8.0)
[2023-09-22 11:59:47,209][36967] Avg episode reward: [(0, '8.650'), (1, '8.650')]
[2023-09-22 11:59:48,029][38127] Updated weights for policy 1, policy_version 5920 (0.0015)
[2023-09-22 11:59:48,029][38126] Updated weights for policy 0, policy_version 5920 (0.0017)
[2023-09-22 11:59:52,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3055616. Throughput: 0: 779.1, 1: 779.6. Samples: 762188. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 11:59:52,210][36967] Avg episode reward: [(0, '8.730'), (1, '8.490')]
[2023-09-22 11:59:57,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 3080192. Throughput: 0: 780.5, 1: 781.3. Samples: 771387. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 11:59:57,209][36967] Avg episode reward: [(0, '8.650'), (1, '8.880')]
[2023-09-22 12:00:01,516][38126] Updated weights for policy 0, policy_version 6080 (0.0015)
[2023-09-22 12:00:01,517][38127] Updated weights for policy 1, policy_version 6080 (0.0014)
[2023-09-22 12:00:02,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 3112960. Throughput: 0: 780.3, 1: 779.7. Samples: 776055. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:02,210][36967] Avg episode reward: [(0, '8.620'), (1, '8.080')]
[2023-09-22 12:00:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 3145728. Throughput: 0: 779.5, 1: 779.4. Samples: 785166. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:07,209][36967] Avg episode reward: [(0, '8.930'), (1, '8.690')]
[2023-09-22 12:00:07,215][37819] Saving new best policy, reward=8.930!
[2023-09-22 12:00:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3178496. Throughput: 0: 776.1, 1: 777.4. Samples: 794568. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:12,210][36967] Avg episode reward: [(0, '9.350'), (1, '8.540')]
[2023-09-22 12:00:12,212][37819] Saving new best policy, reward=9.350!
[2023-09-22 12:00:14,749][38127] Updated weights for policy 1, policy_version 6240 (0.0013)
[2023-09-22 12:00:14,750][38126] Updated weights for policy 0, policy_version 6240 (0.0014)
[2023-09-22 12:00:17,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 3203072. Throughput: 0: 773.9, 1: 773.8. Samples: 798846. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:00:17,209][36967] Avg episode reward: [(0, '8.960'), (1, '8.450')]
[2023-09-22 12:00:22,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 3235840. Throughput: 0: 771.5, 1: 772.2. Samples: 808349. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:00:22,209][36967] Avg episode reward: [(0, '9.090'), (1, '8.220')]
[2023-09-22 12:00:27,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6137.1). Total num frames: 3268608. Throughput: 0: 770.2, 1: 769.8. Samples: 817527. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:27,210][36967] Avg episode reward: [(0, '9.090'), (1, '8.170')]
[2023-09-22 12:00:27,967][38126] Updated weights for policy 0, policy_version 6400 (0.0017)
[2023-09-22 12:00:27,968][38127] Updated weights for policy 1, policy_version 6400 (0.0018)
[2023-09-22 12:00:32,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3301376. Throughput: 0: 770.2, 1: 768.6. Samples: 822264. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:32,210][36967] Avg episode reward: [(0, '9.010'), (1, '8.540')]
[2023-09-22 12:00:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3334144. Throughput: 0: 770.4, 1: 771.1. Samples: 831554. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:37,210][36967] Avg episode reward: [(0, '9.250'), (1, '7.940')]
[2023-09-22 12:00:41,032][38127] Updated weights for policy 1, policy_version 6560 (0.0017)
[2023-09-22 12:00:41,032][38126] Updated weights for policy 0, policy_version 6560 (0.0016)
[2023-09-22 12:00:42,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 3358720. Throughput: 0: 775.4, 1: 775.3. Samples: 841168. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:42,210][36967] Avg episode reward: [(0, '9.080'), (1, '8.400')]
[2023-09-22 12:00:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 3391488. Throughput: 0: 773.8, 1: 776.7. Samples: 845828. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:47,210][36967] Avg episode reward: [(0, '9.000'), (1, '8.640')]
[2023-09-22 12:00:52,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 3424256. Throughput: 0: 782.0, 1: 782.0. Samples: 855547. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:00:52,209][36967] Avg episode reward: [(0, '8.950'), (1, '8.220')]
[2023-09-22 12:00:53,978][38126] Updated weights for policy 0, policy_version 6720 (0.0017)
[2023-09-22 12:00:53,979][38127] Updated weights for policy 1, policy_version 6720 (0.0017)
[2023-09-22 12:00:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3457024. Throughput: 0: 778.4, 1: 776.8. Samples: 864554. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:00:57,210][36967] Avg episode reward: [(0, '8.690'), (1, '8.570')]
[2023-09-22 12:01:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3489792. Throughput: 0: 783.7, 1: 783.4. Samples: 869366. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:01:02,210][36967] Avg episode reward: [(0, '8.940'), (1, '8.490')]
[2023-09-22 12:01:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6137.1). Total num frames: 3514368. Throughput: 0: 779.2, 1: 781.5. Samples: 878581. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:01:07,210][36967] Avg episode reward: [(0, '8.780'), (1, '8.810')]
[2023-09-22 12:01:07,351][38126] Updated weights for policy 0, policy_version 6880 (0.0017)
[2023-09-22 12:01:07,351][38127] Updated weights for policy 1, policy_version 6880 (0.0015)
[2023-09-22 12:01:12,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 3547136. Throughput: 0: 778.7, 1: 779.4. Samples: 887643. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 12:01:12,209][36967] Avg episode reward: [(0, '8.860'), (1, '9.000')]
[2023-09-22 12:01:12,210][37891] Saving new best policy, reward=9.000!
[2023-09-22 12:01:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3579904. Throughput: 0: 779.1, 1: 780.5. Samples: 892449. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:01:17,210][36967] Avg episode reward: [(0, '8.610'), (1, '8.860')]
[2023-09-22 12:01:20,491][38126] Updated weights for policy 0, policy_version 7040 (0.0017)
[2023-09-22 12:01:20,491][38127] Updated weights for policy 1, policy_version 7040 (0.0015)
[2023-09-22 12:01:22,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3612672. Throughput: 0: 780.4, 1: 779.5. Samples: 901752. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:01:22,210][36967] Avg episode reward: [(0, '8.950'), (1, '9.160')]
[2023-09-22 12:01:22,219][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000007056_1806336.pth...
[2023-09-22 12:01:22,220][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000007056_1806336.pth...
[2023-09-22 12:01:22,255][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000004160_1064960.pth
[2023-09-22 12:01:22,258][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000004160_1064960.pth
[2023-09-22 12:01:22,263][37891] Saving new best policy, reward=9.160!
[2023-09-22 12:01:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6164.8). Total num frames: 3645440. Throughput: 0: 778.8, 1: 781.0. Samples: 911360. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:01:27,210][36967] Avg episode reward: [(0, '9.080'), (1, '9.160')]
[2023-09-22 12:01:32,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 3670016. Throughput: 0: 778.3, 1: 775.5. Samples: 915749. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:01:32,209][36967] Avg episode reward: [(0, '9.150'), (1, '8.830')]
[2023-09-22 12:01:33,681][38126] Updated weights for policy 0, policy_version 7200 (0.0018)
[2023-09-22 12:01:33,681][38127] Updated weights for policy 1, policy_version 7200 (0.0017)
[2023-09-22 12:01:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 3702784. Throughput: 0: 774.8, 1: 774.6. Samples: 925269. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:01:37,210][36967] Avg episode reward: [(0, '8.840'), (1, '8.660')]
[2023-09-22 12:01:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6164.8). Total num frames: 3735552. Throughput: 0: 778.3, 1: 778.4. Samples: 934603. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:01:42,209][36967] Avg episode reward: [(0, '9.190'), (1, '8.460')]
[2023-09-22 12:01:46,613][38126] Updated weights for policy 0, policy_version 7360 (0.0017)
[2023-09-22 12:01:46,613][38127] Updated weights for policy 1, policy_version 7360 (0.0019)
[2023-09-22 12:01:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 3768320. Throughput: 0: 779.2, 1: 779.3. Samples: 939497. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:01:47,210][36967] Avg episode reward: [(0, '9.350'), (1, '8.590')]
[2023-09-22 12:01:52,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 3801088. Throughput: 0: 782.8, 1: 780.3. Samples: 948917. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:01:52,210][36967] Avg episode reward: [(0, '9.160'), (1, '8.540')]
[2023-09-22 12:01:57,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 3833856. Throughput: 0: 785.8, 1: 788.0. Samples: 958464. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:01:57,209][36967] Avg episode reward: [(0, '9.580'), (1, '8.650')]
[2023-09-22 12:01:57,210][37819] Saving new best policy, reward=9.580!
[2023-09-22 12:01:59,605][38126] Updated weights for policy 0, policy_version 7520 (0.0016)
[2023-09-22 12:01:59,605][38127] Updated weights for policy 1, policy_version 7520 (0.0016)
[2023-09-22 12:02:02,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6164.8). Total num frames: 3858432. Throughput: 0: 783.9, 1: 783.4. Samples: 962976. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:02:02,210][36967] Avg episode reward: [(0, '9.270'), (1, '8.750')]
[2023-09-22 12:02:07,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 3891200. Throughput: 0: 788.2, 1: 789.2. Samples: 972737. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:02:07,210][36967] Avg episode reward: [(0, '9.040'), (1, '8.900')]
[2023-09-22 12:02:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 3923968. Throughput: 0: 782.7, 1: 780.3. Samples: 981697. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:02:12,209][36967] Avg episode reward: [(0, '9.360'), (1, '9.180')]
[2023-09-22 12:02:12,210][37891] Saving new best policy, reward=9.180!
[2023-09-22 12:02:12,813][38126] Updated weights for policy 0, policy_version 7680 (0.0012)
[2023-09-22 12:02:12,813][38127] Updated weights for policy 1, policy_version 7680 (0.0014)
[2023-09-22 12:02:17,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 3956736. Throughput: 0: 785.3, 1: 785.4. Samples: 986428. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:02:17,209][36967] Avg episode reward: [(0, '9.270'), (1, '8.950')]
[2023-09-22 12:02:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 3989504. Throughput: 0: 781.5, 1: 782.1. Samples: 995632. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:02:22,209][36967] Avg episode reward: [(0, '9.030'), (1, '9.260')]
[2023-09-22 12:02:22,218][37891] Saving new best policy, reward=9.260!
[2023-09-22 12:02:25,907][38126] Updated weights for policy 0, policy_version 7840 (0.0018)
[2023-09-22 12:02:25,907][38127] Updated weights for policy 1, policy_version 7840 (0.0018)
[2023-09-22 12:02:27,209][36967] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 4018176. Throughput: 0: 784.7, 1: 785.4. Samples: 1005256. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:02:27,210][36967] Avg episode reward: [(0, '9.180'), (1, '9.440')]
[2023-09-22 12:02:27,211][37891] Saving new best policy, reward=9.440!
[2023-09-22 12:02:32,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 4046848. Throughput: 0: 778.9, 1: 780.9. Samples: 1009686. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:02:32,209][36967] Avg episode reward: [(0, '9.010'), (1, '9.110')]
[2023-09-22 12:02:37,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6280.6, 300 sec: 6192.6). Total num frames: 4079616. Throughput: 0: 781.7, 1: 782.0. Samples: 1019282. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:02:37,209][36967] Avg episode reward: [(0, '8.870'), (1, '9.620')]
[2023-09-22 12:02:37,218][37891] Saving new best policy, reward=9.620!
[2023-09-22 12:02:39,108][38126] Updated weights for policy 0, policy_version 8000 (0.0015)
[2023-09-22 12:02:39,108][38127] Updated weights for policy 1, policy_version 8000 (0.0017)
[2023-09-22 12:02:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4112384. Throughput: 0: 778.5, 1: 776.0. Samples: 1028418. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:02:42,209][36967] Avg episode reward: [(0, '8.770'), (1, '9.590')]
[2023-09-22 12:02:47,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4145152. Throughput: 0: 779.9, 1: 779.6. Samples: 1033154. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:02:47,210][36967] Avg episode reward: [(0, '8.830'), (1, '9.230')]
[2023-09-22 12:02:52,102][38126] Updated weights for policy 0, policy_version 8160 (0.0015)
[2023-09-22 12:02:52,103][38127] Updated weights for policy 1, policy_version 8160 (0.0015)
[2023-09-22 12:02:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 4177920. Throughput: 0: 775.4, 1: 775.2. Samples: 1042514. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:02:52,209][36967] Avg episode reward: [(0, '8.950'), (1, '9.730')]
[2023-09-22 12:02:52,219][37891] Saving new best policy, reward=9.730!
[2023-09-22 12:02:57,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6206.5). Total num frames: 4202496. Throughput: 0: 783.2, 1: 782.5. Samples: 1052153. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:02:57,210][36967] Avg episode reward: [(0, '9.020'), (1, '9.990')]
[2023-09-22 12:02:57,302][37891] Saving new best policy, reward=9.990!
[2023-09-22 12:03:02,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4235264. Throughput: 0: 780.2, 1: 783.0. Samples: 1056772. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:03:02,209][36967] Avg episode reward: [(0, '8.800'), (1, '9.530')]
[2023-09-22 12:03:05,077][38126] Updated weights for policy 0, policy_version 8320 (0.0016)
[2023-09-22 12:03:05,078][38127] Updated weights for policy 1, policy_version 8320 (0.0019)
[2023-09-22 12:03:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4268032. Throughput: 0: 786.7, 1: 785.6. Samples: 1066385. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:03:07,210][36967] Avg episode reward: [(0, '9.100'), (1, '10.090')]
[2023-09-22 12:03:07,219][37891] Saving new best policy, reward=10.090!
[2023-09-22 12:03:12,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4300800. Throughput: 0: 781.9, 1: 781.5. Samples: 1075611. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:12,210][36967] Avg episode reward: [(0, '9.020'), (1, '9.980')]
[2023-09-22 12:03:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4333568. Throughput: 0: 787.6, 1: 785.5. Samples: 1080477. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:17,210][36967] Avg episode reward: [(0, '9.130'), (1, '10.510')]
[2023-09-22 12:03:17,211][37891] Saving new best policy, reward=10.510!
[2023-09-22 12:03:18,144][38126] Updated weights for policy 0, policy_version 8480 (0.0014)
[2023-09-22 12:03:18,145][38127] Updated weights for policy 1, policy_version 8480 (0.0016)
[2023-09-22 12:03:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4366336. Throughput: 0: 781.2, 1: 781.9. Samples: 1089624. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:22,210][36967] Avg episode reward: [(0, '8.710'), (1, '10.780')]
[2023-09-22 12:03:22,221][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000008528_2183168.pth...
[2023-09-22 12:03:22,222][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000008528_2183168.pth...
[2023-09-22 12:03:22,255][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000005600_1433600.pth
[2023-09-22 12:03:22,261][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000005600_1433600.pth
[2023-09-22 12:03:22,264][37891] Saving new best policy, reward=10.780!
[2023-09-22 12:03:27,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 4390912. Throughput: 0: 786.2, 1: 786.9. Samples: 1099204. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:27,210][36967] Avg episode reward: [(0, '8.700'), (1, '11.570')]
[2023-09-22 12:03:27,310][37891] Saving new best policy, reward=11.570!
[2023-09-22 12:03:31,222][38126] Updated weights for policy 0, policy_version 8640 (0.0015)
[2023-09-22 12:03:31,223][38127] Updated weights for policy 1, policy_version 8640 (0.0017)
[2023-09-22 12:03:32,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4423680. Throughput: 0: 784.4, 1: 787.2. Samples: 1103876. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:32,210][36967] Avg episode reward: [(0, '8.830'), (1, '11.400')]
[2023-09-22 12:03:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 4456448. Throughput: 0: 785.1, 1: 783.2. Samples: 1113087. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:37,210][36967] Avg episode reward: [(0, '9.530'), (1, '11.490')]
[2023-09-22 12:03:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4489216. Throughput: 0: 778.0, 1: 781.0. Samples: 1122305. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:42,210][36967] Avg episode reward: [(0, '9.350'), (1, '11.730')]
[2023-09-22 12:03:42,211][37891] Saving new best policy, reward=11.730!
[2023-09-22 12:03:44,495][38126] Updated weights for policy 0, policy_version 8800 (0.0020)
[2023-09-22 12:03:44,495][38127] Updated weights for policy 1, policy_version 8800 (0.0020)
[2023-09-22 12:03:47,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4521984. Throughput: 0: 782.0, 1: 779.4. Samples: 1127037. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:47,209][36967] Avg episode reward: [(0, '9.550'), (1, '11.640')]
[2023-09-22 12:03:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 4546560. Throughput: 0: 779.0, 1: 781.9. Samples: 1136626. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:52,210][36967] Avg episode reward: [(0, '9.600'), (1, '11.360')]
[2023-09-22 12:03:52,275][37819] Saving new best policy, reward=9.600!
[2023-09-22 12:03:57,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4579328. Throughput: 0: 781.6, 1: 781.8. Samples: 1145964. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:03:57,210][36967] Avg episode reward: [(0, '9.810'), (1, '11.110')]
[2023-09-22 12:03:57,211][37819] Saving new best policy, reward=9.810!
[2023-09-22 12:03:57,532][38127] Updated weights for policy 1, policy_version 8960 (0.0016)
[2023-09-22 12:03:57,533][38126] Updated weights for policy 0, policy_version 8960 (0.0017)
[2023-09-22 12:04:02,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4612096. Throughput: 0: 781.1, 1: 781.5. Samples: 1150797. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:04:02,209][36967] Avg episode reward: [(0, '9.910'), (1, '11.110')]
[2023-09-22 12:04:02,210][37819] Saving new best policy, reward=9.910!
[2023-09-22 12:04:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4644864. Throughput: 0: 783.0, 1: 782.4. Samples: 1160066. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:04:07,210][36967] Avg episode reward: [(0, '10.230'), (1, '10.930')]
[2023-09-22 12:04:07,222][37819] Saving new best policy, reward=10.230!
[2023-09-22 12:04:10,533][38127] Updated weights for policy 1, policy_version 9120 (0.0017)
[2023-09-22 12:04:10,533][38126] Updated weights for policy 0, policy_version 9120 (0.0017)
[2023-09-22 12:04:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 4677632. Throughput: 0: 780.3, 1: 781.0. Samples: 1169460. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:04:12,209][36967] Avg episode reward: [(0, '9.520'), (1, '10.400')]
[2023-09-22 12:04:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4710400. Throughput: 0: 783.2, 1: 780.6. Samples: 1174246. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:04:17,210][36967] Avg episode reward: [(0, '9.160'), (1, '10.280')]
[2023-09-22 12:04:22,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 4734976. Throughput: 0: 783.3, 1: 786.5. Samples: 1183727. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:04:22,210][36967] Avg episode reward: [(0, '9.030'), (1, '10.440')]
[2023-09-22 12:04:23,790][38127] Updated weights for policy 1, policy_version 9280 (0.0016)
[2023-09-22 12:04:23,790][38126] Updated weights for policy 0, policy_version 9280 (0.0018)
[2023-09-22 12:04:27,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4767744. Throughput: 0: 782.3, 1: 780.4. Samples: 1192625. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:04:27,210][36967] Avg episode reward: [(0, '9.290'), (1, '9.910')]
[2023-09-22 12:04:32,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 4800512. Throughput: 0: 783.0, 1: 783.4. Samples: 1197522. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:04:32,209][36967] Avg episode reward: [(0, '9.830'), (1, '10.160')]
[2023-09-22 12:04:36,961][38127] Updated weights for policy 1, policy_version 9440 (0.0017)
[2023-09-22 12:04:36,962][38126] Updated weights for policy 0, policy_version 9440 (0.0015)
[2023-09-22 12:04:37,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4833280. Throughput: 0: 778.7, 1: 776.3. Samples: 1206600. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:04:37,210][36967] Avg episode reward: [(0, '10.060'), (1, '10.080')]
[2023-09-22 12:04:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4866048. Throughput: 0: 782.7, 1: 782.1. Samples: 1216381. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:04:42,209][36967] Avg episode reward: [(0, '9.590'), (1, '10.230')]
[2023-09-22 12:04:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 4890624. Throughput: 0: 778.0, 1: 777.5. Samples: 1220793. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:04:47,210][36967] Avg episode reward: [(0, '9.790'), (1, '10.530')]
[2023-09-22 12:04:50,041][38126] Updated weights for policy 0, policy_version 9600 (0.0013)
[2023-09-22 12:04:50,041][38127] Updated weights for policy 1, policy_version 9600 (0.0015)
[2023-09-22 12:04:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 4923392. Throughput: 0: 780.5, 1: 780.3. Samples: 1230301. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:04:52,209][36967] Avg episode reward: [(0, '9.880'), (1, '11.120')]
[2023-09-22 12:04:57,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4956160. Throughput: 0: 779.2, 1: 777.8. Samples: 1239526. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:04:57,210][36967] Avg episode reward: [(0, '9.960'), (1, '10.220')]
[2023-09-22 12:05:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 4988928. Throughput: 0: 779.1, 1: 779.5. Samples: 1244383. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:05:02,209][36967] Avg episode reward: [(0, '10.370'), (1, '10.020')]
[2023-09-22 12:05:02,210][37819] Saving new best policy, reward=10.370!
[2023-09-22 12:05:03,063][38127] Updated weights for policy 1, policy_version 9760 (0.0015)
[2023-09-22 12:05:03,063][38126] Updated weights for policy 0, policy_version 9760 (0.0017)
[2023-09-22 12:05:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5021696. Throughput: 0: 778.9, 1: 776.5. Samples: 1253718. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:05:07,210][36967] Avg episode reward: [(0, '9.780'), (1, '10.400')]
[2023-09-22 12:05:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5054464. Throughput: 0: 787.7, 1: 786.5. Samples: 1263465. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:05:12,209][36967] Avg episode reward: [(0, '10.070'), (1, '10.130')]
[2023-09-22 12:05:16,151][38127] Updated weights for policy 1, policy_version 9920 (0.0017)
[2023-09-22 12:05:16,152][38126] Updated weights for policy 0, policy_version 9920 (0.0017)
[2023-09-22 12:05:17,209][36967] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5079040. Throughput: 0: 779.6, 1: 781.1. Samples: 1267753. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:05:17,209][36967] Avg episode reward: [(0, '9.800'), (1, '9.980')]
[2023-09-22 12:05:22,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5111808. Throughput: 0: 787.7, 1: 786.8. Samples: 1277453. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:05:22,210][36967] Avg episode reward: [(0, '9.870'), (1, '9.970')]
[2023-09-22 12:05:22,221][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000009984_2555904.pth...
[2023-09-22 12:05:22,221][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000009984_2555904.pth...
[2023-09-22 12:05:22,257][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000007056_1806336.pth
[2023-09-22 12:05:22,265][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000007056_1806336.pth
[2023-09-22 12:05:27,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5144576. Throughput: 0: 779.9, 1: 779.9. Samples: 1286573. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:05:27,209][36967] Avg episode reward: [(0, '9.610'), (1, '9.620')]
[2023-09-22 12:05:29,242][38126] Updated weights for policy 0, policy_version 10080 (0.0015)
[2023-09-22 12:05:29,242][38127] Updated weights for policy 1, policy_version 10080 (0.0017)
[2023-09-22 12:05:32,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5177344. Throughput: 0: 783.8, 1: 783.3. Samples: 1291311. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:05:32,209][36967] Avg episode reward: [(0, '9.430'), (1, '10.270')]
[2023-09-22 12:05:37,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 5210112. Throughput: 0: 783.3, 1: 783.2. Samples: 1300794. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:05:37,209][36967] Avg episode reward: [(0, '9.870'), (1, '10.070')]
[2023-09-22 12:05:42,193][38126] Updated weights for policy 0, policy_version 10240 (0.0017)
[2023-09-22 12:05:42,193][38127] Updated weights for policy 1, policy_version 10240 (0.0016)
[2023-09-22 12:05:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5242880. Throughput: 0: 788.4, 1: 788.1. Samples: 1310466. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:05:42,209][36967] Avg episode reward: [(0, '10.830'), (1, '10.070')]
[2023-09-22 12:05:42,210][37819] Saving new best policy, reward=10.830!
[2023-09-22 12:05:47,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5267456. Throughput: 0: 783.4, 1: 783.8. Samples: 1314906. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:05:47,210][36967] Avg episode reward: [(0, '10.910'), (1, '10.660')]
[2023-09-22 12:05:47,345][37819] Saving new best policy, reward=10.910!
[2023-09-22 12:05:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5300224. Throughput: 0: 786.7, 1: 786.8. Samples: 1324529. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:05:52,210][36967] Avg episode reward: [(0, '10.590'), (1, '10.920')]
[2023-09-22 12:05:55,330][38127] Updated weights for policy 1, policy_version 10400 (0.0017)
[2023-09-22 12:05:55,330][38126] Updated weights for policy 0, policy_version 10400 (0.0019)
[2023-09-22 12:05:57,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5332992. Throughput: 0: 777.4, 1: 777.9. Samples: 1333452. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:05:57,209][36967] Avg episode reward: [(0, '10.400'), (1, '11.240')]
[2023-09-22 12:06:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5365760. Throughput: 0: 780.5, 1: 778.6. Samples: 1337913. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:02,210][36967] Avg episode reward: [(0, '10.690'), (1, '12.020')]
[2023-09-22 12:06:02,212][37891] Saving new best policy, reward=12.020!
[2023-09-22 12:06:07,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5390336. Throughput: 0: 777.5, 1: 778.4. Samples: 1347469. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:07,210][36967] Avg episode reward: [(0, '10.870'), (1, '11.900')]
[2023-09-22 12:06:08,642][38127] Updated weights for policy 1, policy_version 10560 (0.0015)
[2023-09-22 12:06:08,643][38126] Updated weights for policy 0, policy_version 10560 (0.0016)
[2023-09-22 12:06:12,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5423104. Throughput: 0: 781.4, 1: 781.6. Samples: 1356906. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:06:12,210][36967] Avg episode reward: [(0, '11.670'), (1, '12.000')]
[2023-09-22 12:06:12,210][37819] Saving new best policy, reward=11.670!
[2023-09-22 12:06:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5455872. Throughput: 0: 782.9, 1: 784.4. Samples: 1361842. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:06:17,210][36967] Avg episode reward: [(0, '11.630'), (1, '12.160')]
[2023-09-22 12:06:17,211][37891] Saving new best policy, reward=12.160!
[2023-09-22 12:06:21,731][38126] Updated weights for policy 0, policy_version 10720 (0.0014)
[2023-09-22 12:06:21,731][38127] Updated weights for policy 1, policy_version 10720 (0.0016)
[2023-09-22 12:06:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 5488640. Throughput: 0: 780.2, 1: 779.7. Samples: 1370990. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:22,209][36967] Avg episode reward: [(0, '11.460'), (1, '11.780')]
[2023-09-22 12:06:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5521408. Throughput: 0: 775.1, 1: 778.0. Samples: 1380354. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:27,209][36967] Avg episode reward: [(0, '11.410'), (1, '11.630')]
[2023-09-22 12:06:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5554176. Throughput: 0: 779.1, 1: 778.2. Samples: 1384983. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:32,209][36967] Avg episode reward: [(0, '11.520'), (1, '10.880')]
[2023-09-22 12:06:34,809][38126] Updated weights for policy 0, policy_version 10880 (0.0017)
[2023-09-22 12:06:34,810][38127] Updated weights for policy 1, policy_version 10880 (0.0017)
[2023-09-22 12:06:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5578752. Throughput: 0: 777.7, 1: 777.6. Samples: 1394517. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:37,209][36967] Avg episode reward: [(0, '11.370'), (1, '11.140')]
[2023-09-22 12:06:42,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5611520. Throughput: 0: 781.8, 1: 781.6. Samples: 1403804. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:42,210][36967] Avg episode reward: [(0, '10.390'), (1, '11.350')]
[2023-09-22 12:06:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5644288. Throughput: 0: 786.1, 1: 786.6. Samples: 1408684. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:47,210][36967] Avg episode reward: [(0, '9.600'), (1, '11.770')]
[2023-09-22 12:06:47,836][38126] Updated weights for policy 0, policy_version 11040 (0.0015)
[2023-09-22 12:06:47,837][38127] Updated weights for policy 1, policy_version 11040 (0.0017)
[2023-09-22 12:06:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5677056. Throughput: 0: 781.8, 1: 782.1. Samples: 1417846. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:06:52,210][36967] Avg episode reward: [(0, '9.330'), (1, '11.520')]
[2023-09-22 12:06:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 5709824. Throughput: 0: 782.5, 1: 784.6. Samples: 1427428. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:06:57,209][36967] Avg episode reward: [(0, '9.400'), (1, '11.950')]
[2023-09-22 12:07:00,948][38127] Updated weights for policy 1, policy_version 11200 (0.0013)
[2023-09-22 12:07:00,948][38126] Updated weights for policy 0, policy_version 11200 (0.0015)
[2023-09-22 12:07:02,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5734400. Throughput: 0: 778.0, 1: 777.1. Samples: 1431821. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:07:02,210][36967] Avg episode reward: [(0, '9.390'), (1, '11.840')]
[2023-09-22 12:07:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5767168. Throughput: 0: 783.3, 1: 784.4. Samples: 1441533. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:07:07,209][36967] Avg episode reward: [(0, '10.290'), (1, '12.060')]
[2023-09-22 12:07:12,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5799936. Throughput: 0: 781.7, 1: 779.0. Samples: 1450586. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:07:12,209][36967] Avg episode reward: [(0, '10.500'), (1, '12.590')]
[2023-09-22 12:07:12,210][37891] Saving new best policy, reward=12.590!
[2023-09-22 12:07:14,096][38127] Updated weights for policy 1, policy_version 11360 (0.0015)
[2023-09-22 12:07:14,097][38126] Updated weights for policy 0, policy_version 11360 (0.0015)
[2023-09-22 12:07:17,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5832704. Throughput: 0: 783.4, 1: 783.7. Samples: 1455503. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:07:17,210][36967] Avg episode reward: [(0, '10.570'), (1, '12.340')]
[2023-09-22 12:07:22,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 5865472. Throughput: 0: 778.2, 1: 778.4. Samples: 1464564. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:07:22,210][36967] Avg episode reward: [(0, '10.450'), (1, '12.430')]
[2023-09-22 12:07:22,221][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000011456_2932736.pth...
[2023-09-22 12:07:22,221][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000011456_2932736.pth...
[2023-09-22 12:07:22,256][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000008528_2183168.pth
[2023-09-22 12:07:22,259][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000008528_2183168.pth
[2023-09-22 12:07:27,209][36967] Fps is (10 sec: 6143.9, 60 sec: 6212.2, 300 sec: 6262.0). Total num frames: 5894144. Throughput: 0: 784.0, 1: 783.4. Samples: 1474334. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:07:27,210][36967] Avg episode reward: [(0, '10.620'), (1, '12.860')]
[2023-09-22 12:07:27,213][37891] Saving new best policy, reward=12.860!
[2023-09-22 12:07:27,216][38126] Updated weights for policy 0, policy_version 11520 (0.0016)
[2023-09-22 12:07:27,216][38127] Updated weights for policy 1, policy_version 11520 (0.0016)
[2023-09-22 12:07:32,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 5922816. Throughput: 0: 776.3, 1: 778.0. Samples: 1478629. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:07:32,210][36967] Avg episode reward: [(0, '10.870'), (1, '12.880')]
[2023-09-22 12:07:32,211][37891] Saving new best policy, reward=12.880!
[2023-09-22 12:07:37,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 5955584. Throughput: 0: 775.2, 1: 775.2. Samples: 1487610. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:07:37,210][36967] Avg episode reward: [(0, '9.410'), (1, '13.510')]
[2023-09-22 12:07:37,219][37891] Saving new best policy, reward=13.510!
[2023-09-22 12:07:40,695][38126] Updated weights for policy 0, policy_version 11680 (0.0016)
[2023-09-22 12:07:40,695][38127] Updated weights for policy 1, policy_version 11680 (0.0016)
[2023-09-22 12:07:42,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 5988352. Throughput: 0: 773.7, 1: 774.2. Samples: 1497081. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:07:42,209][36967] Avg episode reward: [(0, '9.630'), (1, '13.300')]
[2023-09-22 12:07:47,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 6017024. Throughput: 0: 775.3, 1: 775.2. Samples: 1501593. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:07:47,210][36967] Avg episode reward: [(0, '9.390'), (1, '13.030')]
[2023-09-22 12:07:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6045696. Throughput: 0: 771.7, 1: 772.3. Samples: 1511014. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:07:52,209][36967] Avg episode reward: [(0, '9.050'), (1, '12.590')]
[2023-09-22 12:07:53,891][38127] Updated weights for policy 1, policy_version 11840 (0.0016)
[2023-09-22 12:07:53,892][38126] Updated weights for policy 0, policy_version 11840 (0.0014)
[2023-09-22 12:07:57,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6078464. Throughput: 0: 773.5, 1: 773.6. Samples: 1520207. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:07:57,209][36967] Avg episode reward: [(0, '9.180'), (1, '12.540')]
[2023-09-22 12:08:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 6111232. Throughput: 0: 774.0, 1: 774.5. Samples: 1525184. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:08:02,209][36967] Avg episode reward: [(0, '9.220'), (1, '11.920')]
[2023-09-22 12:08:06,951][38126] Updated weights for policy 0, policy_version 12000 (0.0017)
[2023-09-22 12:08:06,951][38127] Updated weights for policy 1, policy_version 12000 (0.0017)
[2023-09-22 12:08:07,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6144000. Throughput: 0: 776.0, 1: 775.7. Samples: 1534388. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:08:07,210][36967] Avg episode reward: [(0, '9.880'), (1, '12.240')]
[2023-09-22 12:08:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6176768. Throughput: 0: 773.1, 1: 774.7. Samples: 1543982. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:08:12,209][36967] Avg episode reward: [(0, '9.360'), (1, '11.630')]
[2023-09-22 12:08:17,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6201344. Throughput: 0: 776.3, 1: 774.4. Samples: 1548409. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:08:17,210][36967] Avg episode reward: [(0, '9.980'), (1, '12.530')]
[2023-09-22 12:08:19,937][38126] Updated weights for policy 0, policy_version 12160 (0.0019)
[2023-09-22 12:08:19,937][38127] Updated weights for policy 1, policy_version 12160 (0.0017)
[2023-09-22 12:08:22,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6234112. Throughput: 0: 784.8, 1: 784.3. Samples: 1558218. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:08:22,210][36967] Avg episode reward: [(0, '10.170'), (1, '12.910')]
[2023-09-22 12:08:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 6266880. Throughput: 0: 783.1, 1: 781.6. Samples: 1567491. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:08:27,210][36967] Avg episode reward: [(0, '10.490'), (1, '12.690')]
[2023-09-22 12:08:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6299648. Throughput: 0: 785.0, 1: 786.3. Samples: 1572304. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:08:32,210][36967] Avg episode reward: [(0, '10.850'), (1, '13.220')]
[2023-09-22 12:08:33,022][38126] Updated weights for policy 0, policy_version 12320 (0.0016)
[2023-09-22 12:08:33,022][38127] Updated weights for policy 1, policy_version 12320 (0.0016)
[2023-09-22 12:08:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6332416. Throughput: 0: 783.0, 1: 782.3. Samples: 1581451. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:08:37,210][36967] Avg episode reward: [(0, '10.420'), (1, '13.250')]
[2023-09-22 12:08:42,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6365184. Throughput: 0: 788.6, 1: 787.8. Samples: 1591145. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:08:42,209][36967] Avg episode reward: [(0, '11.110'), (1, '13.730')]
[2023-09-22 12:08:42,210][37891] Saving new best policy, reward=13.730!
[2023-09-22 12:08:46,169][38126] Updated weights for policy 0, policy_version 12480 (0.0019)
[2023-09-22 12:08:46,169][38127] Updated weights for policy 1, policy_version 12480 (0.0014)
[2023-09-22 12:08:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 6389760. Throughput: 0: 781.9, 1: 781.6. Samples: 1595538. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:08:47,210][36967] Avg episode reward: [(0, '10.890'), (1, '14.170')]
[2023-09-22 12:08:47,211][37891] Saving new best policy, reward=14.170!
[2023-09-22 12:08:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6422528. Throughput: 0: 778.4, 1: 779.5. Samples: 1604496. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:08:52,209][36967] Avg episode reward: [(0, '12.070'), (1, '14.930')]
[2023-09-22 12:08:52,219][37819] Saving new best policy, reward=12.070!
[2023-09-22 12:08:52,219][37891] Saving new best policy, reward=14.930!
[2023-09-22 12:08:57,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6455296. Throughput: 0: 776.2, 1: 777.0. Samples: 1613876. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:08:57,209][36967] Avg episode reward: [(0, '11.700'), (1, '15.600')]
[2023-09-22 12:08:57,210][37891] Saving new best policy, reward=15.600!
[2023-09-22 12:08:59,513][38127] Updated weights for policy 1, policy_version 12640 (0.0016)
[2023-09-22 12:08:59,513][38126] Updated weights for policy 0, policy_version 12640 (0.0017)
[2023-09-22 12:09:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6488064. Throughput: 0: 778.8, 1: 778.6. Samples: 1618493. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:09:02,209][36967] Avg episode reward: [(0, '11.980'), (1, '14.410')]
[2023-09-22 12:09:07,209][36967] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 6516736. Throughput: 0: 775.8, 1: 778.5. Samples: 1628161. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:07,210][36967] Avg episode reward: [(0, '11.570'), (1, '14.700')]
[2023-09-22 12:09:12,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6545408. Throughput: 0: 779.4, 1: 778.9. Samples: 1637611. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:12,209][36967] Avg episode reward: [(0, '11.620'), (1, '15.420')]
[2023-09-22 12:09:12,441][38127] Updated weights for policy 1, policy_version 12800 (0.0016)
[2023-09-22 12:09:12,441][38126] Updated weights for policy 0, policy_version 12800 (0.0016)
[2023-09-22 12:09:17,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6578176. Throughput: 0: 779.2, 1: 780.1. Samples: 1642471. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:17,209][36967] Avg episode reward: [(0, '11.170'), (1, '15.890')]
[2023-09-22 12:09:17,210][37891] Saving new best policy, reward=15.890!
[2023-09-22 12:09:22,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6610944. Throughput: 0: 781.5, 1: 781.4. Samples: 1651781. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:22,210][36967] Avg episode reward: [(0, '11.520'), (1, '16.340')]
[2023-09-22 12:09:22,221][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000012912_3305472.pth...
[2023-09-22 12:09:22,222][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000012912_3305472.pth...
[2023-09-22 12:09:22,253][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000009984_2555904.pth
[2023-09-22 12:09:22,257][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000009984_2555904.pth
[2023-09-22 12:09:22,257][37891] Saving new best policy, reward=16.340!
[2023-09-22 12:09:25,619][38127] Updated weights for policy 1, policy_version 12960 (0.0016)
[2023-09-22 12:09:25,619][38126] Updated weights for policy 0, policy_version 12960 (0.0016)
[2023-09-22 12:09:27,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6643712. Throughput: 0: 773.7, 1: 776.7. Samples: 1660913. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:27,210][36967] Avg episode reward: [(0, '10.730'), (1, '16.650')]
[2023-09-22 12:09:27,211][37891] Saving new best policy, reward=16.650!
[2023-09-22 12:09:32,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6668288. Throughput: 0: 773.9, 1: 773.6. Samples: 1665174. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:32,210][36967] Avg episode reward: [(0, '10.260'), (1, '16.960')]
[2023-09-22 12:09:32,308][37891] Saving new best policy, reward=16.960!
[2023-09-22 12:09:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6701056. Throughput: 0: 784.0, 1: 783.0. Samples: 1675011. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:37,210][36967] Avg episode reward: [(0, '10.020'), (1, '15.530')]
[2023-09-22 12:09:38,741][38127] Updated weights for policy 1, policy_version 13120 (0.0018)
[2023-09-22 12:09:38,742][38126] Updated weights for policy 0, policy_version 13120 (0.0015)
[2023-09-22 12:09:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 6733824. Throughput: 0: 782.8, 1: 781.4. Samples: 1684267. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:42,210][36967] Avg episode reward: [(0, '10.130'), (1, '14.780')]
[2023-09-22 12:09:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6766592. Throughput: 0: 783.8, 1: 783.8. Samples: 1689037. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:09:47,210][36967] Avg episode reward: [(0, '10.060'), (1, '15.120')]
[2023-09-22 12:09:51,970][38127] Updated weights for policy 1, policy_version 13280 (0.0019)
[2023-09-22 12:09:51,970][38126] Updated weights for policy 0, policy_version 13280 (0.0017)
[2023-09-22 12:09:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6799360. Throughput: 0: 779.8, 1: 776.9. Samples: 1698211. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:09:52,209][36967] Avg episode reward: [(0, '9.450'), (1, '15.230')]
[2023-09-22 12:09:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6823936. Throughput: 0: 777.0, 1: 776.9. Samples: 1707540. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:09:57,210][36967] Avg episode reward: [(0, '9.210'), (1, '15.060')]
[2023-09-22 12:10:02,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 6856704. Throughput: 0: 773.7, 1: 774.2. Samples: 1712128. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:02,209][36967] Avg episode reward: [(0, '9.700'), (1, '14.200')]
[2023-09-22 12:10:05,220][38126] Updated weights for policy 0, policy_version 13440 (0.0016)
[2023-09-22 12:10:05,221][38127] Updated weights for policy 1, policy_version 13440 (0.0016)
[2023-09-22 12:10:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 6889472. Throughput: 0: 774.3, 1: 773.8. Samples: 1721449. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:07,209][36967] Avg episode reward: [(0, '9.990'), (1, '15.370')]
[2023-09-22 12:10:12,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6922240. Throughput: 0: 774.3, 1: 774.0. Samples: 1730588. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:12,210][36967] Avg episode reward: [(0, '10.680'), (1, '15.420')]
[2023-09-22 12:10:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 6955008. Throughput: 0: 781.4, 1: 781.4. Samples: 1735504. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:10:17,209][36967] Avg episode reward: [(0, '10.910'), (1, '15.320')]
[2023-09-22 12:10:18,291][38127] Updated weights for policy 1, policy_version 13600 (0.0018)
[2023-09-22 12:10:18,291][38126] Updated weights for policy 0, policy_version 13600 (0.0016)
[2023-09-22 12:10:22,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 6983680. Throughput: 0: 775.2, 1: 777.9. Samples: 1744897. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:10:22,209][36967] Avg episode reward: [(0, '11.710'), (1, '15.250')]
[2023-09-22 12:10:27,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7012352. Throughput: 0: 779.4, 1: 779.5. Samples: 1754416. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:10:27,209][36967] Avg episode reward: [(0, '11.570'), (1, '17.230')]
[2023-09-22 12:10:27,400][37891] Saving new best policy, reward=17.230!
[2023-09-22 12:10:31,441][38126] Updated weights for policy 0, policy_version 13760 (0.0016)
[2023-09-22 12:10:31,442][38127] Updated weights for policy 1, policy_version 13760 (0.0017)
[2023-09-22 12:10:32,209][36967] Fps is (10 sec: 6143.9, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7045120. Throughput: 0: 778.6, 1: 779.5. Samples: 1759153. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:32,210][36967] Avg episode reward: [(0, '11.620'), (1, '17.060')]
[2023-09-22 12:10:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 7077888. Throughput: 0: 778.2, 1: 779.8. Samples: 1768324. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:37,209][36967] Avg episode reward: [(0, '11.590'), (1, '17.450')]
[2023-09-22 12:10:37,219][37891] Saving new best policy, reward=17.450!
[2023-09-22 12:10:42,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 7110656. Throughput: 0: 778.0, 1: 779.8. Samples: 1777640. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:42,209][36967] Avg episode reward: [(0, '10.940'), (1, '16.820')]
[2023-09-22 12:10:44,574][38127] Updated weights for policy 1, policy_version 13920 (0.0017)
[2023-09-22 12:10:44,575][38126] Updated weights for policy 0, policy_version 13920 (0.0018)
[2023-09-22 12:10:47,209][36967] Fps is (10 sec: 6143.9, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 7139328. Throughput: 0: 780.2, 1: 777.8. Samples: 1782237. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:47,210][36967] Avg episode reward: [(0, '11.100'), (1, '17.630')]
[2023-09-22 12:10:47,236][37891] Saving new best policy, reward=17.630!
[2023-09-22 12:10:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7168000. Throughput: 0: 782.2, 1: 781.0. Samples: 1791790. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:52,209][36967] Avg episode reward: [(0, '11.510'), (1, '17.550')]
[2023-09-22 12:10:57,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 7200768. Throughput: 0: 784.1, 1: 782.2. Samples: 1801072. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:10:57,209][36967] Avg episode reward: [(0, '11.250'), (1, '17.770')]
[2023-09-22 12:10:57,210][37891] Saving new best policy, reward=17.770!
[2023-09-22 12:10:57,688][38127] Updated weights for policy 1, policy_version 14080 (0.0019)
[2023-09-22 12:10:57,688][38126] Updated weights for policy 0, policy_version 14080 (0.0019)
[2023-09-22 12:11:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7233536. Throughput: 0: 781.8, 1: 782.2. Samples: 1805881. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:11:02,209][36967] Avg episode reward: [(0, '11.380'), (1, '17.640')]
[2023-09-22 12:11:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7266304. Throughput: 0: 779.0, 1: 776.7. Samples: 1814906. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:11:07,209][36967] Avg episode reward: [(0, '11.640'), (1, '17.490')]
[2023-09-22 12:11:10,798][38126] Updated weights for policy 0, policy_version 14240 (0.0015)
[2023-09-22 12:11:10,798][38127] Updated weights for policy 1, policy_version 14240 (0.0016)
[2023-09-22 12:11:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 7299072. Throughput: 0: 780.4, 1: 780.8. Samples: 1824669. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:11:12,209][36967] Avg episode reward: [(0, '11.750'), (1, '16.670')]
[2023-09-22 12:11:17,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7323648. Throughput: 0: 778.5, 1: 777.2. Samples: 1829160. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:11:17,210][36967] Avg episode reward: [(0, '11.800'), (1, '16.450')]
[2023-09-22 12:11:22,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 7356416. Throughput: 0: 783.9, 1: 781.1. Samples: 1838750. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:11:22,209][36967] Avg episode reward: [(0, '12.280'), (1, '16.440')]
[2023-09-22 12:11:22,216][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000014368_3678208.pth...
[2023-09-22 12:11:22,216][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000014368_3678208.pth...
[2023-09-22 12:11:22,244][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000011456_2932736.pth
[2023-09-22 12:11:22,256][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000011456_2932736.pth
[2023-09-22 12:11:22,261][37819] Saving new best policy, reward=12.280!
[2023-09-22 12:11:23,893][38127] Updated weights for policy 1, policy_version 14400 (0.0017)
[2023-09-22 12:11:23,893][38126] Updated weights for policy 0, policy_version 14400 (0.0015)
[2023-09-22 12:11:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7389184. Throughput: 0: 781.0, 1: 778.6. Samples: 1847822. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:11:27,209][36967] Avg episode reward: [(0, '11.850'), (1, '15.560')]
[2023-09-22 12:11:32,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7421952. Throughput: 0: 779.3, 1: 780.0. Samples: 1852408. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:11:32,210][36967] Avg episode reward: [(0, '11.560'), (1, '15.630')]
[2023-09-22 12:11:37,172][38126] Updated weights for policy 0, policy_version 14560 (0.0015)
[2023-09-22 12:11:37,173][38127] Updated weights for policy 1, policy_version 14560 (0.0016)
[2023-09-22 12:11:37,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7454720. Throughput: 0: 774.1, 1: 778.1. Samples: 1861636. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:11:37,210][36967] Avg episode reward: [(0, '11.320'), (1, '16.110')]
[2023-09-22 12:11:42,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7479296. Throughput: 0: 781.0, 1: 781.0. Samples: 1871358. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:11:42,210][36967] Avg episode reward: [(0, '11.360'), (1, '16.790')]
[2023-09-22 12:11:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 7512064. Throughput: 0: 777.8, 1: 779.9. Samples: 1875978. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:11:47,210][36967] Avg episode reward: [(0, '11.560'), (1, '16.740')]
[2023-09-22 12:11:50,152][38126] Updated weights for policy 0, policy_version 14720 (0.0017)
[2023-09-22 12:11:50,152][38127] Updated weights for policy 1, policy_version 14720 (0.0017)
[2023-09-22 12:11:52,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 7544832. Throughput: 0: 783.6, 1: 783.2. Samples: 1885415. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:11:52,210][36967] Avg episode reward: [(0, '11.390'), (1, '16.570')]
[2023-09-22 12:11:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7577600. Throughput: 0: 774.1, 1: 775.8. Samples: 1894417. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:11:57,210][36967] Avg episode reward: [(0, '11.320'), (1, '16.430')]
[2023-09-22 12:12:02,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7610368. Throughput: 0: 777.5, 1: 777.4. Samples: 1899131. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:12:02,210][36967] Avg episode reward: [(0, '11.230'), (1, '16.730')]
[2023-09-22 12:12:03,370][38127] Updated weights for policy 1, policy_version 14880 (0.0017)
[2023-09-22 12:12:03,370][38126] Updated weights for policy 0, policy_version 14880 (0.0017)
[2023-09-22 12:12:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7634944. Throughput: 0: 775.6, 1: 779.3. Samples: 1908719. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:12:07,210][36967] Avg episode reward: [(0, '11.560'), (1, '16.210')]
[2023-09-22 12:12:12,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7667712. Throughput: 0: 778.7, 1: 779.4. Samples: 1917937. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:12,209][36967] Avg episode reward: [(0, '12.010'), (1, '16.270')]
[2023-09-22 12:12:16,454][38126] Updated weights for policy 0, policy_version 15040 (0.0016)
[2023-09-22 12:12:16,454][38127] Updated weights for policy 1, policy_version 15040 (0.0017)
[2023-09-22 12:12:17,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 7700480. Throughput: 0: 783.4, 1: 782.2. Samples: 1922860. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:17,209][36967] Avg episode reward: [(0, '11.800'), (1, '16.030')]
[2023-09-22 12:12:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 7733248. Throughput: 0: 785.8, 1: 783.6. Samples: 1932257. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:22,209][36967] Avg episode reward: [(0, '12.110'), (1, '15.170')]
[2023-09-22 12:12:27,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7766016. Throughput: 0: 778.5, 1: 780.6. Samples: 1941518. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:27,210][36967] Avg episode reward: [(0, '11.750'), (1, '15.080')]
[2023-09-22 12:12:29,532][38126] Updated weights for policy 0, policy_version 15200 (0.0017)
[2023-09-22 12:12:29,532][38127] Updated weights for policy 1, policy_version 15200 (0.0016)
[2023-09-22 12:12:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 7798784. Throughput: 0: 781.1, 1: 778.2. Samples: 1946147. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:32,209][36967] Avg episode reward: [(0, '11.860'), (1, '14.950')]
[2023-09-22 12:12:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 7823360. Throughput: 0: 781.1, 1: 783.1. Samples: 1955807. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:37,210][36967] Avg episode reward: [(0, '11.190'), (1, '15.170')]
[2023-09-22 12:12:42,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6234.3). Total num frames: 7856128. Throughput: 0: 786.6, 1: 785.1. Samples: 1965142. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:42,209][36967] Avg episode reward: [(0, '10.930'), (1, '15.070')]
[2023-09-22 12:12:42,576][38127] Updated weights for policy 1, policy_version 15360 (0.0017)
[2023-09-22 12:12:42,576][38126] Updated weights for policy 0, policy_version 15360 (0.0017)
[2023-09-22 12:12:47,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 7888896. Throughput: 0: 787.4, 1: 787.7. Samples: 1970010. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:47,209][36967] Avg episode reward: [(0, '10.860'), (1, '15.610')]
[2023-09-22 12:12:52,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7921664. Throughput: 0: 782.9, 1: 780.4. Samples: 1979066. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:52,210][36967] Avg episode reward: [(0, '11.040'), (1, '16.170')]
[2023-09-22 12:12:55,757][38127] Updated weights for policy 1, policy_version 15520 (0.0015)
[2023-09-22 12:12:55,758][38126] Updated weights for policy 0, policy_version 15520 (0.0016)
[2023-09-22 12:12:57,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 7954432. Throughput: 0: 784.1, 1: 785.0. Samples: 1988548. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:12:57,210][36967] Avg episode reward: [(0, '11.690'), (1, '16.900')]
[2023-09-22 12:13:02,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 7983104. Throughput: 0: 779.4, 1: 779.9. Samples: 1993029. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:13:02,210][36967] Avg episode reward: [(0, '11.720'), (1, '17.420')]
[2023-09-22 12:13:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8011776. Throughput: 0: 783.8, 1: 783.9. Samples: 2002807. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:13:07,210][36967] Avg episode reward: [(0, '12.240'), (1, '17.960')]
[2023-09-22 12:13:07,381][37891] Saving new best policy, reward=17.960!
[2023-09-22 12:13:08,681][38126] Updated weights for policy 0, policy_version 15680 (0.0018)
[2023-09-22 12:13:08,681][38127] Updated weights for policy 1, policy_version 15680 (0.0017)
[2023-09-22 12:13:12,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8044544. Throughput: 0: 783.6, 1: 781.4. Samples: 2011944. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:13:12,210][36967] Avg episode reward: [(0, '12.390'), (1, '18.390')]
[2023-09-22 12:13:12,211][37819] Saving new best policy, reward=12.390!
[2023-09-22 12:13:12,211][37891] Saving new best policy, reward=18.390!
[2023-09-22 12:13:17,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8077312. Throughput: 0: 784.0, 1: 784.5. Samples: 2016731. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:13:17,210][36967] Avg episode reward: [(0, '12.000'), (1, '19.110')]
[2023-09-22 12:13:17,210][37891] Saving new best policy, reward=19.110!
[2023-09-22 12:13:21,830][38127] Updated weights for policy 1, policy_version 15840 (0.0016)
[2023-09-22 12:13:21,831][38126] Updated weights for policy 0, policy_version 15840 (0.0016)
[2023-09-22 12:13:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8110080. Throughput: 0: 781.3, 1: 779.5. Samples: 2026042. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:13:22,209][36967] Avg episode reward: [(0, '11.440'), (1, '19.520')]
[2023-09-22 12:13:22,218][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000015840_4055040.pth...
[2023-09-22 12:13:22,218][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000015840_4055040.pth...
[2023-09-22 12:13:22,247][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000012912_3305472.pth
[2023-09-22 12:13:22,250][37891] Saving new best policy, reward=19.520!
[2023-09-22 12:13:22,255][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000012912_3305472.pth
[2023-09-22 12:13:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8142848. Throughput: 0: 782.8, 1: 782.4. Samples: 2035576. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:13:27,209][36967] Avg episode reward: [(0, '11.340'), (1, '19.650')]
[2023-09-22 12:13:27,210][37891] Saving new best policy, reward=19.650!
[2023-09-22 12:13:32,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 8167424. Throughput: 0: 777.0, 1: 777.1. Samples: 2039948. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:13:32,210][36967] Avg episode reward: [(0, '10.370'), (1, '19.310')]
[2023-09-22 12:13:34,862][38126] Updated weights for policy 0, policy_version 16000 (0.0018)
[2023-09-22 12:13:34,862][38127] Updated weights for policy 1, policy_version 16000 (0.0017)
[2023-09-22 12:13:37,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 8200192. Throughput: 0: 786.3, 1: 786.7. Samples: 2049851. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:13:37,210][36967] Avg episode reward: [(0, '10.920'), (1, '20.260')]
[2023-09-22 12:13:37,413][37891] Saving new best policy, reward=20.260!
[2023-09-22 12:13:42,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8232960. Throughput: 0: 783.7, 1: 783.2. Samples: 2059058. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:13:42,209][36967] Avg episode reward: [(0, '11.220'), (1, '21.300')]
[2023-09-22 12:13:42,210][37891] Saving new best policy, reward=21.300!
[2023-09-22 12:13:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8265728. Throughput: 0: 787.8, 1: 788.2. Samples: 2063950. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:13:47,210][36967] Avg episode reward: [(0, '11.420'), (1, '21.920')]
[2023-09-22 12:13:47,211][37891] Saving new best policy, reward=21.920!
[2023-09-22 12:13:47,874][38126] Updated weights for policy 0, policy_version 16160 (0.0016)
[2023-09-22 12:13:47,874][38127] Updated weights for policy 1, policy_version 16160 (0.0016)
[2023-09-22 12:13:52,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 8298496. Throughput: 0: 783.5, 1: 783.1. Samples: 2073305. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:13:52,209][36967] Avg episode reward: [(0, '11.410'), (1, '21.030')]
[2023-09-22 12:13:57,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8331264. Throughput: 0: 786.2, 1: 788.0. Samples: 2082780. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:13:57,209][36967] Avg episode reward: [(0, '11.780'), (1, '20.880')]
[2023-09-22 12:14:00,923][38127] Updated weights for policy 1, policy_version 16320 (0.0016)
[2023-09-22 12:14:00,924][38126] Updated weights for policy 0, policy_version 16320 (0.0017)
[2023-09-22 12:14:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6348.8, 300 sec: 6262.0). Total num frames: 8364032. Throughput: 0: 783.8, 1: 783.8. Samples: 2087273. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:02,210][36967] Avg episode reward: [(0, '11.940'), (1, '21.390')]
[2023-09-22 12:14:07,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8388608. Throughput: 0: 788.6, 1: 789.3. Samples: 2097044. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:14:07,210][36967] Avg episode reward: [(0, '12.430'), (1, '22.790')]
[2023-09-22 12:14:07,386][37819] Saving new best policy, reward=12.430!
[2023-09-22 12:14:07,405][37891] Saving new best policy, reward=22.790!
[2023-09-22 12:14:12,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 8421376. Throughput: 0: 785.5, 1: 785.2. Samples: 2106257. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:14:12,209][36967] Avg episode reward: [(0, '12.710'), (1, '22.920')]
[2023-09-22 12:14:12,210][37819] Saving new best policy, reward=12.710!
[2023-09-22 12:14:12,210][37891] Saving new best policy, reward=22.920!
[2023-09-22 12:14:13,912][38127] Updated weights for policy 1, policy_version 16480 (0.0017)
[2023-09-22 12:14:13,912][38126] Updated weights for policy 0, policy_version 16480 (0.0017)
[2023-09-22 12:14:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8454144. Throughput: 0: 791.8, 1: 792.1. Samples: 2111224. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:17,210][36967] Avg episode reward: [(0, '13.980'), (1, '22.610')]
[2023-09-22 12:14:17,211][37819] Saving new best policy, reward=13.980!
[2023-09-22 12:14:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 8486912. Throughput: 0: 781.3, 1: 781.4. Samples: 2120169. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:22,209][36967] Avg episode reward: [(0, '14.180'), (1, '22.310')]
[2023-09-22 12:14:22,215][37819] Saving new best policy, reward=14.180!
[2023-09-22 12:14:27,109][38126] Updated weights for policy 0, policy_version 16640 (0.0017)
[2023-09-22 12:14:27,109][38127] Updated weights for policy 1, policy_version 16640 (0.0017)
[2023-09-22 12:14:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 8519680. Throughput: 0: 786.4, 1: 786.4. Samples: 2129831. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:27,210][36967] Avg episode reward: [(0, '14.950'), (1, '23.160')]
[2023-09-22 12:14:27,210][37891] Saving new best policy, reward=23.160!
[2023-09-22 12:14:27,210][37819] Saving new best policy, reward=14.950!
[2023-09-22 12:14:32,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8544256. Throughput: 0: 782.4, 1: 781.6. Samples: 2134330. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:32,210][36967] Avg episode reward: [(0, '15.540'), (1, '24.440')]
[2023-09-22 12:14:32,255][37891] Saving new best policy, reward=24.440!
[2023-09-22 12:14:32,264][37819] Saving new best policy, reward=15.540!
[2023-09-22 12:14:37,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 8577024. Throughput: 0: 784.6, 1: 784.9. Samples: 2143931. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:37,209][36967] Avg episode reward: [(0, '16.630'), (1, '24.230')]
[2023-09-22 12:14:37,216][37819] Saving new best policy, reward=16.630!
[2023-09-22 12:14:40,148][38126] Updated weights for policy 0, policy_version 16800 (0.0015)
[2023-09-22 12:14:40,149][38127] Updated weights for policy 1, policy_version 16800 (0.0018)
[2023-09-22 12:14:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8609792. Throughput: 0: 783.5, 1: 781.3. Samples: 2153197. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:42,210][36967] Avg episode reward: [(0, '16.330'), (1, '23.580')]
[2023-09-22 12:14:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 8642560. Throughput: 0: 787.0, 1: 787.2. Samples: 2158108. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:47,209][36967] Avg episode reward: [(0, '17.400'), (1, '21.380')]
[2023-09-22 12:14:47,210][37819] Saving new best policy, reward=17.400!
[2023-09-22 12:14:52,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 8675328. Throughput: 0: 778.8, 1: 778.7. Samples: 2167135. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:52,210][36967] Avg episode reward: [(0, '18.550'), (1, '22.010')]
[2023-09-22 12:14:52,219][37819] Saving new best policy, reward=18.550!
[2023-09-22 12:14:53,298][38126] Updated weights for policy 0, policy_version 16960 (0.0017)
[2023-09-22 12:14:53,298][38127] Updated weights for policy 1, policy_version 16960 (0.0017)
[2023-09-22 12:14:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 8708096. Throughput: 0: 783.6, 1: 783.1. Samples: 2176759. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:14:57,209][36967] Avg episode reward: [(0, '19.120'), (1, '22.610')]
[2023-09-22 12:14:57,210][37819] Saving new best policy, reward=19.120!
[2023-09-22 12:15:02,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 8732672. Throughput: 0: 778.5, 1: 778.2. Samples: 2181275. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:02,209][36967] Avg episode reward: [(0, '18.060'), (1, '23.490')]
[2023-09-22 12:15:06,363][38126] Updated weights for policy 0, policy_version 17120 (0.0017)
[2023-09-22 12:15:06,364][38127] Updated weights for policy 1, policy_version 17120 (0.0018)
[2023-09-22 12:15:07,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8765440. Throughput: 0: 785.6, 1: 786.4. Samples: 2190913. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:07,210][36967] Avg episode reward: [(0, '17.350'), (1, '21.300')]
[2023-09-22 12:15:12,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8798208. Throughput: 0: 779.4, 1: 778.7. Samples: 2199947. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:12,210][36967] Avg episode reward: [(0, '16.300'), (1, '22.010')]
[2023-09-22 12:15:17,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6262.0). Total num frames: 8830976. Throughput: 0: 784.5, 1: 784.8. Samples: 2204948. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:17,209][36967] Avg episode reward: [(0, '14.180'), (1, '21.500')]
[2023-09-22 12:15:19,271][38126] Updated weights for policy 0, policy_version 17280 (0.0017)
[2023-09-22 12:15:19,272][38127] Updated weights for policy 1, policy_version 17280 (0.0016)
[2023-09-22 12:15:22,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 8863744. Throughput: 0: 783.4, 1: 783.1. Samples: 2214424. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:22,209][36967] Avg episode reward: [(0, '13.260'), (1, '20.660')]
[2023-09-22 12:15:22,217][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000017312_4431872.pth...
[2023-09-22 12:15:22,217][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000017312_4431872.pth...
[2023-09-22 12:15:22,247][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000014368_3678208.pth
[2023-09-22 12:15:22,254][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000014368_3678208.pth
[2023-09-22 12:15:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 8896512. Throughput: 0: 785.8, 1: 786.3. Samples: 2223941. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:27,209][36967] Avg episode reward: [(0, '12.390'), (1, '19.000')]
[2023-09-22 12:15:32,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8921088. Throughput: 0: 778.7, 1: 780.3. Samples: 2228263. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:32,210][36967] Avg episode reward: [(0, '11.960'), (1, '18.830')]
[2023-09-22 12:15:32,481][38126] Updated weights for policy 0, policy_version 17440 (0.0018)
[2023-09-22 12:15:32,481][38127] Updated weights for policy 1, policy_version 17440 (0.0017)
[2023-09-22 12:15:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 8953856. Throughput: 0: 787.0, 1: 786.3. Samples: 2237934. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:37,209][36967] Avg episode reward: [(0, '11.880'), (1, '18.700')]
[2023-09-22 12:15:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 8986624. Throughput: 0: 780.0, 1: 780.3. Samples: 2246972. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:42,210][36967] Avg episode reward: [(0, '11.650'), (1, '18.790')]
[2023-09-22 12:15:45,575][38126] Updated weights for policy 0, policy_version 17600 (0.0016)
[2023-09-22 12:15:45,575][38127] Updated weights for policy 1, policy_version 17600 (0.0015)
[2023-09-22 12:15:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9019392. Throughput: 0: 783.7, 1: 783.9. Samples: 2251818. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:47,209][36967] Avg episode reward: [(0, '12.720'), (1, '19.220')]
[2023-09-22 12:15:52,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9052160. Throughput: 0: 781.6, 1: 780.6. Samples: 2261211. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:52,209][36967] Avg episode reward: [(0, '12.780'), (1, '18.650')]
[2023-09-22 12:15:57,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9076736. Throughput: 0: 787.0, 1: 788.0. Samples: 2270821. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:15:57,210][36967] Avg episode reward: [(0, '14.300'), (1, '18.290')]
[2023-09-22 12:15:58,581][38127] Updated weights for policy 1, policy_version 17760 (0.0015)
[2023-09-22 12:15:58,581][38126] Updated weights for policy 0, policy_version 17760 (0.0018)
[2023-09-22 12:16:02,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9109504. Throughput: 0: 781.0, 1: 783.2. Samples: 2275336. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:16:02,210][36967] Avg episode reward: [(0, '14.840'), (1, '16.970')]
[2023-09-22 12:16:07,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9142272. Throughput: 0: 781.5, 1: 781.3. Samples: 2284750. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:16:07,209][36967] Avg episode reward: [(0, '15.780'), (1, '16.990')]
[2023-09-22 12:16:11,810][38127] Updated weights for policy 1, policy_version 17920 (0.0016)
[2023-09-22 12:16:11,810][38126] Updated weights for policy 0, policy_version 17920 (0.0017)
[2023-09-22 12:16:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9175040. Throughput: 0: 775.8, 1: 777.0. Samples: 2293821. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:16:12,210][36967] Avg episode reward: [(0, '15.570'), (1, '17.140')]
[2023-09-22 12:16:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9207808. Throughput: 0: 781.2, 1: 779.4. Samples: 2298489. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:16:17,209][36967] Avg episode reward: [(0, '15.030'), (1, '17.200')]
[2023-09-22 12:16:22,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9232384. Throughput: 0: 778.2, 1: 780.8. Samples: 2308090. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:16:22,209][36967] Avg episode reward: [(0, '15.270'), (1, '17.430')]
[2023-09-22 12:16:24,925][38127] Updated weights for policy 1, policy_version 18080 (0.0017)
[2023-09-22 12:16:24,926][38126] Updated weights for policy 0, policy_version 18080 (0.0016)
[2023-09-22 12:16:27,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9265152. Throughput: 0: 782.4, 1: 782.4. Samples: 2317387. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 12:16:27,210][36967] Avg episode reward: [(0, '15.310'), (1, '18.070')]
[2023-09-22 12:16:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9297920. Throughput: 0: 782.4, 1: 782.7. Samples: 2322249. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 12:16:32,209][36967] Avg episode reward: [(0, '15.540'), (1, '17.520')]
[2023-09-22 12:16:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9330688. Throughput: 0: 779.5, 1: 779.5. Samples: 2331363. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 12:16:37,210][36967] Avg episode reward: [(0, '15.280'), (1, '16.490')]
[2023-09-22 12:16:38,008][38127] Updated weights for policy 1, policy_version 18240 (0.0016)
[2023-09-22 12:16:38,009][38126] Updated weights for policy 0, policy_version 18240 (0.0019)
[2023-09-22 12:16:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9363456. Throughput: 0: 777.5, 1: 779.0. Samples: 2340862. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:16:42,210][36967] Avg episode reward: [(0, '15.550'), (1, '16.520')]
[2023-09-22 12:16:47,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9396224. Throughput: 0: 780.0, 1: 777.6. Samples: 2345426. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:16:47,209][36967] Avg episode reward: [(0, '15.800'), (1, '17.270')]
[2023-09-22 12:16:51,111][38127] Updated weights for policy 1, policy_version 18400 (0.0019)
[2023-09-22 12:16:51,112][38126] Updated weights for policy 0, policy_version 18400 (0.0019)
[2023-09-22 12:16:52,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9420800. Throughput: 0: 781.3, 1: 781.7. Samples: 2355085. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:16:52,209][36967] Avg episode reward: [(0, '14.630'), (1, '17.560')]
[2023-09-22 12:16:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9453568. Throughput: 0: 779.2, 1: 777.5. Samples: 2363874. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:16:57,209][36967] Avg episode reward: [(0, '15.320'), (1, '17.300')]
[2023-09-22 12:17:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9486336. Throughput: 0: 779.9, 1: 780.0. Samples: 2368682. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:17:02,209][36967] Avg episode reward: [(0, '15.510'), (1, '16.930')]
[2023-09-22 12:17:04,372][38127] Updated weights for policy 1, policy_version 18560 (0.0017)
[2023-09-22 12:17:04,372][38126] Updated weights for policy 0, policy_version 18560 (0.0016)
[2023-09-22 12:17:07,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9519104. Throughput: 0: 778.5, 1: 776.2. Samples: 2378055. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:17:07,210][36967] Avg episode reward: [(0, '15.670'), (1, '18.930')]
[2023-09-22 12:17:12,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9543680. Throughput: 0: 780.0, 1: 780.4. Samples: 2387606. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:17:12,210][36967] Avg episode reward: [(0, '15.540'), (1, '18.570')]
[2023-09-22 12:17:17,209][36967] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9576448. Throughput: 0: 776.2, 1: 776.9. Samples: 2392141. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:17:17,209][36967] Avg episode reward: [(0, '15.910'), (1, '17.780')]
[2023-09-22 12:17:17,394][38126] Updated weights for policy 0, policy_version 18720 (0.0016)
[2023-09-22 12:17:17,394][38127] Updated weights for policy 1, policy_version 18720 (0.0016)
[2023-09-22 12:17:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9609216. Throughput: 0: 785.1, 1: 785.4. Samples: 2402033. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:17:22,210][36967] Avg episode reward: [(0, '16.270'), (1, '18.380')]
[2023-09-22 12:17:22,221][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000018768_4804608.pth...
[2023-09-22 12:17:22,221][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000018768_4804608.pth...
[2023-09-22 12:17:22,251][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000015840_4055040.pth
[2023-09-22 12:17:22,251][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000015840_4055040.pth
[2023-09-22 12:17:27,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9641984. Throughput: 0: 780.8, 1: 778.2. Samples: 2411018. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:17:27,210][36967] Avg episode reward: [(0, '15.560'), (1, '17.850')]
[2023-09-22 12:17:30,456][38127] Updated weights for policy 1, policy_version 18880 (0.0014)
[2023-09-22 12:17:30,456][38126] Updated weights for policy 0, policy_version 18880 (0.0014)
[2023-09-22 12:17:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9674752. Throughput: 0: 782.4, 1: 782.0. Samples: 2415823. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:17:32,210][36967] Avg episode reward: [(0, '15.060'), (1, '17.530')]
[2023-09-22 12:17:37,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 9707520. Throughput: 0: 776.7, 1: 776.7. Samples: 2424988. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:17:37,209][36967] Avg episode reward: [(0, '15.270'), (1, '16.720')]
[2023-09-22 12:17:42,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9732096. Throughput: 0: 784.0, 1: 784.9. Samples: 2434474. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:17:42,209][36967] Avg episode reward: [(0, '16.650'), (1, '16.700')]
[2023-09-22 12:17:43,753][38127] Updated weights for policy 1, policy_version 19040 (0.0014)
[2023-09-22 12:17:43,753][38126] Updated weights for policy 0, policy_version 19040 (0.0015)
[2023-09-22 12:17:47,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9764864. Throughput: 0: 781.9, 1: 782.7. Samples: 2439086. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:17:47,210][36967] Avg episode reward: [(0, '17.090'), (1, '17.760')]
[2023-09-22 12:17:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9797632. Throughput: 0: 780.4, 1: 780.3. Samples: 2448287. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:17:52,209][36967] Avg episode reward: [(0, '17.210'), (1, '17.980')]
[2023-09-22 12:17:56,809][38126] Updated weights for policy 0, policy_version 19200 (0.0017)
[2023-09-22 12:17:56,810][38127] Updated weights for policy 1, policy_version 19200 (0.0019)
[2023-09-22 12:17:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 9830400. Throughput: 0: 778.0, 1: 778.9. Samples: 2457668. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:17:57,210][36967] Avg episode reward: [(0, '17.530'), (1, '17.060')]
[2023-09-22 12:18:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 9863168. Throughput: 0: 783.3, 1: 782.2. Samples: 2462587. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:18:02,210][36967] Avg episode reward: [(0, '18.140'), (1, '17.370')]
[2023-09-22 12:18:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 9887744. Throughput: 0: 775.5, 1: 777.9. Samples: 2471937. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:18:07,210][36967] Avg episode reward: [(0, '17.850'), (1, '17.200')]
[2023-09-22 12:18:10,137][38127] Updated weights for policy 1, policy_version 19360 (0.0013)
[2023-09-22 12:18:10,137][38126] Updated weights for policy 0, policy_version 19360 (0.0017)
[2023-09-22 12:18:12,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9920512. Throughput: 0: 776.3, 1: 776.4. Samples: 2480887. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:18:12,209][36967] Avg episode reward: [(0, '18.340'), (1, '16.580')]
[2023-09-22 12:18:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 9953280. Throughput: 0: 778.3, 1: 778.6. Samples: 2485885. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:18:17,210][36967] Avg episode reward: [(0, '19.110'), (1, '18.330')]
[2023-09-22 12:18:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 9986048. Throughput: 0: 779.0, 1: 779.1. Samples: 2495103. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:18:22,209][36967] Avg episode reward: [(0, '18.820'), (1, '18.720')]
[2023-09-22 12:18:23,115][38127] Updated weights for policy 1, policy_version 19520 (0.0018)
[2023-09-22 12:18:23,115][38126] Updated weights for policy 0, policy_version 19520 (0.0019)
[2023-09-22 12:18:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 10018816. Throughput: 0: 779.3, 1: 781.4. Samples: 2504704. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:18:27,210][36967] Avg episode reward: [(0, '18.910'), (1, '18.880')]
[2023-09-22 12:18:32,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 10043392. Throughput: 0: 779.1, 1: 778.1. Samples: 2509158. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:18:32,210][36967] Avg episode reward: [(0, '17.470'), (1, '18.990')]
[2023-09-22 12:18:36,165][38127] Updated weights for policy 1, policy_version 19680 (0.0016)
[2023-09-22 12:18:36,165][38126] Updated weights for policy 0, policy_version 19680 (0.0017)
[2023-09-22 12:18:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 10076160. Throughput: 0: 784.2, 1: 783.7. Samples: 2518841. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:18:37,210][36967] Avg episode reward: [(0, '16.760'), (1, '19.820')]
[2023-09-22 12:18:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10108928. Throughput: 0: 782.8, 1: 781.2. Samples: 2528049. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:18:42,210][36967] Avg episode reward: [(0, '17.150'), (1, '19.470')]
[2023-09-22 12:18:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10141696. Throughput: 0: 778.9, 1: 779.3. Samples: 2532707. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:18:47,210][36967] Avg episode reward: [(0, '16.570'), (1, '20.140')]
[2023-09-22 12:18:49,441][38126] Updated weights for policy 0, policy_version 19840 (0.0018)
[2023-09-22 12:18:49,442][38127] Updated weights for policy 1, policy_version 19840 (0.0019)
[2023-09-22 12:18:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10174464. Throughput: 0: 775.8, 1: 773.8. Samples: 2541668. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:18:52,210][36967] Avg episode reward: [(0, '16.880'), (1, '19.820')]
[2023-09-22 12:18:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 10199040. Throughput: 0: 782.2, 1: 782.2. Samples: 2551283. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:18:57,210][36967] Avg episode reward: [(0, '16.880'), (1, '19.080')]
[2023-09-22 12:19:02,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 10231808. Throughput: 0: 776.8, 1: 779.2. Samples: 2555905. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:19:02,210][36967] Avg episode reward: [(0, '15.670'), (1, '20.680')]
[2023-09-22 12:19:02,519][38127] Updated weights for policy 1, policy_version 20000 (0.0015)
[2023-09-22 12:19:02,519][38126] Updated weights for policy 0, policy_version 20000 (0.0017)
[2023-09-22 12:19:07,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10264576. Throughput: 0: 781.9, 1: 782.5. Samples: 2565499. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:19:07,210][36967] Avg episode reward: [(0, '15.020'), (1, '21.150')]
[2023-09-22 12:19:12,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10297344. Throughput: 0: 780.2, 1: 777.3. Samples: 2574792. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:19:12,209][36967] Avg episode reward: [(0, '14.580'), (1, '23.320')]
[2023-09-22 12:19:15,440][38126] Updated weights for policy 0, policy_version 20160 (0.0015)
[2023-09-22 12:19:15,441][38127] Updated weights for policy 1, policy_version 20160 (0.0016)
[2023-09-22 12:19:17,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10330112. Throughput: 0: 783.6, 1: 783.7. Samples: 2579688. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:19:17,210][36967] Avg episode reward: [(0, '14.920'), (1, '23.200')]
[2023-09-22 12:19:22,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10362880. Throughput: 0: 778.3, 1: 778.7. Samples: 2588908. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:19:22,210][36967] Avg episode reward: [(0, '14.750'), (1, '21.510')]
[2023-09-22 12:19:22,221][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000020240_5181440.pth...
[2023-09-22 12:19:22,221][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000020240_5181440.pth...
[2023-09-22 12:19:22,250][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000017312_4431872.pth
[2023-09-22 12:19:22,256][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000017312_4431872.pth
[2023-09-22 12:19:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 10395648. Throughput: 0: 785.1, 1: 785.4. Samples: 2598719. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:19:27,209][36967] Avg episode reward: [(0, '14.640'), (1, '21.910')]
[2023-09-22 12:19:28,435][38127] Updated weights for policy 1, policy_version 20320 (0.0018)
[2023-09-22 12:19:28,435][38126] Updated weights for policy 0, policy_version 20320 (0.0017)
[2023-09-22 12:19:32,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10420224. Throughput: 0: 781.4, 1: 782.3. Samples: 2603073. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:19:32,209][36967] Avg episode reward: [(0, '13.400'), (1, '22.030')]
[2023-09-22 12:19:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 10452992. Throughput: 0: 788.8, 1: 788.8. Samples: 2612659. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:19:37,209][36967] Avg episode reward: [(0, '14.330'), (1, '20.400')]
[2023-09-22 12:19:41,619][38126] Updated weights for policy 0, policy_version 20480 (0.0017)
[2023-09-22 12:19:41,620][38127] Updated weights for policy 1, policy_version 20480 (0.0018)
[2023-09-22 12:19:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 10485760. Throughput: 0: 784.0, 1: 783.5. Samples: 2621819. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:19:42,209][36967] Avg episode reward: [(0, '15.450'), (1, '21.140')]
[2023-09-22 12:19:47,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10518528. Throughput: 0: 786.8, 1: 784.4. Samples: 2626609. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:19:47,210][36967] Avg episode reward: [(0, '15.360'), (1, '21.280')]
[2023-09-22 12:19:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 10543104. Throughput: 0: 780.0, 1: 781.8. Samples: 2635780. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:19:52,209][36967] Avg episode reward: [(0, '15.050'), (1, '20.960')]
[2023-09-22 12:19:55,068][38126] Updated weights for policy 0, policy_version 20640 (0.0017)
[2023-09-22 12:19:55,068][38127] Updated weights for policy 1, policy_version 20640 (0.0015)
[2023-09-22 12:19:57,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 10575872. Throughput: 0: 774.8, 1: 775.1. Samples: 2644540. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:19:57,209][36967] Avg episode reward: [(0, '15.560'), (1, '19.450')]
[2023-09-22 12:20:02,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 10608640. Throughput: 0: 776.2, 1: 775.5. Samples: 2649513. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:02,209][36967] Avg episode reward: [(0, '15.660'), (1, '17.470')]
[2023-09-22 12:20:07,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10641408. Throughput: 0: 775.0, 1: 774.9. Samples: 2658654. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:07,210][36967] Avg episode reward: [(0, '15.680'), (1, '17.990')]
[2023-09-22 12:20:08,185][38127] Updated weights for policy 1, policy_version 20800 (0.0015)
[2023-09-22 12:20:08,186][38126] Updated weights for policy 0, policy_version 20800 (0.0017)
[2023-09-22 12:20:12,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10674176. Throughput: 0: 773.7, 1: 774.2. Samples: 2668375. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:12,209][36967] Avg episode reward: [(0, '14.950'), (1, '17.390')]
[2023-09-22 12:20:17,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 10698752. Throughput: 0: 774.7, 1: 773.8. Samples: 2672756. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:17,210][36967] Avg episode reward: [(0, '15.530'), (1, '18.170')]
[2023-09-22 12:20:21,164][38126] Updated weights for policy 0, policy_version 20960 (0.0015)
[2023-09-22 12:20:21,166][38127] Updated weights for policy 1, policy_version 20960 (0.0013)
[2023-09-22 12:20:22,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 10731520. Throughput: 0: 778.1, 1: 777.4. Samples: 2682659. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:22,209][36967] Avg episode reward: [(0, '15.550'), (1, '19.120')]
[2023-09-22 12:20:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 10764288. Throughput: 0: 777.7, 1: 778.2. Samples: 2691835. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:20:27,209][36967] Avg episode reward: [(0, '14.760'), (1, '19.030')]
[2023-09-22 12:20:32,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10797056. Throughput: 0: 780.4, 1: 780.6. Samples: 2696855. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:20:32,210][36967] Avg episode reward: [(0, '15.180'), (1, '20.660')]
[2023-09-22 12:20:34,151][38127] Updated weights for policy 1, policy_version 21120 (0.0015)
[2023-09-22 12:20:34,152][38126] Updated weights for policy 0, policy_version 21120 (0.0016)
[2023-09-22 12:20:37,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10829824. Throughput: 0: 782.3, 1: 780.3. Samples: 2706096. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:20:37,210][36967] Avg episode reward: [(0, '15.280'), (1, '20.320')]
[2023-09-22 12:20:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10862592. Throughput: 0: 788.8, 1: 789.2. Samples: 2715546. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:42,210][36967] Avg episode reward: [(0, '14.660'), (1, '20.670')]
[2023-09-22 12:20:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 10887168. Throughput: 0: 783.9, 1: 784.8. Samples: 2720102. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:47,210][36967] Avg episode reward: [(0, '15.050'), (1, '22.270')]
[2023-09-22 12:20:47,242][38126] Updated weights for policy 0, policy_version 21280 (0.0015)
[2023-09-22 12:20:47,242][38127] Updated weights for policy 1, policy_version 21280 (0.0014)
[2023-09-22 12:20:52,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 10919936. Throughput: 0: 789.2, 1: 789.6. Samples: 2729699. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:52,209][36967] Avg episode reward: [(0, '15.220'), (1, '22.330')]
[2023-09-22 12:20:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10952704. Throughput: 0: 780.2, 1: 779.9. Samples: 2738582. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:20:57,210][36967] Avg episode reward: [(0, '15.130'), (1, '21.620')]
[2023-09-22 12:21:00,665][38126] Updated weights for policy 0, policy_version 21440 (0.0019)
[2023-09-22 12:21:00,665][38127] Updated weights for policy 1, policy_version 21440 (0.0018)
[2023-09-22 12:21:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 10985472. Throughput: 0: 783.6, 1: 782.6. Samples: 2743234. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:21:02,210][36967] Avg episode reward: [(0, '15.220'), (1, '21.330')]
[2023-09-22 12:21:07,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 11014144. Throughput: 0: 774.9, 1: 777.5. Samples: 2752516. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:21:07,210][36967] Avg episode reward: [(0, '15.070'), (1, '20.710')]
[2023-09-22 12:21:12,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11042816. Throughput: 0: 780.0, 1: 780.1. Samples: 2762039. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:21:12,210][36967] Avg episode reward: [(0, '14.770'), (1, '21.810')]
[2023-09-22 12:21:13,722][38126] Updated weights for policy 0, policy_version 21600 (0.0017)
[2023-09-22 12:21:13,722][38127] Updated weights for policy 1, policy_version 21600 (0.0019)
[2023-09-22 12:21:17,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 11075584. Throughput: 0: 776.6, 1: 778.4. Samples: 2766829. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:21:17,209][36967] Avg episode reward: [(0, '15.150'), (1, '22.180')]
[2023-09-22 12:21:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11108352. Throughput: 0: 779.6, 1: 779.2. Samples: 2776244. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:21:22,209][36967] Avg episode reward: [(0, '15.430'), (1, '21.880')]
[2023-09-22 12:21:22,217][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000021696_5554176.pth...
[2023-09-22 12:21:22,218][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000021696_5554176.pth...
[2023-09-22 12:21:22,258][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000018768_4804608.pth
[2023-09-22 12:21:22,261][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000018768_4804608.pth
[2023-09-22 12:21:26,721][38126] Updated weights for policy 0, policy_version 21760 (0.0016)
[2023-09-22 12:21:26,721][38127] Updated weights for policy 1, policy_version 21760 (0.0013)
[2023-09-22 12:21:27,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11141120. Throughput: 0: 777.1, 1: 776.6. Samples: 2785466. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:21:27,210][36967] Avg episode reward: [(0, '15.450'), (1, '21.820')]
[2023-09-22 12:21:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11173888. Throughput: 0: 778.2, 1: 778.0. Samples: 2790132. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:21:32,210][36967] Avg episode reward: [(0, '15.670'), (1, '21.830')]
[2023-09-22 12:21:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11198464. Throughput: 0: 775.6, 1: 777.7. Samples: 2799601. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:21:37,210][36967] Avg episode reward: [(0, '15.410'), (1, '21.880')]
[2023-09-22 12:21:39,962][38126] Updated weights for policy 0, policy_version 21920 (0.0016)
[2023-09-22 12:21:39,962][38127] Updated weights for policy 1, policy_version 21920 (0.0015)
[2023-09-22 12:21:42,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11231232. Throughput: 0: 780.0, 1: 780.0. Samples: 2808780. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:21:42,210][36967] Avg episode reward: [(0, '14.140'), (1, '20.520')]
[2023-09-22 12:21:47,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 11264000. Throughput: 0: 782.9, 1: 783.4. Samples: 2813717. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:21:47,209][36967] Avg episode reward: [(0, '14.150'), (1, '20.710')]
[2023-09-22 12:21:52,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11296768. Throughput: 0: 783.5, 1: 781.0. Samples: 2822922. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:21:52,210][36967] Avg episode reward: [(0, '13.600'), (1, '21.380')]
[2023-09-22 12:21:52,986][38127] Updated weights for policy 1, policy_version 22080 (0.0018)
[2023-09-22 12:21:52,986][38126] Updated weights for policy 0, policy_version 22080 (0.0016)
[2023-09-22 12:21:57,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11329536. Throughput: 0: 781.9, 1: 782.8. Samples: 2832450. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:21:57,210][36967] Avg episode reward: [(0, '13.650'), (1, '21.810')]
[2023-09-22 12:22:02,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11362304. Throughput: 0: 783.6, 1: 782.2. Samples: 2837291. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:22:02,210][36967] Avg episode reward: [(0, '13.740'), (1, '22.430')]
[2023-09-22 12:22:05,977][38127] Updated weights for policy 1, policy_version 22240 (0.0016)
[2023-09-22 12:22:05,977][38126] Updated weights for policy 0, policy_version 22240 (0.0014)
[2023-09-22 12:22:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 11386880. Throughput: 0: 781.8, 1: 784.3. Samples: 2846720. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:22:07,210][36967] Avg episode reward: [(0, '14.460'), (1, '22.790')]
[2023-09-22 12:22:12,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11419648. Throughput: 0: 782.1, 1: 783.0. Samples: 2855893. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:22:12,210][36967] Avg episode reward: [(0, '14.190'), (1, '22.530')]
[2023-09-22 12:22:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11452416. Throughput: 0: 783.8, 1: 784.4. Samples: 2860699. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:17,210][36967] Avg episode reward: [(0, '14.010'), (1, '22.500')]
[2023-09-22 12:22:19,135][38127] Updated weights for policy 1, policy_version 22400 (0.0016)
[2023-09-22 12:22:19,136][38126] Updated weights for policy 0, policy_version 22400 (0.0016)
[2023-09-22 12:22:22,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 11485184. Throughput: 0: 783.6, 1: 781.6. Samples: 2870034. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:22,209][36967] Avg episode reward: [(0, '14.220'), (1, '21.360')]
[2023-09-22 12:22:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 11517952. Throughput: 0: 784.5, 1: 786.8. Samples: 2879489. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:27,209][36967] Avg episode reward: [(0, '14.560'), (1, '21.120')]
[2023-09-22 12:22:32,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11542528. Throughput: 0: 781.3, 1: 781.9. Samples: 2884058. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:32,209][36967] Avg episode reward: [(0, '15.720'), (1, '20.230')]
[2023-09-22 12:22:32,426][38126] Updated weights for policy 0, policy_version 22560 (0.0016)
[2023-09-22 12:22:32,426][38127] Updated weights for policy 1, policy_version 22560 (0.0016)
[2023-09-22 12:22:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 11575296. Throughput: 0: 777.7, 1: 778.1. Samples: 2892932. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:37,209][36967] Avg episode reward: [(0, '16.260'), (1, '20.390')]
[2023-09-22 12:22:42,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11599872. Throughput: 0: 769.7, 1: 768.4. Samples: 2901665. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:42,209][36967] Avg episode reward: [(0, '15.790'), (1, '20.510')]
[2023-09-22 12:22:46,661][38127] Updated weights for policy 1, policy_version 22720 (0.0014)
[2023-09-22 12:22:46,662][38126] Updated weights for policy 0, policy_version 22720 (0.0016)
[2023-09-22 12:22:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11632640. Throughput: 0: 763.8, 1: 764.3. Samples: 2906056. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:47,209][36967] Avg episode reward: [(0, '16.270'), (1, '20.650')]
[2023-09-22 12:22:52,209][36967] Fps is (10 sec: 5734.2, 60 sec: 6007.5, 300 sec: 6192.6). Total num frames: 11657216. Throughput: 0: 751.6, 1: 750.9. Samples: 2914336. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:52,210][36967] Avg episode reward: [(0, '15.030'), (1, '19.920')]
[2023-09-22 12:22:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6192.6). Total num frames: 11689984. Throughput: 0: 751.3, 1: 750.6. Samples: 2923480. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:22:57,209][36967] Avg episode reward: [(0, '14.980'), (1, '20.180')]
[2023-09-22 12:23:00,353][38126] Updated weights for policy 0, policy_version 22880 (0.0012)
[2023-09-22 12:23:00,354][38127] Updated weights for policy 1, policy_version 22880 (0.0014)
[2023-09-22 12:23:02,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6007.5, 300 sec: 6220.4). Total num frames: 11722752. Throughput: 0: 749.0, 1: 748.5. Samples: 2928085. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:23:02,209][36967] Avg episode reward: [(0, '15.800'), (1, '20.990')]
[2023-09-22 12:23:07,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11755520. Throughput: 0: 743.1, 1: 743.5. Samples: 2936932. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:23:07,210][36967] Avg episode reward: [(0, '16.760'), (1, '21.610')]
[2023-09-22 12:23:12,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6007.5, 300 sec: 6192.6). Total num frames: 11780096. Throughput: 0: 745.3, 1: 742.6. Samples: 2946445. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:23:12,209][36967] Avg episode reward: [(0, '17.700'), (1, '21.110')]
[2023-09-22 12:23:13,752][38126] Updated weights for policy 0, policy_version 23040 (0.0014)
[2023-09-22 12:23:13,752][38127] Updated weights for policy 1, policy_version 23040 (0.0014)
[2023-09-22 12:23:17,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6007.5, 300 sec: 6192.6). Total num frames: 11812864. Throughput: 0: 744.6, 1: 745.7. Samples: 2951119. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:23:17,210][36967] Avg episode reward: [(0, '17.110'), (1, '19.900')]
[2023-09-22 12:23:22,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6007.5, 300 sec: 6192.6). Total num frames: 11845632. Throughput: 0: 750.9, 1: 750.8. Samples: 2960511. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:23:22,210][36967] Avg episode reward: [(0, '15.790'), (1, '20.650')]
[2023-09-22 12:23:22,220][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000023136_5922816.pth...
[2023-09-22 12:23:22,220][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000023136_5922816.pth...
[2023-09-22 12:23:22,256][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000020240_5181440.pth
[2023-09-22 12:23:22,257][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000020240_5181440.pth
[2023-09-22 12:23:26,852][38126] Updated weights for policy 0, policy_version 23200 (0.0014)
[2023-09-22 12:23:26,853][38127] Updated weights for policy 1, policy_version 23200 (0.0015)
[2023-09-22 12:23:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6007.5, 300 sec: 6220.4). Total num frames: 11878400. Throughput: 0: 754.0, 1: 756.2. Samples: 2969627. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:23:27,209][36967] Avg episode reward: [(0, '16.220'), (1, '20.740')]
[2023-09-22 12:23:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 11911168. Throughput: 0: 759.0, 1: 758.2. Samples: 2974329. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:23:32,210][36967] Avg episode reward: [(0, '16.300'), (1, '20.580')]
[2023-09-22 12:23:37,209][36967] Fps is (10 sec: 6143.8, 60 sec: 6075.7, 300 sec: 6206.5). Total num frames: 11939840. Throughput: 0: 773.1, 1: 773.7. Samples: 2983940. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:23:37,210][36967] Avg episode reward: [(0, '16.980'), (1, '20.350')]
[2023-09-22 12:23:39,836][38126] Updated weights for policy 0, policy_version 23360 (0.0015)
[2023-09-22 12:23:39,836][38127] Updated weights for policy 1, policy_version 23360 (0.0016)
[2023-09-22 12:23:42,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 11968512. Throughput: 0: 777.2, 1: 778.0. Samples: 2993462. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:23:42,209][36967] Avg episode reward: [(0, '16.690'), (1, '21.820')]
[2023-09-22 12:23:47,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 12001280. Throughput: 0: 778.6, 1: 781.1. Samples: 2998272. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:23:47,210][36967] Avg episode reward: [(0, '16.660'), (1, '22.380')]
[2023-09-22 12:23:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 12034048. Throughput: 0: 785.3, 1: 784.9. Samples: 3007594. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:23:52,209][36967] Avg episode reward: [(0, '17.050'), (1, '22.410')]
[2023-09-22 12:23:52,913][38127] Updated weights for policy 1, policy_version 23520 (0.0016)
[2023-09-22 12:23:52,913][38126] Updated weights for policy 0, policy_version 23520 (0.0017)
[2023-09-22 12:23:57,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12066816. Throughput: 0: 782.4, 1: 782.5. Samples: 3016866. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:23:57,209][36967] Avg episode reward: [(0, '16.540'), (1, '22.080')]
[2023-09-22 12:24:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12099584. Throughput: 0: 785.0, 1: 783.5. Samples: 3021701. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:24:02,210][36967] Avg episode reward: [(0, '16.890'), (1, '22.480')]
[2023-09-22 12:24:05,913][38127] Updated weights for policy 1, policy_version 23680 (0.0017)
[2023-09-22 12:24:05,913][38126] Updated weights for policy 0, policy_version 23680 (0.0016)
[2023-09-22 12:24:07,209][36967] Fps is (10 sec: 6143.8, 60 sec: 6212.3, 300 sec: 6206.5). Total num frames: 12128256. Throughput: 0: 782.6, 1: 784.8. Samples: 3031044. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:24:07,210][36967] Avg episode reward: [(0, '17.390'), (1, '22.550')]
[2023-09-22 12:24:12,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 12156928. Throughput: 0: 788.5, 1: 787.0. Samples: 3040522. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:24:12,209][36967] Avg episode reward: [(0, '16.920'), (1, '21.900')]
[2023-09-22 12:24:17,209][36967] Fps is (10 sec: 6144.2, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 12189696. Throughput: 0: 788.4, 1: 790.4. Samples: 3045374. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:24:17,209][36967] Avg episode reward: [(0, '17.570'), (1, '22.600')]
[2023-09-22 12:24:19,004][38127] Updated weights for policy 1, policy_version 23840 (0.0014)
[2023-09-22 12:24:19,004][38126] Updated weights for policy 0, policy_version 23840 (0.0016)
[2023-09-22 12:24:22,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 12222464. Throughput: 0: 784.8, 1: 782.5. Samples: 3054469. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:24:22,210][36967] Avg episode reward: [(0, '17.140'), (1, '22.890')]
[2023-09-22 12:24:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12255232. Throughput: 0: 780.8, 1: 782.6. Samples: 3063813. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:24:27,209][36967] Avg episode reward: [(0, '17.550'), (1, '22.360')]
[2023-09-22 12:24:32,084][38127] Updated weights for policy 1, policy_version 24000 (0.0015)
[2023-09-22 12:24:32,084][38126] Updated weights for policy 0, policy_version 24000 (0.0018)
[2023-09-22 12:24:32,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12288000. Throughput: 0: 781.2, 1: 778.7. Samples: 3068471. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:24:32,210][36967] Avg episode reward: [(0, '17.540'), (1, '23.690')]
[2023-09-22 12:24:37,209][36967] Fps is (10 sec: 6143.7, 60 sec: 6280.5, 300 sec: 6206.5). Total num frames: 12316672. Throughput: 0: 782.7, 1: 785.1. Samples: 3078148. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:24:37,210][36967] Avg episode reward: [(0, '16.060'), (1, '23.490')]
[2023-09-22 12:24:42,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6192.6). Total num frames: 12345344. Throughput: 0: 787.1, 1: 787.2. Samples: 3087708. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:24:42,210][36967] Avg episode reward: [(0, '15.420'), (1, '23.130')]
[2023-09-22 12:24:44,960][38126] Updated weights for policy 0, policy_version 24160 (0.0019)
[2023-09-22 12:24:44,960][38127] Updated weights for policy 1, policy_version 24160 (0.0017)
[2023-09-22 12:24:47,209][36967] Fps is (10 sec: 6144.3, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 12378112. Throughput: 0: 785.2, 1: 787.7. Samples: 3092480. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:24:47,209][36967] Avg episode reward: [(0, '14.690'), (1, '21.200')]
[2023-09-22 12:24:52,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12410880. Throughput: 0: 788.1, 1: 785.6. Samples: 3101860. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:24:52,210][36967] Avg episode reward: [(0, '15.430'), (1, '20.380')]
[2023-09-22 12:24:57,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12443648. Throughput: 0: 785.0, 1: 784.8. Samples: 3111163. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:24:57,210][36967] Avg episode reward: [(0, '15.040'), (1, '18.560')]
[2023-09-22 12:24:57,967][38127] Updated weights for policy 1, policy_version 24320 (0.0019)
[2023-09-22 12:24:57,967][38126] Updated weights for policy 0, policy_version 24320 (0.0019)
[2023-09-22 12:25:02,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12476416. Throughput: 0: 785.0, 1: 783.6. Samples: 3115964. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:25:02,210][36967] Avg episode reward: [(0, '15.580'), (1, '17.620')]
[2023-09-22 12:25:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6348.8, 300 sec: 6220.4). Total num frames: 12509184. Throughput: 0: 786.7, 1: 787.6. Samples: 3125314. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:25:07,210][36967] Avg episode reward: [(0, '15.150'), (1, '18.210')]
[2023-09-22 12:25:10,977][38127] Updated weights for policy 1, policy_version 24480 (0.0017)
[2023-09-22 12:25:10,977][38126] Updated weights for policy 0, policy_version 24480 (0.0018)
[2023-09-22 12:25:12,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12533760. Throughput: 0: 792.8, 1: 790.6. Samples: 3135066. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:25:12,209][36967] Avg episode reward: [(0, '15.380'), (1, '19.060')]
[2023-09-22 12:25:17,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12566528. Throughput: 0: 790.5, 1: 791.5. Samples: 3139662. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:25:17,210][36967] Avg episode reward: [(0, '15.900'), (1, '19.330')]
[2023-09-22 12:25:22,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12599296. Throughput: 0: 791.2, 1: 789.1. Samples: 3149260. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:25:22,211][36967] Avg episode reward: [(0, '15.940'), (1, '18.190')]
[2023-09-22 12:25:22,220][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000024608_6299648.pth...
[2023-09-22 12:25:22,220][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000024608_6299648.pth...
[2023-09-22 12:25:22,263][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000021696_5554176.pth
[2023-09-22 12:25:22,263][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000021696_5554176.pth
[2023-09-22 12:25:24,049][38127] Updated weights for policy 1, policy_version 24640 (0.0017)
[2023-09-22 12:25:24,050][38126] Updated weights for policy 0, policy_version 24640 (0.0016)
[2023-09-22 12:25:27,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12632064. Throughput: 0: 782.3, 1: 782.6. Samples: 3158130. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:25:27,209][36967] Avg episode reward: [(0, '16.010'), (1, '18.170')]
[2023-09-22 12:25:32,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 12664832. Throughput: 0: 784.9, 1: 782.3. Samples: 3163006. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:25:32,209][36967] Avg episode reward: [(0, '16.690'), (1, '18.330')]
[2023-09-22 12:25:37,177][38126] Updated weights for policy 0, policy_version 24800 (0.0015)
[2023-09-22 12:25:37,177][38127] Updated weights for policy 1, policy_version 24800 (0.0016)
[2023-09-22 12:25:37,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6348.8, 300 sec: 6220.4). Total num frames: 12697600. Throughput: 0: 782.3, 1: 784.6. Samples: 3172368. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:25:37,210][36967] Avg episode reward: [(0, '17.440'), (1, '17.690')]
[2023-09-22 12:25:42,209][36967] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12722176. Throughput: 0: 787.3, 1: 787.0. Samples: 3182004. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:25:42,210][36967] Avg episode reward: [(0, '18.610'), (1, '19.090')]
[2023-09-22 12:25:47,209][36967] Fps is (10 sec: 5734.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12754944. Throughput: 0: 785.2, 1: 786.5. Samples: 3186692. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:25:47,209][36967] Avg episode reward: [(0, '19.060'), (1, '19.450')]
[2023-09-22 12:25:50,043][38127] Updated weights for policy 1, policy_version 24960 (0.0016)
[2023-09-22 12:25:50,044][38126] Updated weights for policy 0, policy_version 24960 (0.0017)
[2023-09-22 12:25:52,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12787712. Throughput: 0: 790.2, 1: 789.2. Samples: 3196388. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:25:52,209][36967] Avg episode reward: [(0, '20.620'), (1, '21.400')]
[2023-09-22 12:25:52,219][37819] Saving new best policy, reward=20.620!
[2023-09-22 12:25:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 12820480. Throughput: 0: 781.6, 1: 781.1. Samples: 3205386. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:25:57,209][36967] Avg episode reward: [(0, '20.040'), (1, '22.490')]
[2023-09-22 12:26:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6234.3). Total num frames: 12853248. Throughput: 0: 785.3, 1: 784.3. Samples: 3210292. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:26:02,210][36967] Avg episode reward: [(0, '20.550'), (1, '22.430')]
[2023-09-22 12:26:03,207][38126] Updated weights for policy 0, policy_version 25120 (0.0017)
[2023-09-22 12:26:03,207][38127] Updated weights for policy 1, policy_version 25120 (0.0018)
[2023-09-22 12:26:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 12886016. Throughput: 0: 781.6, 1: 781.3. Samples: 3219586. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:07,209][36967] Avg episode reward: [(0, '21.670'), (1, '22.750')]
[2023-09-22 12:26:07,218][37819] Saving new best policy, reward=21.670!
[2023-09-22 12:26:12,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12910592. Throughput: 0: 788.2, 1: 787.5. Samples: 3229039. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:12,210][36967] Avg episode reward: [(0, '22.340'), (1, '22.450')]
[2023-09-22 12:26:12,355][37819] Saving new best policy, reward=22.340!
[2023-09-22 12:26:16,279][38127] Updated weights for policy 1, policy_version 25280 (0.0015)
[2023-09-22 12:26:16,279][38126] Updated weights for policy 0, policy_version 25280 (0.0018)
[2023-09-22 12:26:17,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 12943360. Throughput: 0: 785.2, 1: 787.7. Samples: 3233788. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:17,210][36967] Avg episode reward: [(0, '21.250'), (1, '22.450')]
[2023-09-22 12:26:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 12976128. Throughput: 0: 789.2, 1: 787.4. Samples: 3243314. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:22,210][36967] Avg episode reward: [(0, '20.820'), (1, '23.450')]
[2023-09-22 12:26:27,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13008896. Throughput: 0: 779.1, 1: 781.6. Samples: 3252236. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:27,210][36967] Avg episode reward: [(0, '21.820'), (1, '24.260')]
[2023-09-22 12:26:29,569][38127] Updated weights for policy 1, policy_version 25440 (0.0017)
[2023-09-22 12:26:29,570][38126] Updated weights for policy 0, policy_version 25440 (0.0018)
[2023-09-22 12:26:32,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13033472. Throughput: 0: 780.2, 1: 777.9. Samples: 3256808. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:32,210][36967] Avg episode reward: [(0, '22.310'), (1, '24.200')]
[2023-09-22 12:26:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13066240. Throughput: 0: 776.4, 1: 775.9. Samples: 3266240. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:37,210][36967] Avg episode reward: [(0, '21.760'), (1, '25.400')]
[2023-09-22 12:26:37,221][37891] Saving new best policy, reward=25.400!
[2023-09-22 12:26:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 13099008. Throughput: 0: 780.7, 1: 781.1. Samples: 3275666. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:42,210][36967] Avg episode reward: [(0, '19.890'), (1, '25.810')]
[2023-09-22 12:26:42,211][37891] Saving new best policy, reward=25.810!
[2023-09-22 12:26:42,661][38127] Updated weights for policy 1, policy_version 25600 (0.0017)
[2023-09-22 12:26:42,661][38126] Updated weights for policy 0, policy_version 25600 (0.0018)
[2023-09-22 12:26:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13131776. Throughput: 0: 778.6, 1: 778.6. Samples: 3280364. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:47,210][36967] Avg episode reward: [(0, '19.260'), (1, '25.590')]
[2023-09-22 12:26:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13164544. Throughput: 0: 775.6, 1: 776.0. Samples: 3289409. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:52,210][36967] Avg episode reward: [(0, '19.600'), (1, '25.300')]
[2023-09-22 12:26:55,965][38126] Updated weights for policy 0, policy_version 25760 (0.0015)
[2023-09-22 12:26:55,966][38127] Updated weights for policy 1, policy_version 25760 (0.0014)
[2023-09-22 12:26:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 13189120. Throughput: 0: 775.9, 1: 776.7. Samples: 3298904. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:26:57,210][36967] Avg episode reward: [(0, '19.660'), (1, '23.310')]
[2023-09-22 12:27:02,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13221888. Throughput: 0: 773.8, 1: 773.8. Samples: 3303428. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:27:02,210][36967] Avg episode reward: [(0, '18.740'), (1, '22.930')]
[2023-09-22 12:27:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13254656. Throughput: 0: 771.8, 1: 771.9. Samples: 3312784. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:27:07,209][36967] Avg episode reward: [(0, '18.940'), (1, '23.040')]
[2023-09-22 12:27:09,270][38126] Updated weights for policy 0, policy_version 25920 (0.0014)
[2023-09-22 12:27:09,271][38127] Updated weights for policy 1, policy_version 25920 (0.0017)
[2023-09-22 12:27:12,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 13287424. Throughput: 0: 774.8, 1: 773.7. Samples: 3321917. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:27:12,209][36967] Avg episode reward: [(0, '19.060'), (1, '23.750')]
[2023-09-22 12:27:17,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 13320192. Throughput: 0: 776.5, 1: 776.4. Samples: 3326686. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:27:17,210][36967] Avg episode reward: [(0, '19.810'), (1, '23.830')]
[2023-09-22 12:27:22,209][36967] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6192.6). Total num frames: 13344768. Throughput: 0: 775.8, 1: 776.7. Samples: 3336102. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:27:22,210][36967] Avg episode reward: [(0, '20.080'), (1, '22.860')]
[2023-09-22 12:27:22,388][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000026080_6676480.pth...
[2023-09-22 12:27:22,395][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000026080_6676480.pth...
[2023-09-22 12:27:22,397][38127] Updated weights for policy 1, policy_version 26080 (0.0017)
[2023-09-22 12:27:22,397][38126] Updated weights for policy 0, policy_version 26080 (0.0018)
[2023-09-22 12:27:22,423][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000023136_5922816.pth
[2023-09-22 12:27:22,428][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000023136_5922816.pth
[2023-09-22 12:27:27,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13377536. Throughput: 0: 775.0, 1: 775.1. Samples: 3345421. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:27:27,210][36967] Avg episode reward: [(0, '18.830'), (1, '23.000')]
[2023-09-22 12:27:32,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 13410304. Throughput: 0: 776.8, 1: 776.1. Samples: 3350245. Policy #0 lag: (min: 1.0, avg: 1.0, max: 1.0)
[2023-09-22 12:27:32,209][36967] Avg episode reward: [(0, '20.090'), (1, '21.640')]
[2023-09-22 12:27:35,368][38127] Updated weights for policy 1, policy_version 26240 (0.0015)
[2023-09-22 12:27:35,368][38126] Updated weights for policy 0, policy_version 26240 (0.0016)
[2023-09-22 12:27:37,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13443072. Throughput: 0: 779.4, 1: 779.3. Samples: 3359549. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:27:37,209][36967] Avg episode reward: [(0, '20.040'), (1, '21.980')]
[2023-09-22 12:27:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13475840. Throughput: 0: 777.3, 1: 778.0. Samples: 3368891. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:27:42,209][36967] Avg episode reward: [(0, '19.770'), (1, '21.560')]
[2023-09-22 12:27:47,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13500416. Throughput: 0: 778.2, 1: 775.7. Samples: 3373352. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:27:47,210][36967] Avg episode reward: [(0, '20.010'), (1, '21.220')]
[2023-09-22 12:27:48,501][38126] Updated weights for policy 0, policy_version 26400 (0.0015)
[2023-09-22 12:27:48,502][38127] Updated weights for policy 1, policy_version 26400 (0.0017)
[2023-09-22 12:27:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13533184. Throughput: 0: 781.6, 1: 781.7. Samples: 3383136. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:27:52,210][36967] Avg episode reward: [(0, '21.160'), (1, '22.850')]
[2023-09-22 12:27:57,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13565952. Throughput: 0: 780.6, 1: 779.4. Samples: 3392117. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:27:57,209][36967] Avg episode reward: [(0, '21.220'), (1, '23.950')]
[2023-09-22 12:28:01,740][38127] Updated weights for policy 1, policy_version 26560 (0.0016)
[2023-09-22 12:28:01,740][38126] Updated weights for policy 0, policy_version 26560 (0.0015)
[2023-09-22 12:28:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13598720. Throughput: 0: 779.9, 1: 779.5. Samples: 3396856. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:28:02,210][36967] Avg episode reward: [(0, '21.260'), (1, '24.820')]
[2023-09-22 12:28:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 13631488. Throughput: 0: 778.6, 1: 777.8. Samples: 3406137. Policy #0 lag: (min: 4.0, avg: 4.0, max: 4.0)
[2023-09-22 12:28:07,209][36967] Avg episode reward: [(0, '21.230'), (1, '24.410')]
[2023-09-22 12:28:12,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13656064. Throughput: 0: 781.1, 1: 781.0. Samples: 3415714. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 12:28:12,210][36967] Avg episode reward: [(0, '21.430'), (1, '23.790')]
[2023-09-22 12:28:14,796][38127] Updated weights for policy 1, policy_version 26720 (0.0018)
[2023-09-22 12:28:14,797][38126] Updated weights for policy 0, policy_version 26720 (0.0017)
[2023-09-22 12:28:17,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13688832. Throughput: 0: 777.0, 1: 778.5. Samples: 3420241. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 12:28:17,209][36967] Avg episode reward: [(0, '21.520'), (1, '24.330')]
[2023-09-22 12:28:22,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 13721600. Throughput: 0: 781.7, 1: 782.0. Samples: 3429917. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 12:28:22,209][36967] Avg episode reward: [(0, '20.980'), (1, '24.540')]
[2023-09-22 12:28:27,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13754368. Throughput: 0: 778.5, 1: 777.8. Samples: 3438923. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2023-09-22 12:28:27,209][36967] Avg episode reward: [(0, '20.260'), (1, '25.650')]
[2023-09-22 12:28:27,933][38127] Updated weights for policy 1, policy_version 26880 (0.0015)
[2023-09-22 12:28:27,933][38126] Updated weights for policy 0, policy_version 26880 (0.0017)
[2023-09-22 12:28:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 13787136. Throughput: 0: 779.3, 1: 779.2. Samples: 3443483. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:28:32,210][36967] Avg episode reward: [(0, '20.680'), (1, '24.500')]
[2023-09-22 12:28:37,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 13819904. Throughput: 0: 775.6, 1: 776.4. Samples: 3452979. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:28:37,209][36967] Avg episode reward: [(0, '21.160'), (1, '25.130')]
[2023-09-22 12:28:41,128][38126] Updated weights for policy 0, policy_version 27040 (0.0019)
[2023-09-22 12:28:41,129][38127] Updated weights for policy 1, policy_version 27040 (0.0019)
[2023-09-22 12:28:42,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 13844480. Throughput: 0: 780.6, 1: 781.2. Samples: 3462396. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:28:42,210][36967] Avg episode reward: [(0, '22.040'), (1, '26.140')]
[2023-09-22 12:28:42,212][37891] Saving new best policy, reward=26.140!
[2023-09-22 12:28:47,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 13877248. Throughput: 0: 777.6, 1: 778.2. Samples: 3466868. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:28:47,209][36967] Avg episode reward: [(0, '22.040'), (1, '26.090')]
[2023-09-22 12:28:52,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 13905920. Throughput: 0: 769.8, 1: 771.6. Samples: 3475499. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:28:52,210][36967] Avg episode reward: [(0, '21.870'), (1, '26.740')]
[2023-09-22 12:28:52,224][37891] Saving new best policy, reward=26.740!
[2023-09-22 12:28:54,833][38127] Updated weights for policy 1, policy_version 27200 (0.0015)
[2023-09-22 12:28:54,833][38126] Updated weights for policy 0, policy_version 27200 (0.0014)
[2023-09-22 12:28:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 13934592. Throughput: 0: 769.9, 1: 770.5. Samples: 3485033. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:28:57,209][36967] Avg episode reward: [(0, '21.800'), (1, '27.090')]
[2023-09-22 12:28:57,372][37891] Saving new best policy, reward=27.090!
[2023-09-22 12:29:02,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6144.0, 300 sec: 6234.3). Total num frames: 13967360. Throughput: 0: 771.9, 1: 772.8. Samples: 3489756. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:29:02,210][36967] Avg episode reward: [(0, '22.990'), (1, '26.370')]
[2023-09-22 12:29:02,211][37819] Saving new best policy, reward=22.990!
[2023-09-22 12:29:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14000128. Throughput: 0: 769.4, 1: 768.2. Samples: 3499109. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:29:07,209][36967] Avg episode reward: [(0, '22.450'), (1, '26.210')]
[2023-09-22 12:29:07,884][38127] Updated weights for policy 1, policy_version 27360 (0.0016)
[2023-09-22 12:29:07,884][38126] Updated weights for policy 0, policy_version 27360 (0.0016)
[2023-09-22 12:29:12,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14032896. Throughput: 0: 770.2, 1: 771.2. Samples: 3508285. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:29:12,209][36967] Avg episode reward: [(0, '22.700'), (1, '25.260')]
[2023-09-22 12:29:17,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14065664. Throughput: 0: 772.3, 1: 772.6. Samples: 3513006. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:29:17,210][36967] Avg episode reward: [(0, '22.760'), (1, '25.020')]
[2023-09-22 12:29:20,918][38127] Updated weights for policy 1, policy_version 27520 (0.0020)
[2023-09-22 12:29:20,918][38126] Updated weights for policy 0, policy_version 27520 (0.0017)
[2023-09-22 12:29:22,209][36967] Fps is (10 sec: 6143.8, 60 sec: 6212.2, 300 sec: 6234.2). Total num frames: 14094336. Throughput: 0: 772.6, 1: 773.7. Samples: 3522562. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:29:22,210][36967] Avg episode reward: [(0, '22.090'), (1, '24.920')]
[2023-09-22 12:29:22,221][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000027536_7049216.pth...
[2023-09-22 12:29:22,253][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000024608_6299648.pth
[2023-09-22 12:29:22,304][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000027536_7049216.pth...
[2023-09-22 12:29:22,344][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000024608_6299648.pth
[2023-09-22 12:29:27,209][36967] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14123008. Throughput: 0: 772.5, 1: 771.5. Samples: 3531876. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:29:27,209][36967] Avg episode reward: [(0, '22.260'), (1, '25.370')]
[2023-09-22 12:29:32,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6144.0, 300 sec: 6234.3). Total num frames: 14155776. Throughput: 0: 775.6, 1: 775.6. Samples: 3536670. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:29:32,209][36967] Avg episode reward: [(0, '21.550'), (1, '24.620')]
[2023-09-22 12:29:34,213][38127] Updated weights for policy 1, policy_version 27680 (0.0016)
[2023-09-22 12:29:34,213][38126] Updated weights for policy 0, policy_version 27680 (0.0017)
[2023-09-22 12:29:37,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 14188544. Throughput: 0: 780.7, 1: 779.3. Samples: 3545700. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:29:37,209][36967] Avg episode reward: [(0, '21.320'), (1, '24.880')]
[2023-09-22 12:29:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14221312. Throughput: 0: 780.2, 1: 782.0. Samples: 3555333. Policy #0 lag: (min: 13.0, avg: 13.0, max: 13.0)
[2023-09-22 12:29:42,209][36967] Avg episode reward: [(0, '20.730'), (1, '23.660')]
[2023-09-22 12:29:47,167][38127] Updated weights for policy 1, policy_version 27840 (0.0018)
[2023-09-22 12:29:47,167][38126] Updated weights for policy 0, policy_version 27840 (0.0018)
[2023-09-22 12:29:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14254080. Throughput: 0: 781.5, 1: 779.5. Samples: 3560002. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:29:47,209][36967] Avg episode reward: [(0, '19.170'), (1, '22.120')]
[2023-09-22 12:29:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 14278656. Throughput: 0: 781.5, 1: 782.6. Samples: 3569494. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:29:52,209][36967] Avg episode reward: [(0, '17.310'), (1, '22.860')]
[2023-09-22 12:29:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14311424. Throughput: 0: 782.0, 1: 780.6. Samples: 3578599. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:29:57,209][36967] Avg episode reward: [(0, '16.980'), (1, '22.490')]
[2023-09-22 12:30:00,390][38126] Updated weights for policy 0, policy_version 28000 (0.0016)
[2023-09-22 12:30:00,391][38127] Updated weights for policy 1, policy_version 28000 (0.0018)
[2023-09-22 12:30:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14344192. Throughput: 0: 781.6, 1: 781.8. Samples: 3583362. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:02,210][36967] Avg episode reward: [(0, '17.040'), (1, '22.200')]
[2023-09-22 12:30:07,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14376960. Throughput: 0: 780.1, 1: 777.6. Samples: 3592658. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:07,210][36967] Avg episode reward: [(0, '15.860'), (1, '23.580')]
[2023-09-22 12:30:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14409728. Throughput: 0: 781.2, 1: 782.7. Samples: 3602250. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:12,210][36967] Avg episode reward: [(0, '16.220'), (1, '23.410')]
[2023-09-22 12:30:13,468][38127] Updated weights for policy 1, policy_version 28160 (0.0016)
[2023-09-22 12:30:13,468][38126] Updated weights for policy 0, policy_version 28160 (0.0015)
[2023-09-22 12:30:17,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14434304. Throughput: 0: 778.0, 1: 777.6. Samples: 3606675. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:17,210][36967] Avg episode reward: [(0, '16.990'), (1, '23.670')]
[2023-09-22 12:30:22,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 14467072. Throughput: 0: 787.5, 1: 787.7. Samples: 3616586. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:22,209][36967] Avg episode reward: [(0, '17.340'), (1, '22.930')]
[2023-09-22 12:30:26,465][38127] Updated weights for policy 1, policy_version 28320 (0.0016)
[2023-09-22 12:30:26,465][38126] Updated weights for policy 0, policy_version 28320 (0.0015)
[2023-09-22 12:30:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14499840. Throughput: 0: 782.7, 1: 780.3. Samples: 3625668. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:27,210][36967] Avg episode reward: [(0, '16.670'), (1, '21.570')]
[2023-09-22 12:30:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14532608. Throughput: 0: 783.8, 1: 784.3. Samples: 3630568. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:32,209][36967] Avg episode reward: [(0, '16.620'), (1, '21.770')]
[2023-09-22 12:30:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14565376. Throughput: 0: 782.8, 1: 782.6. Samples: 3639934. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:37,210][36967] Avg episode reward: [(0, '17.280'), (1, '23.360')]
[2023-09-22 12:30:39,408][38127] Updated weights for policy 1, policy_version 28480 (0.0015)
[2023-09-22 12:30:39,410][38126] Updated weights for policy 0, policy_version 28480 (0.0013)
[2023-09-22 12:30:42,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14598144. Throughput: 0: 786.8, 1: 789.2. Samples: 3649520. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:42,209][36967] Avg episode reward: [(0, '18.480'), (1, '24.400')]
[2023-09-22 12:30:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14622720. Throughput: 0: 785.4, 1: 785.2. Samples: 3654042. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:47,210][36967] Avg episode reward: [(0, '19.250'), (1, '25.130')]
[2023-09-22 12:30:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14655488. Throughput: 0: 788.7, 1: 789.5. Samples: 3663678. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:52,210][36967] Avg episode reward: [(0, '20.330'), (1, '26.190')]
[2023-09-22 12:30:52,457][38126] Updated weights for policy 0, policy_version 28640 (0.0016)
[2023-09-22 12:30:52,457][38127] Updated weights for policy 1, policy_version 28640 (0.0017)
[2023-09-22 12:30:57,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14688256. Throughput: 0: 785.4, 1: 784.1. Samples: 3672875. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:30:57,209][36967] Avg episode reward: [(0, '20.490'), (1, '25.910')]
[2023-09-22 12:31:02,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14721024. Throughput: 0: 789.9, 1: 790.1. Samples: 3677776. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:31:02,209][36967] Avg episode reward: [(0, '21.360'), (1, '25.100')]
[2023-09-22 12:31:05,476][38126] Updated weights for policy 0, policy_version 28800 (0.0016)
[2023-09-22 12:31:05,476][38127] Updated weights for policy 1, policy_version 28800 (0.0019)
[2023-09-22 12:31:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14753792. Throughput: 0: 783.0, 1: 782.8. Samples: 3687046. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:31:07,209][36967] Avg episode reward: [(0, '21.340'), (1, '24.160')]
[2023-09-22 12:31:12,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14786560. Throughput: 0: 786.8, 1: 785.8. Samples: 3696437. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:31:12,210][36967] Avg episode reward: [(0, '21.080'), (1, '23.740')]
[2023-09-22 12:31:17,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14811136. Throughput: 0: 780.8, 1: 780.9. Samples: 3700848. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:31:17,210][36967] Avg episode reward: [(0, '20.680'), (1, '24.970')]
[2023-09-22 12:31:18,743][38127] Updated weights for policy 1, policy_version 28960 (0.0016)
[2023-09-22 12:31:18,744][38126] Updated weights for policy 0, policy_version 28960 (0.0017)
[2023-09-22 12:31:22,209][36967] Fps is (10 sec: 5734.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14843904. Throughput: 0: 783.1, 1: 782.7. Samples: 3710398. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:31:22,209][36967] Avg episode reward: [(0, '21.260'), (1, '24.150')]
[2023-09-22 12:31:22,218][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000028992_7421952.pth...
[2023-09-22 12:31:22,219][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000028992_7421952.pth...
[2023-09-22 12:31:22,253][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000026080_6676480.pth
[2023-09-22 12:31:22,254][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000026080_6676480.pth
[2023-09-22 12:31:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14876672. Throughput: 0: 779.2, 1: 776.8. Samples: 3719539. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:31:27,210][36967] Avg episode reward: [(0, '21.460'), (1, '25.110')]
[2023-09-22 12:31:31,946][38127] Updated weights for policy 1, policy_version 29120 (0.0017)
[2023-09-22 12:31:31,946][38126] Updated weights for policy 0, policy_version 29120 (0.0018)
[2023-09-22 12:31:32,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 14909440. Throughput: 0: 781.7, 1: 781.3. Samples: 3724379. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:31:32,210][36967] Avg episode reward: [(0, '20.800'), (1, '24.730')]
[2023-09-22 12:31:37,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 14942208. Throughput: 0: 776.6, 1: 776.8. Samples: 3733581. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:31:37,209][36967] Avg episode reward: [(0, '19.230'), (1, '23.500')]
[2023-09-22 12:31:42,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 14966784. Throughput: 0: 780.7, 1: 781.0. Samples: 3743152. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:31:42,210][36967] Avg episode reward: [(0, '19.760'), (1, '22.980')]
[2023-09-22 12:31:45,067][38127] Updated weights for policy 1, policy_version 29280 (0.0017)
[2023-09-22 12:31:45,067][38126] Updated weights for policy 0, policy_version 29280 (0.0018)
[2023-09-22 12:31:47,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 14999552. Throughput: 0: 777.3, 1: 778.8. Samples: 3747798. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:31:47,210][36967] Avg episode reward: [(0, '19.780'), (1, '21.240')]
[2023-09-22 12:31:52,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15032320. Throughput: 0: 777.9, 1: 778.5. Samples: 3757087. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:31:52,209][36967] Avg episode reward: [(0, '18.300'), (1, '21.170')]
[2023-09-22 12:31:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15065088. Throughput: 0: 774.2, 1: 777.7. Samples: 3766272. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:31:57,210][36967] Avg episode reward: [(0, '18.400'), (1, '20.870')]
[2023-09-22 12:31:58,236][38127] Updated weights for policy 1, policy_version 29440 (0.0017)
[2023-09-22 12:31:58,236][38126] Updated weights for policy 0, policy_version 29440 (0.0017)
[2023-09-22 12:32:02,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15089664. Throughput: 0: 777.8, 1: 777.0. Samples: 3770815. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:32:02,210][36967] Avg episode reward: [(0, '18.250'), (1, '21.240')]
[2023-09-22 12:32:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15122432. Throughput: 0: 773.5, 1: 774.2. Samples: 3780043. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:32:07,210][36967] Avg episode reward: [(0, '17.820'), (1, '22.200')]
[2023-09-22 12:32:11,628][38127] Updated weights for policy 1, policy_version 29600 (0.0015)
[2023-09-22 12:32:11,628][38126] Updated weights for policy 0, policy_version 29600 (0.0013)
[2023-09-22 12:32:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15155200. Throughput: 0: 773.5, 1: 773.7. Samples: 3789161. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:32:12,210][36967] Avg episode reward: [(0, '16.830'), (1, '20.810')]
[2023-09-22 12:32:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15187968. Throughput: 0: 774.6, 1: 774.6. Samples: 3794097. Policy #0 lag: (min: 2.0, avg: 2.0, max: 2.0)
[2023-09-22 12:32:17,210][36967] Avg episode reward: [(0, '17.230'), (1, '21.150')]
[2023-09-22 12:32:22,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15220736. Throughput: 0: 775.4, 1: 774.6. Samples: 3803332. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:32:22,210][36967] Avg episode reward: [(0, '16.450'), (1, '21.420')]
[2023-09-22 12:32:24,768][38126] Updated weights for policy 0, policy_version 29760 (0.0017)
[2023-09-22 12:32:24,768][38127] Updated weights for policy 1, policy_version 29760 (0.0017)
[2023-09-22 12:32:27,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15245312. Throughput: 0: 772.9, 1: 772.2. Samples: 3812684. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:32:27,210][36967] Avg episode reward: [(0, '18.140'), (1, '22.040')]
[2023-09-22 12:32:32,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15278080. Throughput: 0: 773.7, 1: 774.6. Samples: 3817472. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:32:32,210][36967] Avg episode reward: [(0, '17.800'), (1, '24.030')]
[2023-09-22 12:32:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15310848. Throughput: 0: 773.2, 1: 772.4. Samples: 3826639. Policy #0 lag: (min: 11.0, avg: 11.0, max: 11.0)
[2023-09-22 12:32:37,210][36967] Avg episode reward: [(0, '17.510'), (1, '24.610')]
[2023-09-22 12:32:37,953][38126] Updated weights for policy 0, policy_version 29920 (0.0016)
[2023-09-22 12:32:37,954][38127] Updated weights for policy 1, policy_version 29920 (0.0017)
[2023-09-22 12:32:42,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15343616. Throughput: 0: 774.0, 1: 773.7. Samples: 3835917. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:32:42,209][36967] Avg episode reward: [(0, '18.340'), (1, '23.790')]
[2023-09-22 12:32:47,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6234.2). Total num frames: 15372288. Throughput: 0: 773.8, 1: 774.4. Samples: 3840484. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:32:47,210][36967] Avg episode reward: [(0, '18.430'), (1, '23.870')]
[2023-09-22 12:32:51,110][38127] Updated weights for policy 1, policy_version 30080 (0.0016)
[2023-09-22 12:32:51,111][38126] Updated weights for policy 0, policy_version 30080 (0.0017)
[2023-09-22 12:32:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15400960. Throughput: 0: 779.0, 1: 778.5. Samples: 3850129. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:32:52,209][36967] Avg episode reward: [(0, '18.310'), (1, '23.970')]
[2023-09-22 12:32:57,209][36967] Fps is (10 sec: 6144.2, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15433728. Throughput: 0: 780.2, 1: 779.3. Samples: 3859339. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:32:57,209][36967] Avg episode reward: [(0, '18.850'), (1, '23.890')]
[2023-09-22 12:33:02,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 15466496. Throughput: 0: 776.2, 1: 777.4. Samples: 3864010. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:02,209][36967] Avg episode reward: [(0, '21.340'), (1, '23.750')]
[2023-09-22 12:33:04,339][38127] Updated weights for policy 1, policy_version 30240 (0.0019)
[2023-09-22 12:33:04,339][38126] Updated weights for policy 0, policy_version 30240 (0.0018)
[2023-09-22 12:33:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15499264. Throughput: 0: 775.8, 1: 775.3. Samples: 3873129. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:07,209][36967] Avg episode reward: [(0, '21.170'), (1, '24.060')]
[2023-09-22 12:33:12,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15523840. Throughput: 0: 775.6, 1: 775.9. Samples: 3882499. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:12,210][36967] Avg episode reward: [(0, '22.370'), (1, '24.170')]
[2023-09-22 12:33:17,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15556608. Throughput: 0: 773.7, 1: 773.7. Samples: 3887104. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:17,210][36967] Avg episode reward: [(0, '22.180'), (1, '24.310')]
[2023-09-22 12:33:17,654][38126] Updated weights for policy 0, policy_version 30400 (0.0019)
[2023-09-22 12:33:17,656][38127] Updated weights for policy 1, policy_version 30400 (0.0021)
[2023-09-22 12:33:22,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15589376. Throughput: 0: 774.5, 1: 775.1. Samples: 3896372. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:22,210][36967] Avg episode reward: [(0, '22.850'), (1, '25.820')]
[2023-09-22 12:33:22,222][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000030448_7794688.pth...
[2023-09-22 12:33:22,223][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000030448_7794688.pth...
[2023-09-22 12:33:22,258][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000027536_7049216.pth
[2023-09-22 12:33:22,260][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000027536_7049216.pth
[2023-09-22 12:33:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 15622144. Throughput: 0: 778.5, 1: 776.4. Samples: 3905886. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:27,210][36967] Avg episode reward: [(0, '23.280'), (1, '26.060')]
[2023-09-22 12:33:27,211][37819] Saving new best policy, reward=23.280!
[2023-09-22 12:33:30,537][38127] Updated weights for policy 1, policy_version 30560 (0.0016)
[2023-09-22 12:33:30,537][38126] Updated weights for policy 0, policy_version 30560 (0.0017)
[2023-09-22 12:33:32,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 15654912. Throughput: 0: 780.5, 1: 780.8. Samples: 3910741. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:32,209][36967] Avg episode reward: [(0, '21.780'), (1, '24.280')]
[2023-09-22 12:33:37,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15687680. Throughput: 0: 776.8, 1: 776.6. Samples: 3920035. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:37,210][36967] Avg episode reward: [(0, '21.080'), (1, '23.710')]
[2023-09-22 12:33:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15720448. Throughput: 0: 783.0, 1: 784.4. Samples: 3929872. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:42,210][36967] Avg episode reward: [(0, '21.520'), (1, '24.810')]
[2023-09-22 12:33:43,545][38126] Updated weights for policy 0, policy_version 30720 (0.0017)
[2023-09-22 12:33:43,545][38127] Updated weights for policy 1, policy_version 30720 (0.0018)
[2023-09-22 12:33:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 15745024. Throughput: 0: 780.8, 1: 780.8. Samples: 3934282. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:47,210][36967] Avg episode reward: [(0, '22.770'), (1, '23.670')]
[2023-09-22 12:33:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15777792. Throughput: 0: 786.9, 1: 786.7. Samples: 3943942. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:52,210][36967] Avg episode reward: [(0, '22.270'), (1, '24.130')]
[2023-09-22 12:33:56,597][38126] Updated weights for policy 0, policy_version 30880 (0.0015)
[2023-09-22 12:33:56,597][38127] Updated weights for policy 1, policy_version 30880 (0.0017)
[2023-09-22 12:33:57,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15810560. Throughput: 0: 784.3, 1: 784.1. Samples: 3953076. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:33:57,210][36967] Avg episode reward: [(0, '20.810'), (1, '24.360')]
[2023-09-22 12:34:02,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15843328. Throughput: 0: 788.7, 1: 785.8. Samples: 3957957. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:02,209][36967] Avg episode reward: [(0, '20.480'), (1, '25.330')]
[2023-09-22 12:34:07,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 15867904. Throughput: 0: 783.6, 1: 785.4. Samples: 3966981. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:07,210][36967] Avg episode reward: [(0, '20.130'), (1, '26.200')]
[2023-09-22 12:34:09,950][38127] Updated weights for policy 1, policy_version 31040 (0.0015)
[2023-09-22 12:34:09,951][38126] Updated weights for policy 0, policy_version 31040 (0.0017)
[2023-09-22 12:34:12,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 15900672. Throughput: 0: 782.5, 1: 782.1. Samples: 3976296. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:12,210][36967] Avg episode reward: [(0, '20.570'), (1, '26.760')]
[2023-09-22 12:34:17,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6234.3). Total num frames: 15933440. Throughput: 0: 781.7, 1: 782.0. Samples: 3981104. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:17,209][36967] Avg episode reward: [(0, '21.150'), (1, '26.110')]
[2023-09-22 12:34:22,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 15966208. Throughput: 0: 778.4, 1: 778.8. Samples: 3990108. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:22,209][36967] Avg episode reward: [(0, '19.780'), (1, '26.120')]
[2023-09-22 12:34:23,108][38127] Updated weights for policy 1, policy_version 31200 (0.0015)
[2023-09-22 12:34:23,109][38126] Updated weights for policy 0, policy_version 31200 (0.0018)
[2023-09-22 12:34:27,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 15998976. Throughput: 0: 775.3, 1: 777.3. Samples: 3999740. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:27,210][36967] Avg episode reward: [(0, '21.340'), (1, '24.600')]
[2023-09-22 12:34:32,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6212.3, 300 sec: 6234.3). Total num frames: 16027648. Throughput: 0: 777.6, 1: 776.7. Samples: 4004225. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:32,210][36967] Avg episode reward: [(0, '22.430'), (1, '25.120')]
[2023-09-22 12:34:36,411][38126] Updated weights for policy 0, policy_version 31360 (0.0015)
[2023-09-22 12:34:36,411][38127] Updated weights for policy 1, policy_version 31360 (0.0016)
[2023-09-22 12:34:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 16056320. Throughput: 0: 772.4, 1: 773.4. Samples: 4013502. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:37,209][36967] Avg episode reward: [(0, '23.020'), (1, '26.070')]
[2023-09-22 12:34:42,209][36967] Fps is (10 sec: 6143.9, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 16089088. Throughput: 0: 773.4, 1: 773.2. Samples: 4022672. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:42,210][36967] Avg episode reward: [(0, '21.780'), (1, '24.120')]
[2023-09-22 12:34:47,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16121856. Throughput: 0: 772.8, 1: 773.3. Samples: 4027532. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:47,210][36967] Avg episode reward: [(0, '21.020'), (1, '24.580')]
[2023-09-22 12:34:49,556][38126] Updated weights for policy 0, policy_version 31520 (0.0016)
[2023-09-22 12:34:49,557][38127] Updated weights for policy 1, policy_version 31520 (0.0016)
[2023-09-22 12:34:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16154624. Throughput: 0: 775.2, 1: 773.8. Samples: 4036686. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:52,210][36967] Avg episode reward: [(0, '20.990'), (1, '26.020')]
[2023-09-22 12:34:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 16179200. Throughput: 0: 778.6, 1: 779.0. Samples: 4046388. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:34:57,210][36967] Avg episode reward: [(0, '21.360'), (1, '25.360')]
[2023-09-22 12:35:02,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 16211968. Throughput: 0: 775.2, 1: 777.0. Samples: 4050953. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:35:02,209][36967] Avg episode reward: [(0, '20.380'), (1, '25.660')]
[2023-09-22 12:35:02,527][38126] Updated weights for policy 0, policy_version 31680 (0.0016)
[2023-09-22 12:35:02,528][38127] Updated weights for policy 1, policy_version 31680 (0.0016)
[2023-09-22 12:35:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 16244736. Throughput: 0: 781.2, 1: 780.3. Samples: 4060374. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:35:07,209][36967] Avg episode reward: [(0, '19.620'), (1, '24.810')]
[2023-09-22 12:35:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16277504. Throughput: 0: 776.8, 1: 774.5. Samples: 4069550. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:35:12,209][36967] Avg episode reward: [(0, '20.560'), (1, '22.840')]
[2023-09-22 12:35:15,577][38126] Updated weights for policy 0, policy_version 31840 (0.0019)
[2023-09-22 12:35:15,578][38127] Updated weights for policy 1, policy_version 31840 (0.0016)
[2023-09-22 12:35:17,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16310272. Throughput: 0: 781.0, 1: 780.9. Samples: 4074510. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:35:17,210][36967] Avg episode reward: [(0, '21.490'), (1, '21.830')]
[2023-09-22 12:35:22,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16343040. Throughput: 0: 779.9, 1: 781.1. Samples: 4083748. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:35:22,209][36967] Avg episode reward: [(0, '21.280'), (1, '21.660')]
[2023-09-22 12:35:22,218][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000031920_8171520.pth...
[2023-09-22 12:35:22,218][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000031920_8171520.pth...
[2023-09-22 12:35:22,247][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000028992_7421952.pth
[2023-09-22 12:35:22,253][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000028992_7421952.pth
[2023-09-22 12:35:27,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 16367616. Throughput: 0: 784.1, 1: 783.5. Samples: 4093214. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:35:27,210][36967] Avg episode reward: [(0, '19.390'), (1, '22.230')]
[2023-09-22 12:35:28,797][38126] Updated weights for policy 0, policy_version 32000 (0.0014)
[2023-09-22 12:35:28,797][38127] Updated weights for policy 1, policy_version 32000 (0.0015)
[2023-09-22 12:35:32,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6212.3, 300 sec: 6220.4). Total num frames: 16400384. Throughput: 0: 782.4, 1: 782.6. Samples: 4097956. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:35:32,209][36967] Avg episode reward: [(0, '18.850'), (1, '21.540')]
[2023-09-22 12:35:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 16433152. Throughput: 0: 785.7, 1: 783.6. Samples: 4107302. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:35:37,210][36967] Avg episode reward: [(0, '18.900'), (1, '22.540')]
[2023-09-22 12:35:41,741][38127] Updated weights for policy 1, policy_version 32160 (0.0016)
[2023-09-22 12:35:41,742][38126] Updated weights for policy 0, policy_version 32160 (0.0017)
[2023-09-22 12:35:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16465920. Throughput: 0: 780.4, 1: 780.3. Samples: 4116621. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:35:42,210][36967] Avg episode reward: [(0, '19.260'), (1, '22.570')]
[2023-09-22 12:35:47,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 16498688. Throughput: 0: 783.4, 1: 781.0. Samples: 4121353. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:35:47,209][36967] Avg episode reward: [(0, '19.980'), (1, '22.470')]
[2023-09-22 12:35:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16531456. Throughput: 0: 781.2, 1: 784.3. Samples: 4130820. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:35:52,210][36967] Avg episode reward: [(0, '20.900'), (1, '21.760')]
[2023-09-22 12:35:54,872][38127] Updated weights for policy 1, policy_version 32320 (0.0016)
[2023-09-22 12:35:54,872][38126] Updated weights for policy 0, policy_version 32320 (0.0016)
[2023-09-22 12:35:57,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 16556032. Throughput: 0: 786.5, 1: 787.1. Samples: 4140362. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:35:57,210][36967] Avg episode reward: [(0, '20.980'), (1, '20.970')]
[2023-09-22 12:36:02,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 16588800. Throughput: 0: 783.6, 1: 786.2. Samples: 4145152. Policy #0 lag: (min: 5.0, avg: 5.0, max: 5.0)
[2023-09-22 12:36:02,210][36967] Avg episode reward: [(0, '21.920'), (1, '21.380')]
[2023-09-22 12:36:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 16621568. Throughput: 0: 785.7, 1: 784.0. Samples: 4154383. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:36:07,209][36967] Avg episode reward: [(0, '19.100'), (1, '22.150')]
[2023-09-22 12:36:07,973][38126] Updated weights for policy 0, policy_version 32480 (0.0016)
[2023-09-22 12:36:07,973][38127] Updated weights for policy 1, policy_version 32480 (0.0015)
[2023-09-22 12:36:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16654336. Throughput: 0: 780.1, 1: 783.7. Samples: 4163585. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:36:12,210][36967] Avg episode reward: [(0, '19.250'), (1, '22.500')]
[2023-09-22 12:36:17,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16687104. Throughput: 0: 783.1, 1: 782.6. Samples: 4168410. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:36:17,209][36967] Avg episode reward: [(0, '19.700'), (1, '23.460')]
[2023-09-22 12:36:20,969][38126] Updated weights for policy 0, policy_version 32640 (0.0013)
[2023-09-22 12:36:20,970][38127] Updated weights for policy 1, policy_version 32640 (0.0017)
[2023-09-22 12:36:22,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 16711680. Throughput: 0: 782.8, 1: 786.5. Samples: 4177920. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:36:22,210][36967] Avg episode reward: [(0, '20.670'), (1, '23.890')]
[2023-09-22 12:36:27,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.6, 300 sec: 6220.4). Total num frames: 16744448. Throughput: 0: 785.5, 1: 785.3. Samples: 4187307. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:36:27,209][36967] Avg episode reward: [(0, '21.650'), (1, '24.620')]
[2023-09-22 12:36:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6220.4). Total num frames: 16777216. Throughput: 0: 786.0, 1: 785.9. Samples: 4192088. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:36:32,209][36967] Avg episode reward: [(0, '22.880'), (1, '23.640')]
[2023-09-22 12:36:34,151][38127] Updated weights for policy 1, policy_version 32800 (0.0014)
[2023-09-22 12:36:34,153][38126] Updated weights for policy 0, policy_version 32800 (0.0018)
[2023-09-22 12:36:37,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16809984. Throughput: 0: 783.0, 1: 780.6. Samples: 4201182. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:36:37,209][36967] Avg episode reward: [(0, '23.100'), (1, '24.160')]
[2023-09-22 12:36:42,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 16842752. Throughput: 0: 780.5, 1: 782.3. Samples: 4210688. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:36:42,209][36967] Avg episode reward: [(0, '22.300'), (1, '24.160')]
[2023-09-22 12:36:47,190][38127] Updated weights for policy 1, policy_version 32960 (0.0016)
[2023-09-22 12:36:47,191][38126] Updated weights for policy 0, policy_version 32960 (0.0016)
[2023-09-22 12:36:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16875520. Throughput: 0: 780.2, 1: 777.3. Samples: 4215238. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:36:47,210][36967] Avg episode reward: [(0, '23.030'), (1, '22.950')]
[2023-09-22 12:36:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 16900096. Throughput: 0: 783.6, 1: 783.6. Samples: 4224909. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:36:52,210][36967] Avg episode reward: [(0, '23.960'), (1, '23.560')]
[2023-09-22 12:36:52,378][37819] Saving new best policy, reward=23.960!
[2023-09-22 12:36:57,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16932864. Throughput: 0: 788.0, 1: 785.2. Samples: 4234375. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:36:57,209][36967] Avg episode reward: [(0, '23.660'), (1, '24.380')]
[2023-09-22 12:37:00,056][38126] Updated weights for policy 0, policy_version 33120 (0.0015)
[2023-09-22 12:37:00,056][38127] Updated weights for policy 1, policy_version 33120 (0.0018)
[2023-09-22 12:37:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16965632. Throughput: 0: 787.0, 1: 788.1. Samples: 4239293. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:37:02,210][36967] Avg episode reward: [(0, '22.790'), (1, '25.340')]
[2023-09-22 12:37:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 16998400. Throughput: 0: 785.9, 1: 784.6. Samples: 4248592. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:37:07,209][36967] Avg episode reward: [(0, '23.220'), (1, '24.500')]
[2023-09-22 12:37:12,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 17031168. Throughput: 0: 783.5, 1: 784.6. Samples: 4257869. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:37:12,209][36967] Avg episode reward: [(0, '22.970'), (1, '24.210')]
[2023-09-22 12:37:13,119][38127] Updated weights for policy 1, policy_version 33280 (0.0017)
[2023-09-22 12:37:13,119][38126] Updated weights for policy 0, policy_version 33280 (0.0016)
[2023-09-22 12:37:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17063936. Throughput: 0: 783.8, 1: 783.5. Samples: 4262617. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:37:17,209][36967] Avg episode reward: [(0, '23.100'), (1, '24.110')]
[2023-09-22 12:37:22,209][36967] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17088512. Throughput: 0: 787.1, 1: 789.4. Samples: 4272125. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:37:22,210][36967] Avg episode reward: [(0, '22.730'), (1, '23.250')]
[2023-09-22 12:37:22,315][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000033392_8548352.pth...
[2023-09-22 12:37:22,326][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000033392_8548352.pth...
[2023-09-22 12:37:22,345][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000030448_7794688.pth
[2023-09-22 12:37:22,358][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000030448_7794688.pth
[2023-09-22 12:37:26,192][38127] Updated weights for policy 1, policy_version 33440 (0.0015)
[2023-09-22 12:37:26,192][38126] Updated weights for policy 0, policy_version 33440 (0.0018)
[2023-09-22 12:37:27,209][36967] Fps is (10 sec: 5734.2, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17121280. Throughput: 0: 788.7, 1: 786.3. Samples: 4281566. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:37:27,210][36967] Avg episode reward: [(0, '21.960'), (1, '23.030')]
[2023-09-22 12:37:32,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17154048. Throughput: 0: 789.9, 1: 792.7. Samples: 4286455. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:37:32,209][36967] Avg episode reward: [(0, '21.460'), (1, '24.470')]
[2023-09-22 12:37:37,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17186816. Throughput: 0: 785.7, 1: 786.6. Samples: 4295664. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:37:37,209][36967] Avg episode reward: [(0, '22.460'), (1, '25.300')]
[2023-09-22 12:37:39,291][38127] Updated weights for policy 1, policy_version 33600 (0.0013)
[2023-09-22 12:37:39,292][38126] Updated weights for policy 0, policy_version 33600 (0.0017)
[2023-09-22 12:37:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 17219584. Throughput: 0: 782.2, 1: 784.6. Samples: 4304879. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:37:42,210][36967] Avg episode reward: [(0, '24.320'), (1, '24.650')]
[2023-09-22 12:37:42,211][37819] Saving new best policy, reward=24.320!
[2023-09-22 12:37:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17252352. Throughput: 0: 779.9, 1: 778.6. Samples: 4309424. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:37:47,210][36967] Avg episode reward: [(0, '24.850'), (1, '24.890')]
[2023-09-22 12:37:47,210][37819] Saving new best policy, reward=24.850!
[2023-09-22 12:37:52,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17276928. Throughput: 0: 784.2, 1: 784.2. Samples: 4319171. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:37:52,209][36967] Avg episode reward: [(0, '22.590'), (1, '25.620')]
[2023-09-22 12:37:52,396][38126] Updated weights for policy 0, policy_version 33760 (0.0014)
[2023-09-22 12:37:52,397][38127] Updated weights for policy 1, policy_version 33760 (0.0017)
[2023-09-22 12:37:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17309696. Throughput: 0: 785.7, 1: 784.4. Samples: 4328522. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:37:57,209][36967] Avg episode reward: [(0, '23.990'), (1, '26.870')]
[2023-09-22 12:38:02,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 17342464. Throughput: 0: 786.9, 1: 786.8. Samples: 4333432. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:38:02,209][36967] Avg episode reward: [(0, '23.560'), (1, '25.770')]
[2023-09-22 12:38:05,322][38126] Updated weights for policy 0, policy_version 33920 (0.0016)
[2023-09-22 12:38:05,322][38127] Updated weights for policy 1, policy_version 33920 (0.0018)
[2023-09-22 12:38:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17375232. Throughput: 0: 785.8, 1: 783.3. Samples: 4342737. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:38:07,210][36967] Avg episode reward: [(0, '24.030'), (1, '25.120')]
[2023-09-22 12:38:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17408000. Throughput: 0: 781.4, 1: 783.8. Samples: 4351998. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:38:12,209][36967] Avg episode reward: [(0, '23.870'), (1, '25.450')]
[2023-09-22 12:38:17,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17440768. Throughput: 0: 779.9, 1: 777.3. Samples: 4356531. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:38:17,210][36967] Avg episode reward: [(0, '23.570'), (1, '25.750')]
[2023-09-22 12:38:18,405][38126] Updated weights for policy 0, policy_version 34080 (0.0016)
[2023-09-22 12:38:18,405][38127] Updated weights for policy 1, policy_version 34080 (0.0016)
[2023-09-22 12:38:22,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 17465344. Throughput: 0: 784.2, 1: 783.7. Samples: 4366221. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:38:22,209][36967] Avg episode reward: [(0, '24.690'), (1, '26.440')]
[2023-09-22 12:38:27,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17498112. Throughput: 0: 786.7, 1: 784.8. Samples: 4375596. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:38:27,210][36967] Avg episode reward: [(0, '24.620'), (1, '27.240')]
[2023-09-22 12:38:27,211][37891] Saving new best policy, reward=27.240!
[2023-09-22 12:38:31,430][38127] Updated weights for policy 1, policy_version 34240 (0.0015)
[2023-09-22 12:38:31,431][38126] Updated weights for policy 0, policy_version 34240 (0.0017)
[2023-09-22 12:38:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17530880. Throughput: 0: 789.9, 1: 790.0. Samples: 4380518. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:38:32,209][36967] Avg episode reward: [(0, '24.510'), (1, '26.850')]
[2023-09-22 12:38:37,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17563648. Throughput: 0: 784.8, 1: 784.2. Samples: 4389778. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:38:37,209][36967] Avg episode reward: [(0, '24.940'), (1, '27.340')]
[2023-09-22 12:38:37,218][37819] Saving new best policy, reward=24.940!
[2023-09-22 12:38:37,219][37891] Saving new best policy, reward=27.340!
[2023-09-22 12:38:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17596416. Throughput: 0: 782.8, 1: 784.9. Samples: 4399072. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:38:42,210][36967] Avg episode reward: [(0, '24.480'), (1, '26.890')]
[2023-09-22 12:38:44,584][38127] Updated weights for policy 1, policy_version 34400 (0.0015)
[2023-09-22 12:38:44,585][38126] Updated weights for policy 0, policy_version 34400 (0.0017)
[2023-09-22 12:38:47,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17629184. Throughput: 0: 779.9, 1: 780.6. Samples: 4403655. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:38:47,209][36967] Avg episode reward: [(0, '24.910'), (1, '27.360')]
[2023-09-22 12:38:47,210][37891] Saving new best policy, reward=27.360!
[2023-09-22 12:38:52,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17653760. Throughput: 0: 783.7, 1: 784.4. Samples: 4413298. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:38:52,210][36967] Avg episode reward: [(0, '25.820'), (1, '26.610')]
[2023-09-22 12:38:52,219][37819] Saving new best policy, reward=25.820!
[2023-09-22 12:38:57,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17686528. Throughput: 0: 782.8, 1: 780.0. Samples: 4422323. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:38:57,210][36967] Avg episode reward: [(0, '26.720'), (1, '26.890')]
[2023-09-22 12:38:57,211][37819] Saving new best policy, reward=26.720!
[2023-09-22 12:38:57,728][38127] Updated weights for policy 1, policy_version 34560 (0.0015)
[2023-09-22 12:38:57,729][38126] Updated weights for policy 0, policy_version 34560 (0.0015)
[2023-09-22 12:39:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17719296. Throughput: 0: 785.0, 1: 785.0. Samples: 4427178. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:39:02,210][36967] Avg episode reward: [(0, '26.170'), (1, '26.100')]
[2023-09-22 12:39:07,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17752064. Throughput: 0: 782.4, 1: 781.6. Samples: 4436600. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:39:07,210][36967] Avg episode reward: [(0, '26.320'), (1, '25.610')]
[2023-09-22 12:39:10,689][38127] Updated weights for policy 1, policy_version 34720 (0.0015)
[2023-09-22 12:39:10,689][38126] Updated weights for policy 0, policy_version 34720 (0.0017)
[2023-09-22 12:39:12,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17784832. Throughput: 0: 783.5, 1: 785.7. Samples: 4446208. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:39:12,209][36967] Avg episode reward: [(0, '26.210'), (1, '25.340')]
[2023-09-22 12:39:17,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17809408. Throughput: 0: 779.4, 1: 779.4. Samples: 4450666. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:39:17,210][36967] Avg episode reward: [(0, '27.840'), (1, '25.940')]
[2023-09-22 12:39:17,286][37819] Saving new best policy, reward=27.840!
[2023-09-22 12:39:22,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 17842176. Throughput: 0: 781.8, 1: 781.5. Samples: 4460129. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:39:22,210][36967] Avg episode reward: [(0, '28.840'), (1, '26.270')]
[2023-09-22 12:39:22,223][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000034848_8921088.pth...
[2023-09-22 12:39:22,223][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000034848_8921088.pth...
[2023-09-22 12:39:22,273][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000031920_8171520.pth
[2023-09-22 12:39:22,274][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000031920_8171520.pth
[2023-09-22 12:39:22,279][37819] Saving new best policy, reward=28.840!
[2023-09-22 12:39:23,975][38126] Updated weights for policy 0, policy_version 34880 (0.0015)
[2023-09-22 12:39:23,976][38127] Updated weights for policy 1, policy_version 34880 (0.0018)
[2023-09-22 12:39:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6262.0). Total num frames: 17874944. Throughput: 0: 777.3, 1: 775.2. Samples: 4468933. Policy #0 lag: (min: 9.0, avg: 9.0, max: 9.0)
[2023-09-22 12:39:27,210][36967] Avg episode reward: [(0, '29.870'), (1, '26.180')]
[2023-09-22 12:39:27,211][37819] Saving new best policy, reward=29.870!
[2023-09-22 12:39:32,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17907712. Throughput: 0: 780.9, 1: 780.2. Samples: 4473904. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:39:32,209][36967] Avg episode reward: [(0, '29.790'), (1, '24.840')]
[2023-09-22 12:39:37,123][38126] Updated weights for policy 0, policy_version 35040 (0.0017)
[2023-09-22 12:39:37,123][38127] Updated weights for policy 1, policy_version 35040 (0.0016)
[2023-09-22 12:39:37,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 17940480. Throughput: 0: 776.6, 1: 776.3. Samples: 4483179. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:39:37,210][36967] Avg episode reward: [(0, '29.180'), (1, '23.130')]
[2023-09-22 12:39:42,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17965056. Throughput: 0: 783.3, 1: 783.8. Samples: 4492843. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:39:42,210][36967] Avg episode reward: [(0, '28.600'), (1, '22.450')]
[2023-09-22 12:39:47,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 17997824. Throughput: 0: 779.3, 1: 781.7. Samples: 4497422. Policy #0 lag: (min: 12.0, avg: 12.0, max: 12.0)
[2023-09-22 12:39:47,210][36967] Avg episode reward: [(0, '29.140'), (1, '22.360')]
[2023-09-22 12:39:50,121][38126] Updated weights for policy 0, policy_version 35200 (0.0018)
[2023-09-22 12:39:50,122][38127] Updated weights for policy 1, policy_version 35200 (0.0019)
[2023-09-22 12:39:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18030592. Throughput: 0: 780.6, 1: 781.5. Samples: 4506897. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:39:52,210][36967] Avg episode reward: [(0, '28.090'), (1, '23.770')]
[2023-09-22 12:39:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18063360. Throughput: 0: 776.7, 1: 774.0. Samples: 4515990. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:39:57,210][36967] Avg episode reward: [(0, '25.720'), (1, '23.820')]
[2023-09-22 12:40:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18096128. Throughput: 0: 780.5, 1: 781.2. Samples: 4520942. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:40:02,210][36967] Avg episode reward: [(0, '24.540'), (1, '24.340')]
[2023-09-22 12:40:03,184][38126] Updated weights for policy 0, policy_version 35360 (0.0017)
[2023-09-22 12:40:03,185][38127] Updated weights for policy 1, policy_version 35360 (0.0018)
[2023-09-22 12:40:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 18128896. Throughput: 0: 779.0, 1: 779.5. Samples: 4530261. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:40:07,209][36967] Avg episode reward: [(0, '23.330'), (1, '24.530')]
[2023-09-22 12:40:12,209][36967] Fps is (10 sec: 6144.0, 60 sec: 6212.2, 300 sec: 6262.0). Total num frames: 18157568. Throughput: 0: 790.8, 1: 790.5. Samples: 4540093. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:40:12,210][36967] Avg episode reward: [(0, '20.690'), (1, '24.840')]
[2023-09-22 12:40:16,139][38127] Updated weights for policy 1, policy_version 35520 (0.0016)
[2023-09-22 12:40:16,140][38126] Updated weights for policy 0, policy_version 35520 (0.0019)
[2023-09-22 12:40:17,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18186240. Throughput: 0: 783.9, 1: 786.0. Samples: 4544551. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:40:17,210][36967] Avg episode reward: [(0, '20.200'), (1, '25.420')]
[2023-09-22 12:40:22,209][36967] Fps is (10 sec: 6144.1, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18219008. Throughput: 0: 788.6, 1: 788.7. Samples: 4554154. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:40:22,209][36967] Avg episode reward: [(0, '19.900'), (1, '25.320')]
[2023-09-22 12:40:27,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 18251776. Throughput: 0: 785.0, 1: 784.5. Samples: 4563470. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:40:27,209][36967] Avg episode reward: [(0, '21.550'), (1, '26.720')]
[2023-09-22 12:40:29,183][38126] Updated weights for policy 0, policy_version 35680 (0.0014)
[2023-09-22 12:40:29,183][38127] Updated weights for policy 1, policy_version 35680 (0.0018)
[2023-09-22 12:40:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18284544. Throughput: 0: 786.7, 1: 785.8. Samples: 4568186. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:40:32,209][36967] Avg episode reward: [(0, '20.880'), (1, '24.860')]
[2023-09-22 12:40:37,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18317312. Throughput: 0: 781.4, 1: 783.0. Samples: 4577295. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:40:37,210][36967] Avg episode reward: [(0, '21.010'), (1, '25.820')]
[2023-09-22 12:40:42,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18341888. Throughput: 0: 786.6, 1: 786.2. Samples: 4586763. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:40:42,210][36967] Avg episode reward: [(0, '19.730'), (1, '27.050')]
[2023-09-22 12:40:42,444][38126] Updated weights for policy 0, policy_version 35840 (0.0016)
[2023-09-22 12:40:42,445][38127] Updated weights for policy 1, policy_version 35840 (0.0016)
[2023-09-22 12:40:47,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18374656. Throughput: 0: 784.2, 1: 786.2. Samples: 4591610. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:40:47,209][36967] Avg episode reward: [(0, '19.640'), (1, '28.150')]
[2023-09-22 12:40:47,210][37891] Saving new best policy, reward=28.150!
[2023-09-22 12:40:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18407424. Throughput: 0: 785.8, 1: 785.7. Samples: 4600977. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:40:52,210][36967] Avg episode reward: [(0, '19.710'), (1, '29.320')]
[2023-09-22 12:40:52,220][37891] Saving new best policy, reward=29.320!
[2023-09-22 12:40:55,566][38127] Updated weights for policy 1, policy_version 36000 (0.0019)
[2023-09-22 12:40:55,566][38126] Updated weights for policy 0, policy_version 36000 (0.0018)
[2023-09-22 12:40:57,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18440192. Throughput: 0: 776.0, 1: 778.8. Samples: 4610061. Policy #0 lag: (min: 6.0, avg: 6.0, max: 6.0)
[2023-09-22 12:40:57,210][36967] Avg episode reward: [(0, '20.340'), (1, '29.800')]
[2023-09-22 12:40:57,211][37891] Saving new best policy, reward=29.800!
[2023-09-22 12:41:02,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 18472960. Throughput: 0: 783.4, 1: 781.6. Samples: 4614977. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:41:02,209][36967] Avg episode reward: [(0, '21.820'), (1, '30.020')]
[2023-09-22 12:41:02,210][37891] Saving new best policy, reward=30.020!
[2023-09-22 12:41:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18505728. Throughput: 0: 779.4, 1: 781.4. Samples: 4624390. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:41:07,209][36967] Avg episode reward: [(0, '21.690'), (1, '29.320')]
[2023-09-22 12:41:08,402][38126] Updated weights for policy 0, policy_version 36160 (0.0016)
[2023-09-22 12:41:08,402][38127] Updated weights for policy 1, policy_version 36160 (0.0016)
[2023-09-22 12:41:12,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6212.3, 300 sec: 6248.1). Total num frames: 18530304. Throughput: 0: 783.9, 1: 784.3. Samples: 4634040. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:41:12,210][36967] Avg episode reward: [(0, '21.660'), (1, '28.870')]
[2023-09-22 12:41:17,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18563072. Throughput: 0: 783.1, 1: 784.0. Samples: 4638707. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:41:17,210][36967] Avg episode reward: [(0, '22.210'), (1, '28.500')]
[2023-09-22 12:41:21,626][38127] Updated weights for policy 1, policy_version 36320 (0.0014)
[2023-09-22 12:41:21,627][38126] Updated weights for policy 0, policy_version 36320 (0.0018)
[2023-09-22 12:41:22,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18595840. Throughput: 0: 785.7, 1: 783.9. Samples: 4647928. Policy #0 lag: (min: 3.0, avg: 3.0, max: 3.0)
[2023-09-22 12:41:22,210][36967] Avg episode reward: [(0, '21.380'), (1, '28.310')]
[2023-09-22 12:41:22,222][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000036320_9297920.pth...
[2023-09-22 12:41:22,223][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000036320_9297920.pth...
[2023-09-22 12:41:22,257][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000033392_8548352.pth
[2023-09-22 12:41:22,258][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000033392_8548352.pth
[2023-09-22 12:41:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18628608. Throughput: 0: 780.5, 1: 783.7. Samples: 4657152. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:41:27,210][36967] Avg episode reward: [(0, '20.130'), (1, '29.600')]
[2023-09-22 12:41:32,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18661376. Throughput: 0: 780.7, 1: 778.4. Samples: 4661770. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:41:32,209][36967] Avg episode reward: [(0, '21.760'), (1, '28.580')]
[2023-09-22 12:41:34,670][38126] Updated weights for policy 0, policy_version 36480 (0.0016)
[2023-09-22 12:41:34,670][38127] Updated weights for policy 1, policy_version 36480 (0.0017)
[2023-09-22 12:41:37,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 18685952. Throughput: 0: 782.5, 1: 784.4. Samples: 4671488. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:41:37,210][36967] Avg episode reward: [(0, '21.570'), (1, '29.950')]
[2023-09-22 12:41:42,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 18718720. Throughput: 0: 787.0, 1: 784.9. Samples: 4680793. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:41:42,209][36967] Avg episode reward: [(0, '22.460'), (1, '28.560')]
[2023-09-22 12:41:47,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18751488. Throughput: 0: 784.2, 1: 785.6. Samples: 4685618. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:41:47,209][36967] Avg episode reward: [(0, '24.030'), (1, '29.200')]
[2023-09-22 12:41:47,939][38126] Updated weights for policy 0, policy_version 36640 (0.0016)
[2023-09-22 12:41:47,940][38127] Updated weights for policy 1, policy_version 36640 (0.0018)
[2023-09-22 12:41:52,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.6, 300 sec: 6275.9). Total num frames: 18784256. Throughput: 0: 778.0, 1: 775.5. Samples: 4694299. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:41:52,209][36967] Avg episode reward: [(0, '24.190'), (1, '28.540')]
[2023-09-22 12:41:57,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18817024. Throughput: 0: 779.0, 1: 778.8. Samples: 4704141. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:41:57,210][36967] Avg episode reward: [(0, '24.890'), (1, '26.480')]
[2023-09-22 12:42:01,106][38127] Updated weights for policy 1, policy_version 36800 (0.0015)
[2023-09-22 12:42:01,106][38126] Updated weights for policy 0, policy_version 36800 (0.0017)
[2023-09-22 12:42:02,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 18841600. Throughput: 0: 775.0, 1: 774.0. Samples: 4708412. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:42:02,209][36967] Avg episode reward: [(0, '22.500'), (1, '24.860')]
[2023-09-22 12:42:07,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 18874368. Throughput: 0: 779.7, 1: 779.6. Samples: 4718095. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:42:07,209][36967] Avg episode reward: [(0, '22.500'), (1, '24.860')]
[2023-09-22 12:42:12,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 18907136. Throughput: 0: 778.6, 1: 776.5. Samples: 4727128. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:42:12,210][36967] Avg episode reward: [(0, '23.410'), (1, '25.090')]
[2023-09-22 12:42:14,338][38126] Updated weights for policy 0, policy_version 36960 (0.0014)
[2023-09-22 12:42:14,338][38127] Updated weights for policy 1, policy_version 36960 (0.0016)
[2023-09-22 12:42:17,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 18939904. Throughput: 0: 778.8, 1: 778.0. Samples: 4731825. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:42:17,210][36967] Avg episode reward: [(0, '23.810'), (1, '26.260')]
[2023-09-22 12:42:22,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 18964480. Throughput: 0: 773.7, 1: 773.7. Samples: 4741120. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:42:22,209][36967] Avg episode reward: [(0, '24.460'), (1, '26.510')]
[2023-09-22 12:42:27,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 18997248. Throughput: 0: 775.0, 1: 774.8. Samples: 4750536. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:42:27,210][36967] Avg episode reward: [(0, '24.040'), (1, '27.280')]
[2023-09-22 12:42:27,464][38126] Updated weights for policy 0, policy_version 37120 (0.0018)
[2023-09-22 12:42:27,464][38127] Updated weights for policy 1, policy_version 37120 (0.0017)
[2023-09-22 12:42:32,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19030016. Throughput: 0: 775.4, 1: 776.3. Samples: 4755443. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:42:32,209][36967] Avg episode reward: [(0, '24.010'), (1, '27.010')]
[2023-09-22 12:42:37,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19062784. Throughput: 0: 783.3, 1: 783.9. Samples: 4764822. Policy #0 lag: (min: 14.0, avg: 14.0, max: 14.0)
[2023-09-22 12:42:37,210][36967] Avg episode reward: [(0, '24.210'), (1, '25.740')]
[2023-09-22 12:42:40,399][38127] Updated weights for policy 1, policy_version 37280 (0.0016)
[2023-09-22 12:42:40,399][38126] Updated weights for policy 0, policy_version 37280 (0.0015)
[2023-09-22 12:42:42,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19095552. Throughput: 0: 777.6, 1: 777.9. Samples: 4774138. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:42:42,210][36967] Avg episode reward: [(0, '24.390'), (1, '25.530')]
[2023-09-22 12:42:47,209][36967] Fps is (10 sec: 6553.8, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19128320. Throughput: 0: 785.7, 1: 784.0. Samples: 4779047. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:42:47,209][36967] Avg episode reward: [(0, '23.160'), (1, '25.800')]
[2023-09-22 12:42:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19161088. Throughput: 0: 778.2, 1: 780.3. Samples: 4788228. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:42:52,210][36967] Avg episode reward: [(0, '22.180'), (1, '26.970')]
[2023-09-22 12:42:53,467][38127] Updated weights for policy 1, policy_version 37440 (0.0016)
[2023-09-22 12:42:53,467][38126] Updated weights for policy 0, policy_version 37440 (0.0015)
[2023-09-22 12:42:57,209][36967] Fps is (10 sec: 5734.2, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19185664. Throughput: 0: 787.2, 1: 786.7. Samples: 4797954. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:42:57,210][36967] Avg episode reward: [(0, '23.860'), (1, '26.250')]
[2023-09-22 12:43:02,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19218432. Throughput: 0: 784.4, 1: 787.6. Samples: 4802561. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:02,209][36967] Avg episode reward: [(0, '24.010'), (1, '27.190')]
[2023-09-22 12:43:06,661][38126] Updated weights for policy 0, policy_version 37600 (0.0016)
[2023-09-22 12:43:06,662][38127] Updated weights for policy 1, policy_version 37600 (0.0016)
[2023-09-22 12:43:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19251200. Throughput: 0: 786.3, 1: 783.1. Samples: 4811741. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:07,209][36967] Avg episode reward: [(0, '23.780'), (1, '27.400')]
[2023-09-22 12:43:12,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 19283968. Throughput: 0: 781.6, 1: 784.1. Samples: 4820992. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:12,209][36967] Avg episode reward: [(0, '25.580'), (1, '27.380')]
[2023-09-22 12:43:17,209][36967] Fps is (10 sec: 6553.4, 60 sec: 6280.5, 300 sec: 6275.9). Total num frames: 19316736. Throughput: 0: 781.2, 1: 778.2. Samples: 4825615. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:17,210][36967] Avg episode reward: [(0, '24.300'), (1, '28.690')]
[2023-09-22 12:43:19,966][38127] Updated weights for policy 1, policy_version 37760 (0.0016)
[2023-09-22 12:43:19,966][38126] Updated weights for policy 0, policy_version 37760 (0.0017)
[2023-09-22 12:43:22,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19341312. Throughput: 0: 778.9, 1: 777.7. Samples: 4834872. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:22,209][36967] Avg episode reward: [(0, '23.740'), (1, '26.590')]
[2023-09-22 12:43:22,218][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000037776_9670656.pth...
[2023-09-22 12:43:22,218][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000037776_9670656.pth...
[2023-09-22 12:43:22,254][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000034848_8921088.pth
[2023-09-22 12:43:22,255][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000034848_8921088.pth
[2023-09-22 12:43:27,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19374080. Throughput: 0: 777.1, 1: 776.1. Samples: 4844030. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:27,210][36967] Avg episode reward: [(0, '23.460'), (1, '27.120')]
[2023-09-22 12:43:32,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19406848. Throughput: 0: 776.4, 1: 777.0. Samples: 4848950. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:32,209][36967] Avg episode reward: [(0, '24.330'), (1, '27.370')]
[2023-09-22 12:43:33,032][38126] Updated weights for policy 0, policy_version 37920 (0.0015)
[2023-09-22 12:43:33,032][38127] Updated weights for policy 1, policy_version 37920 (0.0016)
[2023-09-22 12:43:37,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19439616. Throughput: 0: 776.2, 1: 773.8. Samples: 4857981. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:37,210][36967] Avg episode reward: [(0, '25.050'), (1, '26.210')]
[2023-09-22 12:43:42,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19464192. Throughput: 0: 775.8, 1: 775.1. Samples: 4867743. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:42,209][36967] Avg episode reward: [(0, '25.060'), (1, '26.300')]
[2023-09-22 12:43:46,211][38126] Updated weights for policy 0, policy_version 38080 (0.0015)
[2023-09-22 12:43:46,211][38127] Updated weights for policy 1, policy_version 38080 (0.0017)
[2023-09-22 12:43:47,209][36967] Fps is (10 sec: 5734.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19496960. Throughput: 0: 774.7, 1: 773.7. Samples: 4872239. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:47,209][36967] Avg episode reward: [(0, '25.090'), (1, '26.770')]
[2023-09-22 12:43:52,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19529728. Throughput: 0: 776.0, 1: 777.2. Samples: 4881636. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:43:52,209][36967] Avg episode reward: [(0, '26.670'), (1, '26.850')]
[2023-09-22 12:43:57,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.6, 300 sec: 6248.1). Total num frames: 19562496. Throughput: 0: 775.7, 1: 773.8. Samples: 4890720. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:43:57,209][36967] Avg episode reward: [(0, '26.090'), (1, '24.600')]
[2023-09-22 12:43:59,335][38126] Updated weights for policy 0, policy_version 38240 (0.0015)
[2023-09-22 12:43:59,336][38127] Updated weights for policy 1, policy_version 38240 (0.0018)
[2023-09-22 12:44:02,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19595264. Throughput: 0: 777.9, 1: 778.3. Samples: 4895646. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:44:02,210][36967] Avg episode reward: [(0, '23.510'), (1, '23.790')]
[2023-09-22 12:44:07,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19628032. Throughput: 0: 777.3, 1: 780.4. Samples: 4904970. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:44:07,210][36967] Avg episode reward: [(0, '24.210'), (1, '24.230')]
[2023-09-22 12:44:12,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19652608. Throughput: 0: 781.3, 1: 782.1. Samples: 4914381. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:44:12,209][36967] Avg episode reward: [(0, '24.520'), (1, '23.860')]
[2023-09-22 12:44:12,549][38126] Updated weights for policy 0, policy_version 38400 (0.0018)
[2023-09-22 12:44:12,550][38127] Updated weights for policy 1, policy_version 38400 (0.0013)
[2023-09-22 12:44:17,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19685376. Throughput: 0: 778.6, 1: 778.5. Samples: 4919020. Policy #0 lag: (min: 10.0, avg: 10.0, max: 10.0)
[2023-09-22 12:44:17,210][36967] Avg episode reward: [(0, '23.830'), (1, '23.510')]
[2023-09-22 12:44:22,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19718144. Throughput: 0: 779.7, 1: 779.7. Samples: 4928152. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:44:22,210][36967] Avg episode reward: [(0, '23.600'), (1, '23.770')]
[2023-09-22 12:44:25,801][38127] Updated weights for policy 1, policy_version 38560 (0.0017)
[2023-09-22 12:44:25,802][38126] Updated weights for policy 0, policy_version 38560 (0.0017)
[2023-09-22 12:44:27,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19750912. Throughput: 0: 776.0, 1: 776.8. Samples: 4937617. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:44:27,210][36967] Avg episode reward: [(0, '23.440'), (1, '24.240')]
[2023-09-22 12:44:32,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19775488. Throughput: 0: 776.0, 1: 774.2. Samples: 4941998. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:44:32,209][36967] Avg episode reward: [(0, '24.360'), (1, '23.550')]
[2023-09-22 12:44:37,209][36967] Fps is (10 sec: 5734.5, 60 sec: 6144.0, 300 sec: 6248.1). Total num frames: 19808256. Throughput: 0: 776.6, 1: 776.2. Samples: 4951514. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:44:37,209][36967] Avg episode reward: [(0, '24.210'), (1, '23.110')]
[2023-09-22 12:44:38,974][38127] Updated weights for policy 1, policy_version 38720 (0.0017)
[2023-09-22 12:44:38,974][38126] Updated weights for policy 0, policy_version 38720 (0.0018)
[2023-09-22 12:44:42,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19841024. Throughput: 0: 778.6, 1: 777.8. Samples: 4960760. Policy #0 lag: (min: 7.0, avg: 7.0, max: 7.0)
[2023-09-22 12:44:42,209][36967] Avg episode reward: [(0, '24.890'), (1, '22.860')]
[2023-09-22 12:44:47,209][36967] Fps is (10 sec: 6553.5, 60 sec: 6280.5, 300 sec: 6248.1). Total num frames: 19873792. Throughput: 0: 776.2, 1: 777.2. Samples: 4965552. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:44:47,210][36967] Avg episode reward: [(0, '24.470'), (1, '23.490')]
[2023-09-22 12:44:52,209][36967] Fps is (10 sec: 5734.3, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19898368. Throughput: 0: 773.5, 1: 773.7. Samples: 4974592. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:44:52,209][36967] Avg episode reward: [(0, '23.650'), (1, '24.110')]
[2023-09-22 12:44:52,283][38126] Updated weights for policy 0, policy_version 38880 (0.0015)
[2023-09-22 12:44:52,283][38127] Updated weights for policy 1, policy_version 38880 (0.0018)
[2023-09-22 12:44:57,209][36967] Fps is (10 sec: 5734.4, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19931136. Throughput: 0: 773.8, 1: 773.9. Samples: 4984029. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:44:57,210][36967] Avg episode reward: [(0, '23.050'), (1, '24.880')]
[2023-09-22 12:45:02,209][36967] Fps is (10 sec: 6553.6, 60 sec: 6144.0, 300 sec: 6220.4). Total num frames: 19963904. Throughput: 0: 775.5, 1: 777.1. Samples: 4988889. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:45:02,210][36967] Avg episode reward: [(0, '23.550'), (1, '25.620')]
[2023-09-22 12:45:05,198][38127] Updated weights for policy 1, policy_version 39040 (0.0015)
[2023-09-22 12:45:05,200][38126] Updated weights for policy 0, policy_version 39040 (0.0015)
[2023-09-22 12:45:07,209][36967] Fps is (10 sec: 6553.7, 60 sec: 6144.0, 300 sec: 6234.3). Total num frames: 19996672. Throughput: 0: 778.5, 1: 778.8. Samples: 4998228. Policy #0 lag: (min: 15.0, avg: 15.0, max: 15.0)
[2023-09-22 12:45:07,209][36967] Avg episode reward: [(0, '23.750'), (1, '25.330')]
[2023-09-22 12:45:09,123][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000039088_10006528.pth...
[2023-09-22 12:45:09,123][38131] Stopping RolloutWorker_w3...
[2023-09-22 12:45:09,123][38130] Stopping RolloutWorker_w2...
[2023-09-22 12:45:09,124][38131] Loop rollout_proc3_evt_loop terminating...
[2023-09-22 12:45:09,123][38129] Stopping RolloutWorker_w1...
[2023-09-22 12:45:09,124][38130] Loop rollout_proc2_evt_loop terminating...
[2023-09-22 12:45:09,124][38132] Stopping RolloutWorker_w4...
[2023-09-22 12:45:09,124][37819] Stopping Batcher_0...
[2023-09-22 12:45:09,124][38166] Stopping RolloutWorker_w7...
[2023-09-22 12:45:09,124][38168] Stopping RolloutWorker_w6...
[2023-09-22 12:45:09,124][38129] Loop rollout_proc1_evt_loop terminating...
[2023-09-22 12:45:09,124][38128] Stopping RolloutWorker_w0...
[2023-09-22 12:45:09,124][38166] Loop rollout_proc7_evt_loop terminating...
[2023-09-22 12:45:09,124][38167] Stopping RolloutWorker_w5...
[2023-09-22 12:45:09,124][37819] Loop batcher_evt_loop terminating...
[2023-09-22 12:45:09,124][36967] Component RolloutWorker_w3 stopped!
[2023-09-22 12:45:09,124][38132] Loop rollout_proc4_evt_loop terminating...
[2023-09-22 12:45:09,124][38168] Loop rollout_proc6_evt_loop terminating...
[2023-09-22 12:45:09,124][38128] Loop rollout_proc0_evt_loop terminating...
[2023-09-22 12:45:09,125][38167] Loop rollout_proc5_evt_loop terminating...
[2023-09-22 12:45:09,125][36967] Component RolloutWorker_w2 stopped!
[2023-09-22 12:45:09,126][36967] Component RolloutWorker_w1 stopped!
[2023-09-22 12:45:09,126][36967] Component RolloutWorker_w6 stopped!
[2023-09-22 12:45:09,127][36967] Component RolloutWorker_w4 stopped!
[2023-09-22 12:45:09,127][36967] Component Batcher_0 stopped!
[2023-09-22 12:45:09,128][36967] Component RolloutWorker_w7 stopped!
[2023-09-22 12:45:09,128][36967] Component RolloutWorker_w0 stopped!
[2023-09-22 12:45:09,129][36967] Component RolloutWorker_w5 stopped!
[2023-09-22 12:45:09,129][36967] Component Batcher_1 stopped!
[2023-09-22 12:45:09,129][37891] Stopping Batcher_1...
[2023-09-22 12:45:09,155][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000039088_10006528.pth...
[2023-09-22 12:45:09,166][38127] Weights refcount: 2 0
[2023-09-22 12:45:09,154][37891] Loop batcher_evt_loop terminating...
[2023-09-22 12:45:09,167][38127] Stopping InferenceWorker_p1-w0...
[2023-09-22 12:45:09,167][38127] Loop inference_proc1-0_evt_loop terminating...
[2023-09-22 12:45:09,167][36967] Component InferenceWorker_p1-w0 stopped!
[2023-09-22 12:45:09,168][37891] Removing ./train_atari/Assault/checkpoint_p1/checkpoint_000036320_9297920.pth
[2023-09-22 12:45:09,170][38126] Weights refcount: 2 0
[2023-09-22 12:45:09,171][38126] Stopping InferenceWorker_p0-w0...
[2023-09-22 12:45:09,171][38126] Loop inference_proc0-0_evt_loop terminating...
[2023-09-22 12:45:09,171][36967] Component InferenceWorker_p0-w0 stopped!
[2023-09-22 12:45:09,174][37891] Saving ./train_atari/Assault/checkpoint_p1/checkpoint_000039088_10006528.pth...
[2023-09-22 12:45:09,185][37819] Removing ./train_atari/Assault/checkpoint_p0/checkpoint_000036320_9297920.pth
[2023-09-22 12:45:09,189][37819] Saving ./train_atari/Assault/checkpoint_p0/checkpoint_000039088_10006528.pth...
[2023-09-22 12:45:09,226][37819] Stopping LearnerWorker_p0...
[2023-09-22 12:45:09,226][37819] Loop learner_proc0_evt_loop terminating...
[2023-09-22 12:45:09,226][36967] Component LearnerWorker_p0 stopped!
[2023-09-22 12:45:09,228][37891] Stopping LearnerWorker_p1...
[2023-09-22 12:45:09,228][36967] Component LearnerWorker_p1 stopped!
[2023-09-22 12:45:09,228][37891] Loop learner_proc1_evt_loop terminating...
[2023-09-22 12:45:09,228][36967] Waiting for process learner_proc0 to stop...
[2023-09-22 12:45:09,892][36967] Waiting for process learner_proc1 to stop...
[2023-09-22 12:45:09,921][36967] Waiting for process inference_proc0-0 to join...
[2023-09-22 12:45:09,922][36967] Waiting for process inference_proc1-0 to join...
[2023-09-22 12:45:09,922][36967] Waiting for process rollout_proc0 to join...
[2023-09-22 12:45:09,923][36967] Waiting for process rollout_proc1 to join...
[2023-09-22 12:45:09,924][36967] Waiting for process rollout_proc2 to join...
[2023-09-22 12:45:09,924][36967] Waiting for process rollout_proc3 to join...
[2023-09-22 12:45:09,925][36967] Waiting for process rollout_proc4 to join...
[2023-09-22 12:45:09,925][36967] Waiting for process rollout_proc5 to join...
[2023-09-22 12:45:09,926][36967] Waiting for process rollout_proc6 to join...
[2023-09-22 12:45:09,926][36967] Waiting for process rollout_proc7 to join...
[2023-09-22 12:45:09,927][36967] Batcher 0 profile tree view:
batching: 20.4865, releasing_batches: 1.7410
[2023-09-22 12:45:09,927][36967] Batcher 1 profile tree view:
batching: 20.3992, releasing_batches: 1.9040
[2023-09-22 12:45:09,928][36967] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0051
  wait_policy_total: 637.4509
update_model: 37.9905
  weight_update: 0.0015
one_step: 0.0013
  handle_policy_step: 2336.8938
    deserialize: 68.6114, stack: 16.4079, obs_to_device_normalize: 565.0828, forward: 1131.8403, send_messages: 96.9807
    prepare_outputs: 311.6123
      to_cpu: 158.3211
[2023-09-22 12:45:09,928][36967] InferenceWorker_p1-w0 profile tree view:
wait_policy: 0.0051
  wait_policy_total: 657.9875
update_model: 36.9281
  weight_update: 0.0014
one_step: 0.0012
  handle_policy_step: 2316.5319
    deserialize: 68.3731, stack: 16.6694, obs_to_device_normalize: 559.0204, forward: 1119.0173, send_messages: 96.6365
    prepare_outputs: 309.6070
      to_cpu: 154.8763
[2023-09-22 12:45:09,929][36967] Learner 0 profile tree view:
misc: 0.0138, prepare_batch: 31.4707
train: 457.0589
  epoch_init: 0.1111, minibatch_init: 3.5151, losses_postprocess: 59.2353, kl_divergence: 5.8397, after_optimizer: 20.4406
  calculate_losses: 49.4068
    losses_init: 0.1136, forward_head: 15.6867, bptt_initial: 0.4823, bptt: 0.5015, tail: 11.3883, advantages_returns: 3.3442, losses: 13.9577
  update: 314.0711
    clip: 165.6699
[2023-09-22 12:45:09,929][36967] Learner 1 profile tree view:
misc: 0.0148, prepare_batch: 31.1342
train: 458.0873
  epoch_init: 0.1103, minibatch_init: 3.5189, losses_postprocess: 58.7141, kl_divergence: 5.8769, after_optimizer: 20.3300
  calculate_losses: 49.6691
    losses_init: 0.1141, forward_head: 15.8471, bptt_initial: 0.4578, bptt: 0.5600, tail: 11.3186, advantages_returns: 3.3872, losses: 14.0268
  update: 315.4198
    clip: 167.7162
[2023-09-22 12:45:09,929][36967] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3943, enqueue_policy_requests: 46.3505, env_step: 1060.9142, overhead: 32.2719, complete_rollouts: 1.0905
save_policy_outputs: 59.0750
  split_output_tensors: 20.2698
[2023-09-22 12:45:09,930][36967] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3873, enqueue_policy_requests: 44.9708, env_step: 1050.8589, overhead: 30.8507, complete_rollouts: 1.1124
save_policy_outputs: 57.3179
  split_output_tensors: 19.6522
[2023-09-22 12:45:09,930][36967] Loop Runner_EvtLoop terminating...
[2023-09-22 12:45:09,931][36967] Runner profile tree view:
main_loop: 3223.3925
[2023-09-22 12:45:09,931][36967] Collected {0: 10006528, 1: 10006528}, FPS: 6208.7