Commit b4567c3 (chirag2706): gpt2 code generation model added
2020-06-11 20:32:12,836 - crisis_transformers.trainer - INFO - Use pytorch device: cuda, with gpu_number=2
2020-06-11 20:32:14,855 - crisis_transformers.trainer - INFO - Warmup-steps: 55716
2020-06-11 20:32:14,856 - crisis_transformers.trainer - INFO - ***** Running training *****
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Num of training examples (actually iterations per epoch for Iterable Dataset) = 69642
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Steps per Epoch = 17411 or iterations per epoch = 17411
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Num of Epochs = 16
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Best score (perplexity) = -inf
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Eval every 400 steps or every 400 iterations
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Early stop = 20
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Total optimization steps = 278576
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
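The startup numbers above are internally consistent. A minimal sketch of the arithmetic, assuming the trainer rounds partial batches up (variable names here are illustrative, not taken from the trainer code):

```python
import math

# Quantities reported at startup (copied from the log above).
num_examples = 69642          # iterations per epoch for the IterableDataset
per_gpu_batch_size = 2
n_gpu = 2
grad_accum_steps = 1
num_epochs = 16

# Effective input batch size across both GPUs.
input_batch_size = per_gpu_batch_size * n_gpu   # 4

# Optimizer steps per epoch: examples divided by the effective batch,
# rounded up so the final partial batch still triggers a step.
steps_per_epoch = math.ceil(num_examples / (input_batch_size * grad_accum_steps))
total_optimization_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)           # 17411, as logged
print(total_optimization_steps)  # 278576, as logged

# The reported warmup of 55716 steps is consistent with a 20% warmup
# ratio (an assumption; the ratio is not stated in the log itself).
print(math.ceil(0.2 * total_optimization_steps))  # 55716
```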
2020-06-11 20:39:27,466 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=400
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Best score (perplexity) = -96754087231488.0
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 13s
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Steps = 400/278576
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - dev_loss = 32.203194 || dev_eval_scores = {'perplexity': 96754087231488.0}
2020-06-11 20:39:28,626 - crisis_transformers.trainer - INFO - train_loss = 34.87022018432617
2020-06-11 20:39:28,627 - crisis_transformers.trainer - INFO -
********************************************
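In each evaluation report the perplexity score is simply the exponential of the dev loss. Checking that against the first report above:

```python
import math

# dev_loss from the first evaluation report in the log.
dev_loss = 32.203194

# Perplexity for a language model is exp(cross-entropy loss).
perplexity = math.exp(dev_loss)

print(f"{perplexity:.4e}")  # ≈ 9.6754e+13, matching the reported 96754087231488.0
```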
2020-06-11 20:46:40,538 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=800
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Best score (perplexity) = -11848.5537109375
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-11 20:46:44,659 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 20:46:44,660 - crisis_transformers.trainer - INFO - Steps = 800/278576
2020-06-11 20:46:44,660 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 20:46:44,660 - crisis_transformers.trainer - INFO - dev_loss = 9.379961 || dev_eval_scores = {'perplexity': 11848.5537109375}
2020-06-11 20:46:44,660 - crisis_transformers.trainer - INFO - train_loss = 22.25151824951172
2020-06-11 20:46:44,660 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 20:53:56,085 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=1200
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Best score (perplexity) = -82.30923461914062
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Steps = 1200/278576
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 20:53:59,828 - crisis_transformers.trainer - INFO - dev_loss = 4.410483 || dev_eval_scores = {'perplexity': 82.30923461914062}
2020-06-11 20:53:59,829 - crisis_transformers.trainer - INFO - train_loss = 15.844202995300293
2020-06-11 20:53:59,829 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Best score (perplexity) = -82.30923461914062
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:01:11,585 - crisis_transformers.trainer - INFO - Steps = 1600/278576
2020-06-11 21:01:11,586 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:01:11,586 - crisis_transformers.trainer - INFO - dev_loss = 4.528600 || dev_eval_scores = {'perplexity': 92.62876892089844}
2020-06-11 21:01:11,586 - crisis_transformers.trainer - INFO - train_loss = 12.387160301208496
2020-06-11 21:01:11,586 - crisis_transformers.trainer - INFO -
********************************************
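The "Best score (perplexity)" lines report a negated value: storing -perplexity makes every metric a "higher is better" score (and explains the -inf initialisation), and the early-stop counter increments whenever an evaluation fails to beat the best. A sketch of that bookkeeping, replaying the first four evaluations above (names are illustrative, not the trainer's actual code):

```python
# Early-stopping bookkeeping as suggested by the log: scores are negated
# perplexities, and the counter resets whenever the best score improves.
best_score = float("-inf")   # matches the initial "Best score ... = -inf"
early_stop_count = 0
patience = 20                # "Early stop = 20"

def record_eval(perplexity):
    """Update best score and early-stop counter for one evaluation round."""
    global best_score, early_stop_count
    score = -perplexity
    if score > best_score:
        best_score = score
        early_stop_count = 0     # a checkpoint would be saved here
    else:
        early_stop_count += 1    # no improvement this round
    return early_stop_count >= patience

# Dev perplexities from the first four evaluation reports in the log.
for p in (96754087231488.0, 11848.5537109375, 82.30923461914062, 92.62876892089844):
    record_eval(p)

print(best_score)        # -82.30923461914062: step 1600 did not improve
print(early_stop_count)  # 1, matching "Early stop count = 1/20" at step 1600
```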
2020-06-11 21:08:22,662 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:08:26,396 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=2000
2020-06-11 21:08:26,396 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Best score (perplexity) = -13.568625450134277
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Steps = 2000/278576
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - dev_loss = 2.607760 || dev_eval_scores = {'perplexity': 13.568625450134277}
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO - train_loss = 10.256034851074219
2020-06-11 21:08:26,397 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 21:15:38,560 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:15:42,293 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=2400
2020-06-11 21:15:42,293 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:15:42,293 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Best score (perplexity) = -8.842060089111328
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Steps = 2400/278576
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - dev_loss = 2.179520 || dev_eval_scores = {'perplexity': 8.842060089111328}
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO - train_loss = 8.810672760009766
2020-06-11 21:15:42,294 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 21:22:54,836 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=2800
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Best score (perplexity) = -6.2656636238098145
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Steps = 2800/278576
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - dev_loss = 1.835085 || dev_eval_scores = {'perplexity': 6.2656636238098145}
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO - train_loss = 7.770569324493408
2020-06-11 21:22:58,961 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 21:30:11,274 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:30:11,274 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Best score (perplexity) = -6.2656636238098145
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Steps = 3200/278576
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - dev_loss = 1.864289 || dev_eval_scores = {'perplexity': 6.451348781585693}
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO - train_loss = 6.985172748565674
2020-06-11 21:30:11,275 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 21:37:22,915 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=3600
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Best score (perplexity) = -4.507174015045166
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Steps = 3600/278576
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:37:27,050 - crisis_transformers.trainer - INFO - dev_loss = 1.505670 || dev_eval_scores = {'perplexity': 4.507174015045166}
2020-06-11 21:37:27,051 - crisis_transformers.trainer - INFO - train_loss = 6.368422985076904
2020-06-11 21:37:27,051 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 21:44:38,761 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=4000
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Best score (perplexity) = -4.046299457550049
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Steps = 4000/278576
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - dev_loss = 1.397803 || dev_eval_scores = {'perplexity': 4.046299457550049}
2020-06-11 21:44:42,509 - crisis_transformers.trainer - INFO - train_loss = 5.87202262878418
2020-06-11 21:44:42,510 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 21:51:54,477 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:51:58,342 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=4400
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.7213120460510254
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Steps = 4400/278576
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - dev_loss = 1.314076 || dev_eval_scores = {'perplexity': 3.7213120460510254}
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO - train_loss = 5.462299346923828
2020-06-11 21:51:58,343 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 21:59:10,274 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=4800
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.609790325164795
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Steps = 4800/278576
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 21:59:14,545 - crisis_transformers.trainer - INFO - dev_loss = 1.283650 || dev_eval_scores = {'perplexity': 3.609790325164795}
2020-06-11 21:59:14,546 - crisis_transformers.trainer - INFO - train_loss = 5.118622303009033
2020-06-11 21:59:14,546 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 22:06:26,200 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=5200
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.5901994705200195
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Steps = 5200/278576
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 22:06:29,929 - crisis_transformers.trainer - INFO - dev_loss = 1.278208 || dev_eval_scores = {'perplexity': 3.5901994705200195}
2020-06-11 22:06:29,930 - crisis_transformers.trainer - INFO - train_loss = 4.828340530395508
2020-06-11 22:06:29,930 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 22:13:41,909 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=5600
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.394659996032715
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Steps = 5600/278576
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 22:13:46,201 - crisis_transformers.trainer - INFO - dev_loss = 1.222204 || dev_eval_scores = {'perplexity': 3.394659996032715}
2020-06-11 22:13:46,202 - crisis_transformers.trainer - INFO - train_loss = 4.5767903327941895
2020-06-11 22:13:46,202 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.394659996032715
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 22:20:57,774 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-11 22:20:57,775 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 22:20:57,775 - crisis_transformers.trainer - INFO - Steps = 6000/278576
2020-06-11 22:20:57,775 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 22:20:57,775 - crisis_transformers.trainer - INFO - dev_loss = 1.227896 || dev_eval_scores = {'perplexity': 3.41403865814209}
2020-06-11 22:20:57,775 - crisis_transformers.trainer - INFO - train_loss = 4.356290340423584
2020-06-11 22:20:57,775 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 22:28:09,137 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:28:13,001 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=6400
2020-06-11 22:28:13,001 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.263387680053711
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Steps = 6400/278576
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - dev_loss = 1.182766 || dev_eval_scores = {'perplexity': 3.263387680053711}
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - train_loss = 4.162957668304443
2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO -
********************************************
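The reported `perplexity` values track `dev_loss` exactly as exp(dev_loss), the standard definition for a language-modeling cross-entropy loss. A minimal check against the step-6400 report above (the function name is illustrative, not the trainer's actual API):

```python
import math

def perplexity(dev_loss: float) -> float:
    # Perplexity of a language model is the exponential of its
    # mean cross-entropy loss per token.
    return math.exp(dev_loss)

# Values from the evaluation report at step 6400:
# dev_loss = 1.182766 || perplexity = 3.263387680053711
print(perplexity(1.182766))  # ~3.2634
```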
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.263387680053711
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Steps = 6800/278576
2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - dev_loss = 1.188739 || dev_eval_scores = {'perplexity': 3.2829389572143555}
2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - train_loss = 3.991642713546753
2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO -
********************************************
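The early-stop bookkeeping can be read off the log: the trainer appears to store the *negated* perplexity as its score (hence "Best score (perplexity) = -3.26..." and the initial "-inf"), so higher is better; a non-improving evaluation increments the counter (step 6800 above, "Early stop count = 1/20"), and an improving one resets it and triggers a checkpoint save. A hypothetical sketch of that logic, not the trainer's actual code:

```python
class EarlyStopper:
    """Track a maximized score (-perplexity) with patience-based stopping."""

    def __init__(self, patience: int = 20):
        self.patience = patience
        self.best = float("-inf")  # matches "Best score (perplexity) = -inf" at start
        self.count = 0

    def step(self, perplexity: float) -> bool:
        """Record one evaluation; return True when training should stop."""
        score = -perplexity          # negate so lower perplexity => higher score
        if score > self.best:
            self.best = score        # new best: save checkpoint, reset counter
            self.count = 0
        else:
            self.count += 1          # e.g. "Early stop count = 1/20" at step 6800
        return self.count >= self.patience

stopper = EarlyStopper(patience=20)
stopper.step(3.263387680053711)   # step 6400: improves, count -> 0
stopper.step(3.2829389572143555)  # step 6800: worse, count -> 1
print(stopper.count)              # 1
```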
2020-06-11 22:42:37,188 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=7200
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.1454687118530273
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - Steps = 7200/278576
2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - dev_loss = 1.145963 || dev_eval_scores = {'perplexity': 3.1454687118530273}
2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - train_loss = 3.8384640216827393
2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 22:49:53,052 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=7600
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.0853919982910156
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Steps = 7600/278576
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - dev_loss = 1.126679 || dev_eval_scores = {'perplexity': 3.0853919982910156}
2020-06-11 22:49:56,892 - crisis_transformers.trainer - INFO - train_loss = 3.701063871383667
2020-06-11 22:49:56,892 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 22:57:08,463 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:57:12,524 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8000
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.04101824760437
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Steps = 8000/278576
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - dev_loss = 1.112192 || dev_eval_scores = {'perplexity': 3.04101824760437}
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - train_loss = 3.575878143310547
2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 23:04:23,771 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8400
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.996488571166992
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Steps = 8400/278576
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - dev_loss = 1.097441 || dev_eval_scores = {'perplexity': 2.996488571166992}
2020-06-11 23:04:27,584 - crisis_transformers.trainer - INFO - train_loss = 3.4622840881347656
2020-06-11 23:04:27,584 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 23:11:39,153 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:11:43,019 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8800
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.9609262943267822
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Steps = 8800/278576
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - dev_loss = 1.085502 || dev_eval_scores = {'perplexity': 2.9609262943267822}
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - train_loss = 3.3580634593963623
2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 23:18:54,591 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=9200
2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.9230592250823975
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Steps = 9200/278576
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - dev_loss = 1.072631 || dev_eval_scores = {'perplexity': 2.9230592250823975}
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - train_loss = 3.2615966796875
2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 23:26:10,441 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=9600
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.886868715286255
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Steps = 9600/278576
2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - dev_loss = 1.060172 || dev_eval_scores = {'perplexity': 2.886868715286255}
2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - train_loss = 3.1728854179382324
2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 23:33:25,705 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10000
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.836120128631592
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Steps = 10000/278576
2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - dev_loss = 1.042437 || dev_eval_scores = {'perplexity': 2.836120128631592}
2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - train_loss = 3.091641664505005
2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 23:40:41,837 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10400
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.8059732913970947
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Steps = 10400/278576
2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - dev_loss = 1.031750 || dev_eval_scores = {'perplexity': 2.8059732913970947}
2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - train_loss = 3.0152587890625
2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 23:47:57,612 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10800
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.772104263305664
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Steps = 10800/278576
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - dev_loss = 1.019607 || dev_eval_scores = {'perplexity': 2.772104263305664}
2020-06-11 23:48:02,019 - crisis_transformers.trainer - INFO - train_loss = 2.9444949626922607
2020-06-11 23:48:02,019 - crisis_transformers.trainer - INFO -
********************************************
2020-06-11 23:55:13,323 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=11200
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.7492218017578125
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Steps = 11200/278576
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - dev_loss = 1.011318 || dev_eval_scores = {'perplexity': 2.7492218017578125}
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - train_loss = 2.8781516551971436
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 00:02:29,325 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=11600
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.7139976024627686
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Steps = 11600/278576
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - dev_loss = 0.998423 || dev_eval_scores = {'perplexity': 2.7139976024627686}
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - train_loss = 2.816199541091919
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 00:09:45,484 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12000
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.68788480758667
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Steps = 12000/278576
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - dev_loss = 0.988755 || dev_eval_scores = {'perplexity': 2.68788480758667}
2020-06-12 00:09:49,445 - crisis_transformers.trainer - INFO - train_loss = 2.7574315071105957
2020-06-12 00:09:49,445 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 00:17:00,535 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12400
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.654526710510254
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Steps = 12400/278576
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - dev_loss = 0.976266 || dev_eval_scores = {'perplexity': 2.654526710510254}
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - train_loss = 2.7026596069335938
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 00:24:16,075 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12800
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.6282081604003906
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - Steps = 12800/278576
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - dev_loss = 0.966302 || dev_eval_scores = {'perplexity': 2.6282081604003906}
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - train_loss = 2.650250196456909
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO -
********************************************
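Each evaluation block above reports both a `dev_loss` and a `perplexity` score, and the two are consistent with the standard definition: perplexity is `exp` of the per-token cross-entropy loss. A minimal sketch checking this against the step-12800 report (the function name is illustrative, not taken from the trainer):

```python
import math

# Perplexity of a language model is exp(cross-entropy loss per token).
def perplexity(dev_loss: float) -> float:
    return math.exp(dev_loss)

# Values from the evaluation report at step 12800:
dev_loss = 0.966302
print(perplexity(dev_loss))  # ~2.62821, matching the logged 2.6282081604003906
```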
2020-06-12 00:31:31,505 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=13200
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.6085095405578613
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Steps = 13200/278576
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - dev_loss = 0.958779 || dev_eval_scores = {'perplexity': 2.6085095405578613}
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - train_loss = 2.600867986679077
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 00:38:46,576 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:38:50,406 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=13600
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.572343587875366
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Steps = 13600/278576
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - dev_loss = 0.944817 || dev_eval_scores = {'perplexity': 2.572343587875366}
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - train_loss = 2.55434513092041
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO -
********************************************
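The `Best score (perplexity)` lines are negative (starting from `-inf`) while the logged perplexities are positive, which suggests the trainer negates lower-is-better metrics so a single higher-is-better comparison drives both checkpointing and the `Early stop count`. A hedged sketch of that bookkeeping, with illustrative names not taken from the trainer:

```python
# Early-stopping bookkeeping for a lower-is-better metric (perplexity).
# Negating the metric lets one "higher score is better" comparison
# decide both "save checkpoint" and "bump the early-stop counter".
class EarlyStopTracker:
    def __init__(self, patience: int = 20):
        self.best_score = float("-inf")  # matches "Best score (perplexity) = -inf" at startup
        self.patience = patience
        self.count = 0                   # matches "Early stop count = 0/20"

    def update(self, ppl: float) -> bool:
        """Return True if this evaluation is a new best (i.e. checkpoint-worthy)."""
        score = -ppl
        improved = score > self.best_score
        if improved:
            self.best_score = score
            self.count = 0
        else:
            self.count += 1
        return improved

    def should_stop(self) -> bool:
        return self.count >= self.patience
```

With the perplexities logged above, every evaluation improves on the last, so the counter stays at 0/20 throughout this stretch of training.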
2020-06-12 00:46:01,097 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14000
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.5420730113983154
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Steps = 14000/278576
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - dev_loss = 0.932980 || dev_eval_scores = {'perplexity': 2.5420730113983154}
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - train_loss = 2.509788751602173
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 00:53:16,943 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14400
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.508371114730835
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Steps = 14400/278576
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - dev_loss = 0.919634 || dev_eval_scores = {'perplexity': 2.508371114730835}
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - train_loss = 2.4678986072540283
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:00:31,540 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14800
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4876623153686523
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Steps = 14800/278576
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - dev_loss = 0.911343 || dev_eval_scores = {'perplexity': 2.4876623153686523}
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - train_loss = 2.428267002105713
2020-06-12 01:00:35,780 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:07:47,187 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=15200
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.46309757232666
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Steps = 15200/278576
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - dev_loss = 0.901420 || dev_eval_scores = {'perplexity': 2.46309757232666}
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - train_loss = 2.3902571201324463
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:15:02,762 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=15600
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4311938285827637
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Steps = 15600/278576
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - dev_loss = 0.888382 || dev_eval_scores = {'perplexity': 2.4311938285827637}
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - train_loss = 2.354424238204956
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:22:18,392 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16000
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4099924564361572
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Steps = 16000/278576
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - dev_loss = 0.879624 || dev_eval_scores = {'perplexity': 2.4099924564361572}
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - train_loss = 2.319091796875
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:29:33,829 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:29:37,786 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16400
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3866090774536133
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Steps = 16400/278576
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - dev_loss = 0.869874 || dev_eval_scores = {'perplexity': 2.3866090774536133}
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - train_loss = 2.285768747329712
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:36:48,893 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16800
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3575546741485596
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Steps = 16800/278576
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - dev_loss = 0.857625 || dev_eval_scores = {'perplexity': 2.3575546741485596}
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - train_loss = 2.253655433654785
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:44:04,158 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=17200
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3273518085479736
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Steps = 17200/278576
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - dev_loss = 0.844731 || dev_eval_scores = {'perplexity': 2.3273518085479736}
2020-06-12 01:44:08,001 - crisis_transformers.trainer - INFO - train_loss = 2.22310733795166
2020-06-12 01:44:08,001 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:46:08,626 - crisis_transformers.trainer - INFO - epoch 1 ends, 15 epochs left
2020-06-12 01:46:08,628 - crisis_transformers.trainer - INFO -
global_average_loss=2.207577705383301,global_steps=17411 on training set
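The step counts in these reports are internally consistent: 69642 iterations per epoch at an effective batch size of 4 (2 per GPU × 2 GPUs) give ceil(69642 / 4) = 17411 optimizer steps per epoch, and 17411 × 16 epochs = 278576 total optimization steps, matching `Steps = .../278576` throughout. A quick arithmetic check:

```python
import math

examples_per_epoch = 69642  # iterations reported for the IterableDataset
batch_size = 4              # 2 per GPU x 2 GPUs, as logged
epochs = 16

# 69642 / 4 = 17410.5, so the last partial batch adds one more step.
steps_per_epoch = math.ceil(examples_per_epoch / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 17411 278576, matching the log
```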
2020-06-12 01:51:18,826 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=189
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.306220054626465
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Steps = 17600/278576
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO - dev_loss = 0.835610 || dev_eval_scores = {'perplexity': 2.306220054626465}
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO - train_loss = 0.8919339776039124
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 01:58:34,239 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=589
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.282672166824341
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Steps = 18000/278576
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - dev_loss = 0.825347 || dev_eval_scores = {'perplexity': 2.282672166824341}
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - train_loss = 0.9102271199226379
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 02:05:49,927 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:05:53,796 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=989
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.261589288711548
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Steps = 18400/278576
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - dev_loss = 0.816068 || dev_eval_scores = {'perplexity': 2.261589288711548}
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - train_loss = 0.9082818031311035
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO -
********************************************
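Note the jump in `train_loss` at the epoch boundary: it falls smoothly from 2.70 to 2.22 across epoch 1, then the first evaluation of epoch 2 reports 0.89. This is consistent with the reported `train_loss` being a running average that resets at each epoch boundary; that reading is an assumption from the log, not confirmed trainer internals. A minimal sketch of such a tracker:

```python
# Assumed behavior: a per-epoch running average of batch losses,
# reset to empty when a new epoch starts. This explains the drop
# from 2.22 to 0.89 across the epoch-1/epoch-2 boundary above.
class RunningLoss:
    def __init__(self):
        self.total = 0.0
        self.n = 0

    def add(self, batch_loss: float) -> None:
        self.total += batch_loss
        self.n += 1

    @property
    def average(self) -> float:
        return self.total / self.n

    def reset(self) -> None:  # called at each epoch boundary
        self.total, self.n = 0.0, 0
```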
2020-06-12 02:13:04,838 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=1389
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.2358696460723877
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Steps = 18800/278576
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - dev_loss = 0.804630 || dev_eval_scores = {'perplexity': 2.2358696460723877}
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - train_loss = 0.9062302708625793
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 02:20:19,901 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=1789
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.210675001144409
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Steps = 19200/278576
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - dev_loss = 0.793298 || dev_eval_scores = {'perplexity': 2.210675001144409}
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - train_loss = 0.8982809782028198
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 02:27:35,383 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2189
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.1900641918182373
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Steps = 19600/278576
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - dev_loss = 0.783931 || dev_eval_scores = {'perplexity': 2.1900641918182373}
2020-06-12 02:27:39,230 - crisis_transformers.trainer - INFO - train_loss = 0.8898405432701111
2020-06-12 02:27:39,230 - crisis_transformers.trainer - INFO -
********************************************
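The run-level totals in this log are also internally consistent: with 69642 iterations per epoch and an effective batch of 2 per GPU across 2 GPUs, the steps per epoch round up to 17411, and 16 epochs yield the 278576 total optimization steps reported at startup. A sketch of that arithmetic (an assumption inferred from the logged values, not the trainer's code):

```python
import math

# Values taken from the log header; the arithmetic linking them is assumed.
per_gpu_batch = 2
n_gpu = 2
iterations_per_epoch = 69642
num_epochs = 16

effective_batch = per_gpu_batch * n_gpu                      # 4
steps_per_epoch = math.ceil(iterations_per_epoch / effective_batch)  # 17411
total_steps = steps_per_epoch * num_epochs                   # 278576
```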
2020-06-12 02:34:50,458 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2589
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.169043779373169
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Steps = 20000/278576
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - dev_loss = 0.774286 || dev_eval_scores = {'perplexity': 2.169043779373169}
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - train_loss = 0.8851494789123535
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 02:42:06,145 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2989
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.140289783477783
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Steps = 20400/278576
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - dev_loss = 0.760941 || dev_eval_scores = {'perplexity': 2.140289783477783}
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - train_loss = 0.8793467283248901
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 02:49:21,789 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=3389
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.121040105819702
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Steps = 20800/278576
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - dev_loss = 0.751907 || dev_eval_scores = {'perplexity': 2.121040105819702}
2020-06-12 02:49:25,275 - crisis_transformers.trainer - INFO - train_loss = 0.871684730052948
2020-06-12 02:49:25,275 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 02:56:37,107 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:56:41,332 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=3789
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0992023944854736
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Steps = 21200/278576
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - dev_loss = 0.741557 || dev_eval_scores = {'perplexity': 2.0992023944854736}
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - train_loss = 0.8670705556869507
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 03:03:52,858 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=4189
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0718014240264893
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Steps = 21600/278576
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - dev_loss = 0.728418 || dev_eval_scores = {'perplexity': 2.0718014240264893}
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - train_loss = 0.8624909520149231
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 03:11:08,681 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:11:12,544 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=4589
2020-06-12 03:11:12,544 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:11:12,544 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:11:12,544 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:11:12,544 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:11:12,544 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0530827045440674
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Steps = 22000/278576
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - dev_loss = 0.719342 || dev_eval_scores = {'perplexity': 2.0530827045440674}
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - train_loss = 0.8574584126472473
2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 03:18:23,725 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:18:27,499 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=4989
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0336155891418457
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Steps = 22400/278576
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - dev_loss = 0.709815 || dev_eval_scores = {'perplexity': 2.0336155891418457}
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - train_loss = 0.8504697680473328
2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 03:25:39,057 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=5389
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0098047256469727
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Steps = 22800/278576
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - dev_loss = 0.698038 || dev_eval_scores = {'perplexity': 2.0098047256469727}
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - train_loss = 0.8460281491279602
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 03:32:54,836 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:32:58,726 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=5789
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9868273735046387
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Steps = 23200/278576
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - dev_loss = 0.686539 || dev_eval_scores = {'perplexity': 1.9868273735046387}
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - train_loss = 0.8407670855522156
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 03:40:09,514 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6189
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9708765745162964
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Steps = 23600/278576
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - dev_loss = 0.678478 || dev_eval_scores = {'perplexity': 1.9708765745162964}
2020-06-12 03:40:13,426 - crisis_transformers.trainer - INFO - train_loss = 0.8364319205284119
2020-06-12 03:40:13,426 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 03:47:25,312 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6589
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.950257420539856
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Steps = 24000/278576
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - dev_loss = 0.667961 || dev_eval_scores = {'perplexity': 1.950257420539856}
2020-06-12 03:47:28,666 - crisis_transformers.trainer - INFO - train_loss = 0.8307074308395386
2020-06-12 03:47:28,666 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 03:54:40,261 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:54:44,220 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6989
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9251375198364258
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Steps = 24400/278576
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - dev_loss = 0.654997 || dev_eval_scores = {'perplexity': 1.9251375198364258}
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - train_loss = 0.8253822922706604
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 04:01:55,402 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:01:59,672 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=7389
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9091284275054932
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Steps = 24800/278576
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - dev_loss = 0.646647 || dev_eval_scores = {'perplexity': 1.9091284275054932}
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - train_loss = 0.8205176591873169
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 04:09:10,549 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=7789
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8884336948394775
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Steps = 25200/278576
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO - dev_loss = 0.635748 || dev_eval_scores = {'perplexity': 1.8884336948394775}
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO - train_loss = 0.8162734508514404
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 04:16:26,145 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8189
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8698856830596924
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Steps = 25600/278576
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - dev_loss = 0.625877 || dev_eval_scores = {'perplexity': 1.8698856830596924}
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - train_loss = 0.811458170413971
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO -
********************************************
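The perplexity in each evaluation block appears to be simply the exponential of `dev_loss` (natural base), up to the rounding of `dev_loss` in the log; this can be checked against the values in the report above:

```python
import math

# dev_loss and perplexity copied from the evaluation report above
dev_loss = 0.625877
reported_perplexity = 1.8698856830596924

# perplexity = exp(dev_loss); agreement is limited only by the
# six-decimal rounding of dev_loss in the log line
assert abs(math.exp(dev_loss) - reported_perplexity) < 1e-4
```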
2020-06-12 04:23:41,295 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8589
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8487507104873657
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Steps = 26000/278576
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - dev_loss = 0.614510 || dev_eval_scores = {'perplexity': 1.8487507104873657}
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - train_loss = 0.8066449761390686
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO -
********************************************
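The step/epoch bookkeeping in these reports is internally consistent: with 69642 training iterations per epoch and an input batch size of 4 (2 per GPU × 2 GPUs), there are 17411 optimizer steps per epoch and 17411 × 16 = 278576 total optimization steps, so a global step such as 26000 falls in epoch 2 of 16. A minimal sketch of that arithmetic (variable names are illustrative, not the trainer's):

```python
import math

# Values taken from the trainer's startup log; recomputed here for illustration
iterations_per_epoch = 69642   # iterable-dataset iterations per epoch
input_batch_size = 4           # 2 per GPU * 2 GPUs
num_epochs = 16

steps_per_epoch = math.ceil(iterations_per_epoch / input_batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 17411 278576

# 1-based epoch shown for a given global step, as in "Epoch = 2/16"
global_step = 26000
epoch_shown = global_step // steps_per_epoch + 1
print(f"Epoch = {epoch_shown}/{num_epochs}")  # Epoch = 2/16
```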
2020-06-12 04:30:56,930 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8989
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8316272497177124
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Steps = 26400/278576
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - dev_loss = 0.605205 || dev_eval_scores = {'perplexity': 1.8316272497177124}
2020-06-12 04:31:01,240 - crisis_transformers.trainer - INFO - train_loss = 0.802518904209137
2020-06-12 04:31:01,240 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 04:38:12,636 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=9389
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8143984079360962
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Steps = 26800/278576
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - dev_loss = 0.595754 || dev_eval_scores = {'perplexity': 1.8143984079360962}
2020-06-12 04:38:16,326 - crisis_transformers.trainer - INFO - train_loss = 0.7970281839370728
2020-06-12 04:38:16,326 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 04:45:27,487 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=9789
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7961244583129883
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Steps = 27200/278576
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - dev_loss = 0.585631 || dev_eval_scores = {'perplexity': 1.7961244583129883}
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - train_loss = 0.7928464412689209
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 04:52:42,555 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10189
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7843670845031738
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - Steps = 27600/278576
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - dev_loss = 0.579064 || dev_eval_scores = {'perplexity': 1.7843670845031738}
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - train_loss = 0.7886567711830139
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 04:59:57,976 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10589
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.764768123626709
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Steps = 28000/278576
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - dev_loss = 0.568019 || dev_eval_scores = {'perplexity': 1.764768123626709}
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - train_loss = 0.7840659618377686
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 05:07:12,938 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10989
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7483080625534058
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Steps = 28400/278576
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO - dev_loss = 0.558648 || dev_eval_scores = {'perplexity': 1.7483080625534058}
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO - train_loss = 0.7798082828521729
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 05:14:27,901 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=11389
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7391985654830933
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Steps = 28800/278576
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - dev_loss = 0.553424 || dev_eval_scores = {'perplexity': 1.7391985654830933}
2020-06-12 05:14:32,043 - crisis_transformers.trainer - INFO - train_loss = 0.7753340601921082
2020-06-12 05:14:32,043 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 05:21:43,000 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=11789
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7198398113250732
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Steps = 29200/278576
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - dev_loss = 0.542231 || dev_eval_scores = {'perplexity': 1.7198398113250732}
2020-06-12 05:21:46,791 - crisis_transformers.trainer - INFO - train_loss = 0.7709231972694397
2020-06-12 05:21:46,791 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 05:28:58,330 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:29:02,113 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12189
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6953392028808594
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Steps = 29600/278576
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - dev_loss = 0.527883 || dev_eval_scores = {'perplexity': 1.6953392028808594}
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - train_loss = 0.7663020491600037
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 05:36:13,699 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12589
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.684487223625183
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Steps = 30000/278576
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - dev_loss = 0.521461 || dev_eval_scores = {'perplexity': 1.684487223625183}
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - train_loss = 0.7620025277137756
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 05:43:29,168 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12989
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6656123399734497
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - Steps = 30400/278576
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - dev_loss = 0.510193 || dev_eval_scores = {'perplexity': 1.6656123399734497}
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - train_loss = 0.7573661208152771
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 05:50:44,231 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=13389
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.651535987854004
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Steps = 30800/278576
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - dev_loss = 0.501706 || dev_eval_scores = {'perplexity': 1.651535987854004}
2020-06-12 05:50:48,565 - crisis_transformers.trainer - INFO - train_loss = 0.7526668906211853
2020-06-12 05:50:48,565 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 05:57:59,864 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:58:03,697 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=13789
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6361221075057983
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Steps = 31200/278576
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - dev_loss = 0.492329 || dev_eval_scores = {'perplexity': 1.6361221075057983}
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - train_loss = 0.7481159567832947
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 06:05:14,807 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:05:18,556 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14189
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6220632791519165
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Steps = 31600/278576
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - dev_loss = 0.483699 || dev_eval_scores = {'perplexity': 1.6220632791519165}
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - train_loss = 0.7438127994537354
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 06:12:30,611 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14589
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.610316276550293
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Steps = 32000/278576
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - dev_loss = 0.476431 || dev_eval_scores = {'perplexity': 1.610316276550293}
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - train_loss = 0.7390771508216858
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 06:19:45,562 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14989
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5992158651351929
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Steps = 32400/278576
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - dev_loss = 0.469513 || dev_eval_scores = {'perplexity': 1.5992158651351929}
2020-06-12 06:19:49,381 - crisis_transformers.trainer - INFO - train_loss = 0.7345605492591858
2020-06-12 06:19:49,381 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 06:27:00,549 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=15389
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.581007480621338
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Steps = 32800/278576
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - dev_loss = 0.458062 || dev_eval_scores = {'perplexity': 1.581007480621338}
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - train_loss = 0.7301263809204102
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 06:34:16,032 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=15789
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.571539044380188
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Steps = 33200/278576
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - dev_loss = 0.452055 || dev_eval_scores = {'perplexity': 1.571539044380188}
2020-06-12 06:34:19,880 - crisis_transformers.trainer - INFO - train_loss = 0.7254016399383545
2020-06-12 06:34:19,880 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 06:41:30,980 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:41:34,818 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16189
2020-06-12 06:41:34,818 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5561637878417969
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Steps = 33600/278576
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - dev_loss = 0.442224 || dev_eval_scores = {'perplexity': 1.5561637878417969}
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - train_loss = 0.7213504314422607
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 06:48:46,476 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16589
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5447299480438232
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Steps = 34000/278576
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - dev_loss = 0.434849 || dev_eval_scores = {'perplexity': 1.5447299480438232}
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - train_loss = 0.7172350883483887
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 06:56:02,163 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16989
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5317991971969604
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Steps = 34400/278576
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - dev_loss = 0.426443 || dev_eval_scores = {'perplexity': 1.5317991971969604}
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - train_loss = 0.7126625180244446
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 07:03:17,948 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=17389
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5181857347488403
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Steps = 34800/278576
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO - dev_loss = 0.417516 || dev_eval_scores = {'perplexity': 1.5181857347488403}
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO - train_loss = 0.7084499001502991
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 07:03:34,765 - crisis_transformers.trainer - INFO - epoch 2 ends, 14 epochs left
2020-06-12 07:03:34,767 - crisis_transformers.trainer - INFO -
global_average_loss=1.457897663116455,global_steps=34822 on training set
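The step arithmetic in this run can be reproduced from the header figures: 69642 iterable-dataset iterations per epoch at an input batch size of 4 (2 per GPU × 2 GPUs) gives 17411 optimizer steps per epoch, and 16 epochs gives the 278576 total optimization steps. A sketch of that bookkeeping, assuming steps are the ceiling of iterations over batch size (an inference from the numbers, not the trainer's actual code):

```python
import math

per_gpu_batch = 2
n_gpu = 2
input_batch_size = per_gpu_batch * n_gpu                          # 4, as logged

iters_per_epoch = 69642                                           # iterable-dataset iterations
steps_per_epoch = math.ceil(iters_per_epoch / input_batch_size)   # 17411
num_epochs = 16
total_steps = steps_per_epoch * num_epochs                        # 278576

# Epoch 2 ends at global step 2 * 17411 = 34822, matching global_steps above.
print(input_batch_size, steps_per_epoch, total_steps, 2 * steps_per_epoch)
```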
2020-06-12 07:10:33,578 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=378
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5065593719482422
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Steps = 35200/278576
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - dev_loss = 0.409828 || dev_eval_scores = {'perplexity': 1.5065593719482422}
2020-06-12 07:10:37,320 - crisis_transformers.trainer - INFO - train_loss = 0.5077053904533386
2020-06-12 07:10:37,320 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 07:17:48,578 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:17:52,866 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=778
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4995406866073608
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Steps = 35600/278576
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - dev_loss = 0.405159 || dev_eval_scores = {'perplexity': 1.4995406866073608}
2020-06-12 07:17:52,868 - crisis_transformers.trainer - INFO - train_loss = 0.5084229111671448
2020-06-12 07:17:52,868 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 07:25:04,279 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1178
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4872630834579468
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Steps = 36000/278576
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - dev_loss = 0.396938 || dev_eval_scores = {'perplexity': 1.4872630834579468}
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - train_loss = 0.5008477568626404
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 07:32:19,554 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1578
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4724860191345215
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Steps = 36400/278576
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - dev_loss = 0.386952 || dev_eval_scores = {'perplexity': 1.4724860191345215}
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - train_loss = 0.49597886204719543
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 07:39:35,241 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1978
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4704307317733765
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Steps = 36800/278576
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - dev_loss = 0.385555 || dev_eval_scores = {'perplexity': 1.4704307317733765}
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - train_loss = 0.4925973117351532
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 07:46:50,212 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:46:54,091 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=2378
2020-06-12 07:46:54,091 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4530060291290283
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Steps = 37200/278576
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - dev_loss = 0.373635 || dev_eval_scores = {'perplexity': 1.4530060291290283}
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - train_loss = 0.4874938130378723
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 07:54:05,635 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=2778
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4396711587905884
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Steps = 37600/278576
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - dev_loss = 0.364415 || dev_eval_scores = {'perplexity': 1.4396711587905884}
2020-06-12 07:54:09,865 - crisis_transformers.trainer - INFO - train_loss = 0.4848006069660187
2020-06-12 07:54:09,865 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:01:21,053 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3178
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4313539266586304
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Steps = 38000/278576
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - dev_loss = 0.358621 || dev_eval_scores = {'perplexity': 1.4313539266586304}
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - train_loss = 0.4819473922252655
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:08:36,819 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3578
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4217482805252075
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Steps = 38400/278576
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - dev_loss = 0.351887 || dev_eval_scores = {'perplexity': 1.4217482805252075}
2020-06-12 08:08:40,630 - crisis_transformers.trainer - INFO - train_loss = 0.47801968455314636
2020-06-12 08:08:40,630 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:15:51,675 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3978
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4130357503890991
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Steps = 38800/278576
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - dev_loss = 0.345740 || dev_eval_scores = {'perplexity': 1.4130357503890991}
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - train_loss = 0.4750809073448181
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:23:07,471 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=4378
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4030141830444336
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Steps = 39200/278576
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - dev_loss = 0.338623 || dev_eval_scores = {'perplexity': 1.4030141830444336}
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - train_loss = 0.4713905453681946
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:30:22,512 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=4778
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.391721487045288
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Steps = 39600/278576
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - dev_loss = 0.330542 || dev_eval_scores = {'perplexity': 1.391721487045288}
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - train_loss = 0.4688350260257721
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:37:37,836 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5178
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3863202333450317
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Steps = 40000/278576
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - dev_loss = 0.326653 || dev_eval_scores = {'perplexity': 1.3863202333450317}
2020-06-12 08:37:41,642 - crisis_transformers.trainer - INFO - train_loss = 0.4653063714504242
2020-06-12 08:37:41,642 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:44:52,891 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:44:56,707 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5578
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3764472007751465
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Steps = 40400/278576
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - dev_loss = 0.319506 || dev_eval_scores = {'perplexity': 1.3764472007751465}
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - train_loss = 0.4616211950778961
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:52:07,568 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5978
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3701869249343872
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Steps = 40800/278576
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - dev_loss = 0.314947 || dev_eval_scores = {'perplexity': 1.3701869249343872}
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - train_loss = 0.4588777422904968
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 08:59:22,571 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=6378
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3665746450424194
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Steps = 41200/278576
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - dev_loss = 0.312307 || dev_eval_scores = {'perplexity': 1.3665746450424194}
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - train_loss = 0.4553696811199188
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 09:06:37,730 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=6778
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.35618257522583
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Steps = 41600/278576
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - dev_loss = 0.304674 || dev_eval_scores = {'perplexity': 1.35618257522583}
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - train_loss = 0.4522298574447632
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 09:13:52,744 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:13:56,458 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7178
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.347740888595581
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Steps = 42000/278576
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - dev_loss = 0.298430 || dev_eval_scores = {'perplexity': 1.347740888595581}
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - train_loss = 0.4496159553527832
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 09:21:07,688 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7578
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3386039733886719
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Steps = 42400/278576
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - dev_loss = 0.291627 || dev_eval_scores = {'perplexity': 1.3386039733886719}
2020-06-12 09:21:11,510 - crisis_transformers.trainer - INFO - train_loss = 0.44657158851623535
2020-06-12 09:21:11,510 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 09:28:22,370 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7978
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3325072526931763
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Steps = 42800/278576
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - dev_loss = 0.287062 || dev_eval_scores = {'perplexity': 1.3325072526931763}
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - train_loss = 0.4432190954685211
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 09:35:37,349 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=8378
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3274871110916138
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Steps = 43200/278576
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - dev_loss = 0.283288 || dev_eval_scores = {'perplexity': 1.3274871110916138}
2020-06-12 09:35:41,211 - crisis_transformers.trainer - INFO - train_loss = 0.4402396082878113
2020-06-12 09:35:41,211 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 09:42:52,444 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=8778
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.315727710723877
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Steps = 43600/278576
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - dev_loss = 0.274390 || dev_eval_scores = {'perplexity': 1.315727710723877}
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - train_loss = 0.43725401163101196
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 09:50:07,902 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9178
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3152892589569092
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Steps = 44000/278576
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - dev_loss = 0.274057 || dev_eval_scores = {'perplexity': 1.3152892589569092}
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - train_loss = 0.43421682715415955
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 09:57:23,789 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9578
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.306022047996521
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Steps = 44400/278576
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - dev_loss = 0.266986 || dev_eval_scores = {'perplexity': 1.306022047996521}
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - train_loss = 0.43124547600746155
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 10:04:38,908 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:04:42,860 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9978
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3020490407943726
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Steps = 44800/278576
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - dev_loss = 0.263939 || dev_eval_scores = {'perplexity': 1.3020490407943726}
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - train_loss = 0.42827215790748596
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 10:11:53,676 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:11:57,556 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=10378
2020-06-12 10:11:57,556 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:11:57,556 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:11:57,556 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:11:57,556 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2914035320281982
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Steps = 45200/278576
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - dev_loss = 0.255730 || dev_eval_scores = {'perplexity': 1.2914035320281982}
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - train_loss = 0.4252185821533203
2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 10:19:08,790 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=10778
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2862516641616821
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - Steps = 45600/278576
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - dev_loss = 0.251732 || dev_eval_scores = {'perplexity': 1.2862516641616821}
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - train_loss = 0.4227924942970276
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 10:26:22,943 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:26:27,198 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=11178
2020-06-12 10:26:27,198 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:26:27,198 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.282753825187683
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Steps = 46000/278576
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - dev_loss = 0.249009 || dev_eval_scores = {'perplexity': 1.282753825187683}
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO - train_loss = 0.41981765627861023
2020-06-12 10:26:27,199 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 10:33:37,608 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:33:41,331 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=11578
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2747212648391724
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Steps = 46400/278576
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - dev_loss = 0.242728 || dev_eval_scores = {'perplexity': 1.2747212648391724}
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO - train_loss = 0.41709834337234497
2020-06-12 10:33:41,332 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 10:40:52,557 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=11978
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.270970106124878
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Steps = 46800/278576
2020-06-12 10:40:56,240 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:40:56,241 - crisis_transformers.trainer - INFO - dev_loss = 0.239781 || dev_eval_scores = {'perplexity': 1.270970106124878}
2020-06-12 10:40:56,241 - crisis_transformers.trainer - INFO - train_loss = 0.41438814997673035
2020-06-12 10:40:56,241 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 10:48:07,184 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=12378
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2681171894073486
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Steps = 47200/278576
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:48:11,365 - crisis_transformers.trainer - INFO - dev_loss = 0.237533 || dev_eval_scores = {'perplexity': 1.2681171894073486}
2020-06-12 10:48:11,366 - crisis_transformers.trainer - INFO - train_loss = 0.4116062819957733
2020-06-12 10:48:11,366 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 10:55:21,839 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:55:21,839 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2681171894073486
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 10s
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Steps = 47600/278576
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - dev_loss = 0.237603 || dev_eval_scores = {'perplexity': 1.268206000328064}
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO - train_loss = 0.4090476334095001
2020-06-12 10:55:21,840 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 11:02:33,440 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=13178
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2570724487304688
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Steps = 48000/278576
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - dev_loss = 0.228786 || dev_eval_scores = {'perplexity': 1.2570724487304688}
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO - train_loss = 0.40632161498069763
2020-06-12 11:02:37,076 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 11:09:48,383 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:09:51,983 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=13578
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.252106785774231
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Steps = 48400/278576
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - dev_loss = 0.224828 || dev_eval_scores = {'perplexity': 1.252106785774231}
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO - train_loss = 0.40354153513908386
2020-06-12 11:09:51,984 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 11:17:03,692 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=13978
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2468602657318115
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 11:17:07,245 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 11:17:07,246 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 11:17:07,246 - crisis_transformers.trainer - INFO - Steps = 48800/278576
2020-06-12 11:17:07,246 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 11:17:07,246 - crisis_transformers.trainer - INFO - dev_loss = 0.220629 || dev_eval_scores = {'perplexity': 1.2468602657318115}
2020-06-12 11:17:07,246 - crisis_transformers.trainer - INFO - train_loss = 0.40087953209877014
2020-06-12 11:17:07,246 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 11:24:17,752 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:24:21,750 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=14378
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.241665244102478
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Steps = 49200/278576
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - dev_loss = 0.216453 || dev_eval_scores = {'perplexity': 1.241665244102478}
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO - train_loss = 0.3984794318675995
2020-06-12 11:24:21,751 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 11:31:32,613 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:31:36,096 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=14778
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2373579740524292
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Steps = 49600/278576
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - dev_loss = 0.212978 || dev_eval_scores = {'perplexity': 1.2373579740524292}
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO - train_loss = 0.3960722088813782
2020-06-12 11:31:36,097 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 11:38:48,257 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=15178
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2339407205581665
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Steps = 50000/278576
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - dev_loss = 0.210213 || dev_eval_scores = {'perplexity': 1.2339407205581665}
2020-06-12 11:38:51,723 - crisis_transformers.trainer - INFO - train_loss = 0.39352700114250183
2020-06-12 11:38:51,724 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 11:46:03,817 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:46:07,113 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=15578
2020-06-12 11:46:07,113 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 11:46:07,113 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:46:07,113 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 11:46:07,113 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 11:46:07,113 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 11:46:07,113 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2312901020050049
2020-06-12 11:46:07,113 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO - Steps = 50400/278576
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO - dev_loss = 0.208062 || dev_eval_scores = {'perplexity': 1.2312901020050049}
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO - train_loss = 0.39096903800964355
2020-06-12 11:46:07,114 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 11:53:18,503 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:53:21,880 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=15978
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2268590927124023
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Steps = 50800/278576
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 11:53:21,881 - crisis_transformers.trainer - INFO - dev_loss = 0.204457 || dev_eval_scores = {'perplexity': 1.2268590927124023}
2020-06-12 11:53:21,882 - crisis_transformers.trainer - INFO - train_loss = 0.3884601294994354
2020-06-12 11:53:21,882 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 12:00:33,247 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:00:36,497 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=16378
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.224045991897583
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Steps = 51200/278576
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - dev_loss = 0.202162 || dev_eval_scores = {'perplexity': 1.224045991897583}
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO - train_loss = 0.38594967126846313
2020-06-12 12:00:36,498 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 12:07:47,479 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:07:51,260 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=16778
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.21811842918396
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Steps = 51600/278576
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - dev_loss = 0.197307 || dev_eval_scores = {'perplexity': 1.21811842918396}
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO - train_loss = 0.3834236264228821
2020-06-12 12:07:51,261 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 12:15:01,910 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:15:05,101 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=17178
2020-06-12 12:15:05,103 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:15:05,103 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:15:05,103 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:15:05,103 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 12:15:05,103 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2158831357955933
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 13s
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - Steps = 52000/278576
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - dev_loss = 0.195471 || dev_eval_scores = {'perplexity': 1.2158831357955933}
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO - train_loss = 0.3809737265110016
2020-06-12 12:15:05,104 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 12:17:19,046 - crisis_transformers.trainer - INFO - epoch 3 ends, 13 epochs left
2020-06-12 12:17:19,049 - crisis_transformers.trainer - INFO - global_average_loss=1.0984210968017578, global_steps=52233 on training set
2020-06-12 12:22:16,490 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=167
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2130458354949951
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Steps = 52400/278576
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:22:19,854 - crisis_transformers.trainer - INFO - dev_loss = 0.193134 || dev_eval_scores = {'perplexity': 1.2130458354949951}
2020-06-12 12:22:19,855 - crisis_transformers.trainer - INFO - train_loss = 0.25673729181289673
2020-06-12 12:22:19,855 - crisis_transformers.trainer - INFO -
********************************************
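The step bookkeeping in the log also checks out: with 69642 training iterations per epoch and an effective input batch of 2 per GPU × 2 GPUs = 4, one epoch is ceil(69642 / 4) = 17411 optimizer steps, and 16 epochs give the logged total of 278576. A sketch of that arithmetic, assuming each optimizer step consumes exactly one input batch (gradient accumulation is 1 here):

```python
import math

num_train_iterations = 69642  # "Num of training examples" per epoch, as logged
per_gpu_batch = 2
n_gpu = 2
num_epochs = 16

# Effective input batch, as stated in the log line
# "Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4"
input_batch = per_gpu_batch * n_gpu

# With gradient accumulation = 1, one optimizer step per batch.
steps_per_epoch = math.ceil(num_train_iterations / input_batch)
total_steps = steps_per_epoch * num_epochs

assert input_batch == 4
assert steps_per_epoch == 17411       # matches "Steps per Epoch = 17411"
assert total_steps == 278576          # matches "Steps = .../278576"
```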
2020-06-12 12:29:30,907 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=567
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.210909128189087
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:29:34,508 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 12:29:34,509 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 12:29:34,509 - crisis_transformers.trainer - INFO - Steps = 52800/278576
2020-06-12 12:29:34,509 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:29:34,509 - crisis_transformers.trainer - INFO - dev_loss = 0.191371 || dev_eval_scores = {'perplexity': 1.210909128189087}
2020-06-12 12:29:34,509 - crisis_transformers.trainer - INFO - train_loss = 0.25955724716186523
2020-06-12 12:29:34,509 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 12:36:45,199 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=967
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2072252035140991
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:36:48,918 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:36:48,919 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 12:36:48,919 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 12:36:48,919 - crisis_transformers.trainer - INFO - Steps = 53200/278576
2020-06-12 12:36:48,919 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:36:48,919 - crisis_transformers.trainer - INFO - dev_loss = 0.188325 || dev_eval_scores = {'perplexity': 1.2072252035140991}
2020-06-12 12:36:48,919 - crisis_transformers.trainer - INFO - train_loss = 0.2595442533493042
2020-06-12 12:36:48,920 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 12:43:59,962 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:44:03,265 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=1367
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2031614780426025
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:44:03,280 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:44:03,281 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 12:44:03,281 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 12:44:03,281 - crisis_transformers.trainer - INFO - Steps = 53600/278576
2020-06-12 12:44:03,281 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:44:03,282 - crisis_transformers.trainer - INFO - dev_loss = 0.184953 || dev_eval_scores = {'perplexity': 1.2031614780426025}
2020-06-12 12:44:03,297 - crisis_transformers.trainer - INFO - train_loss = 0.25946447253227234
2020-06-12 12:44:03,297 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 12:51:15,360 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:51:18,633 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=1767
2020-06-12 12:51:18,648 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:51:18,648 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:51:18,649 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:51:18,649 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 12:51:18,649 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:51:18,649 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2021421194076538
2020-06-12 12:51:18,649 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:51:18,649 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:51:18,649 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:51:18,650 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 12:51:18,650 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 12:51:18,650 - crisis_transformers.trainer - INFO - Steps = 54000/278576
2020-06-12 12:51:18,650 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:51:18,650 - crisis_transformers.trainer - INFO - dev_loss = 0.184105 || dev_eval_scores = {'perplexity': 1.2021421194076538}
2020-06-12 12:51:18,669 - crisis_transformers.trainer - INFO - train_loss = 0.2583455443382263
2020-06-12 12:51:18,669 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2021421194076538
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Steps = 54400/278576
2020-06-12 12:58:30,080 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 12:58:30,081 - crisis_transformers.trainer - INFO - dev_loss = 0.184208 || dev_eval_scores = {'perplexity': 1.2022662162780762}
2020-06-12 12:58:30,081 - crisis_transformers.trainer - INFO - train_loss = 0.25629064440727234
2020-06-12 12:58:30,081 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 13:05:42,132 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 13:05:42,133 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:05:42,133 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 13:05:42,133 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 13:05:42,133 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 13:05:42,133 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2021421194076538
2020-06-12 13:05:42,133 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 13:05:42,133 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 13:05:42,133 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 13:05:42,134 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 13:05:42,134 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 13:05:42,134 - crisis_transformers.trainer - INFO - Steps = 54800/278576
2020-06-12 13:05:42,134 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 13:05:42,134 - crisis_transformers.trainer - INFO - dev_loss = 0.187595 || dev_eval_scores = {'perplexity': 1.2063448429107666}
2020-06-12 13:05:42,134 - crisis_transformers.trainer - INFO - train_loss = 0.25469452142715454
2020-06-12 13:05:42,134 - crisis_transformers.trainer - INFO -
********************************************
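The "Best score (perplexity)" is logged as a negative number, which suggests the trainer negates perplexity so that higher is always better for its early-stopping comparison; the counter increments whenever an evaluation fails to beat the best score and resets to 0 on improvement, and the checkpoint is only saved on improvement (note the missing "Save model" lines on non-improving evaluations). A minimal sketch of that logic, assuming this maximize-the-negated-metric convention (the variable names are illustrative), replayed on the three perplexities from steps 54000, 54400, and 54800 above:

```python
# Perplexities from the evaluations at steps 54000, 54400, 54800
perplexities = [1.2021421194076538, 1.2022662162780762, 1.2063448429107666]

best = float("-inf")   # matches "Best score (perplexity) = -inf" at startup
counter = 0            # early-stop counter, capped at 20 per the log

for ppl in perplexities:
    score = -ppl       # negate so that lower perplexity = higher score
    if score > best:
        best = score   # improvement: update best, reset counter, save model
        counter = 0
    else:
        counter += 1   # no improvement: count toward early stop (= 20)

assert best == -1.2021421194076538   # matches the logged best score
assert counter == 2                  # matches "Early stop count = 2/20"
```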
2020-06-12 13:12:53,734 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 13:12:53,736 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:12:53,736 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 13:12:53,736 - crisis_transformers.trainer - INFO - Early stop count = 3/20
2020-06-12 13:12:53,736 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 13:12:53,736 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2021421194076538
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - Steps = 55200/278576
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - dev_loss = 0.184251 || dev_eval_scores = {'perplexity': 1.2023180723190308}
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO - train_loss = 0.25338509678840637
2020-06-12 13:12:53,737 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 13:20:04,638 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:20:07,839 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=3367
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.191463589668274
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 13:20:07,854 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 13:20:07,855 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 13:20:07,856 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 13:20:07,856 - crisis_transformers.trainer - INFO - Steps = 55600/278576
2020-06-12 13:20:07,856 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 13:20:07,856 - crisis_transformers.trainer - INFO - dev_loss = 0.175182 || dev_eval_scores = {'perplexity': 1.191463589668274}
2020-06-12 13:20:07,876 - crisis_transformers.trainer - INFO - train_loss = 0.2521595358848572
2020-06-12 13:20:07,876 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 13:27:19,652 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:27:22,821 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=3767
2020-06-12 13:27:22,836 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 13:27:22,836 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:27:22,836 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 13:27:22,836 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 13:27:22,836 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 13:27:22,836 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1891900300979614
2020-06-12 13:27:22,836 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 13:27:22,836 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 13:27:22,837 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 13:27:22,837 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 13:27:22,837 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 13:27:22,837 - crisis_transformers.trainer - INFO - Steps = 56000/278576
2020-06-12 13:27:22,837 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 13:27:22,837 - crisis_transformers.trainer - INFO - dev_loss = 0.173272 || dev_eval_scores = {'perplexity': 1.1891900300979614}
2020-06-12 13:27:22,856 - crisis_transformers.trainer - INFO - train_loss = 0.251168429851532
2020-06-12 13:27:22,856 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1891900300979614
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Steps = 56400/278576
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - dev_loss = 0.174895 || dev_eval_scores = {'perplexity': 1.1911215782165527}
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO - train_loss = 0.2495262175798416
2020-06-12 13:34:34,678 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 13:41:45,895 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:41:49,254 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=4567
2020-06-12 13:41:49,270 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 13:41:49,270 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:41:49,270 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 13:41:49,270 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 13:41:49,270 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 13:41:49,270 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1830016374588013
2020-06-12 13:41:49,270 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 13:41:49,271 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 13:41:49,271 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 13:41:49,272 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 13:41:49,272 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 13:41:49,272 - crisis_transformers.trainer - INFO - Steps = 56800/278576
2020-06-12 13:41:49,272 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 13:41:49,272 - crisis_transformers.trainer - INFO - dev_loss = 0.168055 || dev_eval_scores = {'perplexity': 1.1830016374588013}
2020-06-12 13:41:49,290 - crisis_transformers.trainer - INFO - train_loss = 0.24820448458194733
2020-06-12 13:41:49,290 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 13:49:01,184 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:49:04,330 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=4967
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1823091506958008
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 13:49:04,346 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 13:49:04,348 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 13:49:04,348 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 13:49:04,348 - crisis_transformers.trainer - INFO - Steps = 57200/278576
2020-06-12 13:49:04,348 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 13:49:04,348 - crisis_transformers.trainer - INFO - dev_loss = 0.167469 || dev_eval_scores = {'perplexity': 1.1823091506958008}
2020-06-12 13:49:04,366 - crisis_transformers.trainer - INFO - train_loss = 0.24676309525966644
2020-06-12 13:49:04,366 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1823091506958008
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Steps = 57600/278576
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - dev_loss = 0.168122 || dev_eval_scores = {'perplexity': 1.1830805540084839}
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO - train_loss = 0.2456163614988327
2020-06-12 13:56:16,108 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 14:03:27,611 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1823091506958008
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 14:03:27,612 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 14:03:27,613 - crisis_transformers.trainer - INFO - Steps = 58000/278576
2020-06-12 14:03:27,613 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 14:03:27,613 - crisis_transformers.trainer - INFO - dev_loss = 0.168360 || dev_eval_scores = {'perplexity': 1.1833629608154297}
2020-06-12 14:03:27,613 - crisis_transformers.trainer - INFO - train_loss = 0.24447211623191833
2020-06-12 14:03:27,613 - crisis_transformers.trainer - INFO -
********************************************
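The `dev_loss` and `perplexity` values in each evaluation block are related by exponentiation: perplexity is exp of the mean per-token cross-entropy loss, and the negative sign on "Best score (perplexity)" indicates the trainer tracks negated perplexity so that higher is better. A minimal sketch of that relationship, using values from the block above (the negation convention is inferred from the log, not confirmed against the trainer's source):

```python
import math

def perplexity(dev_loss: float) -> float:
    # Perplexity is the exponential of the mean cross-entropy loss per token.
    return math.exp(dev_loss)

# From the evaluation report at step 58000:
ppl = perplexity(0.168360)   # ~1.18336, matching the logged dev_eval_scores
best_score = -ppl            # the trainer maximizes, so it stores -perplexity
```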
2020-06-12 14:10:39,369 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:10:42,563 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=6167
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1757748126983643
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 14:10:42,580 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 14:10:42,581 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 14:10:42,581 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 14:10:42,581 - crisis_transformers.trainer - INFO - Steps = 58400/278576
2020-06-12 14:10:42,581 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 14:10:42,581 - crisis_transformers.trainer - INFO - dev_loss = 0.161927 || dev_eval_scores = {'perplexity': 1.1757748126983643}
2020-06-12 14:10:42,599 - crisis_transformers.trainer - INFO - train_loss = 0.24326415359973907
2020-06-12 14:10:42,599 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 14:17:54,473 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:17:57,675 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=6567
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1724531650543213
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 14:17:57,691 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Steps = 58800/278576
2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - dev_loss = 0.159098 || dev_eval_scores = {'perplexity': 1.1724531650543213}
2020-06-12 14:17:57,710 - crisis_transformers.trainer - INFO - train_loss = 0.24194355309009552
2020-06-12 14:17:57,710 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 14:25:09,648 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:25:13,392 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=6967
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1723699569702148
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Steps = 59200/278576
2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - dev_loss = 0.159027 || dev_eval_scores = {'perplexity': 1.1723699569702148}
2020-06-12 14:25:13,426 - crisis_transformers.trainer - INFO - train_loss = 0.24068011343479156
2020-06-12 14:25:13,426 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 14:32:25,725 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:32:28,988 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=7367
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1677402257919312
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Steps = 59600/278576
2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - dev_loss = 0.155071 || dev_eval_scores = {'perplexity': 1.1677402257919312}
2020-06-12 14:32:29,024 - crisis_transformers.trainer - INFO - train_loss = 0.23944209516048431
2020-06-12 14:32:29,024 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1677402257919312
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Steps = 60000/278576
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - dev_loss = 0.160010 || dev_eval_scores = {'perplexity': 1.1735223531723022}
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - train_loss = 0.23837868869304657
2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 14:46:52,988 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:46:55,997 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8167
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1659613847732544
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 14:46:56,014 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Steps = 60400/278576
2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - dev_loss = 0.153546 || dev_eval_scores = {'perplexity': 1.1659613847732544}
2020-06-12 14:46:56,033 - crisis_transformers.trainer - INFO - train_loss = 0.23715026676654816
2020-06-12 14:46:56,033 - crisis_transformers.trainer - INFO -
********************************************
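The "Early stop count" lines behave like a patience counter: it resets to 0 (and a checkpoint is saved) whenever the monitored score improves on the best so far, increments otherwise, and training would halt once it reaches the limit of 20. A hedged sketch of that logic, replaying the scores from the three blocks above; this is an illustration of the pattern visible in the log, not the trainer's actual code:

```python
class EarlyStopper:
    def __init__(self, patience: int = 20):
        self.patience = patience
        self.best = float("-inf")   # matches "Best score (perplexity) = -inf" at startup
        self.count = 0

    def update(self, score: float) -> bool:
        """Record one evaluation; return True when training should stop."""
        if score > self.best:
            self.best = score
            self.count = 0          # improvement: counter resets
        else:
            self.count += 1         # no improvement: counter advances toward patience
        return self.count >= self.patience

stopper = EarlyStopper(patience=20)
stopper.update(-1.1677402257919312)  # step 59600: new best, count = 0
stopper.update(-1.1735223531723022)  # step 60000: worse, count = 1
stopper.update(-1.1659613847732544)  # step 60400: new best, count = 0 again
```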
2020-06-12 14:54:07,918 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:54:10,976 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8567
2020-06-12 14:54:10,989 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.165794014930725
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Steps = 60800/278576
2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 14:54:10,991 - crisis_transformers.trainer - INFO - dev_loss = 0.153402 || dev_eval_scores = {'perplexity': 1.165794014930725}
2020-06-12 14:54:11,007 - crisis_transformers.trainer - INFO - train_loss = 0.23593905568122864
2020-06-12 14:54:11,007 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:01:22,795 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:01:25,973 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8967
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.161121129989624
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Steps = 61200/278576
2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - dev_loss = 0.149386 || dev_eval_scores = {'perplexity': 1.161121129989624}
2020-06-12 15:01:26,009 - crisis_transformers.trainer - INFO - train_loss = 0.2347162663936615
2020-06-12 15:01:26,009 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:08:37,941 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:08:41,095 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=9367
2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.159262776374817
2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Steps = 61600/278576
2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - dev_loss = 0.147784 || dev_eval_scores = {'perplexity': 1.159262776374817}
2020-06-12 15:08:41,131 - crisis_transformers.trainer - INFO - train_loss = 0.2334117591381073
2020-06-12 15:08:41,131 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:15:53,144 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.159262776374817
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Steps = 62000/278576
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - dev_loss = 0.149117 || dev_eval_scores = {'perplexity': 1.1608085632324219}
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - train_loss = 0.23231545090675354
2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:23:04,214 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:23:07,484 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=10167
2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1586897373199463
2020-06-12 15:23:07,501 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:23:07,501 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:23:07,501 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Steps = 62400/278576
2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - dev_loss = 0.147290 || dev_eval_scores = {'perplexity': 1.1586897373199463}
2020-06-12 15:23:07,520 - crisis_transformers.trainer - INFO - train_loss = 0.23095978796482086
2020-06-12 15:23:07,520 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:30:19,155 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:30:22,774 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=10567
2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.155853509902954
2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Steps = 62800/278576
2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - dev_loss = 0.144839 || dev_eval_scores = {'perplexity': 1.155853509902954}
2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - train_loss = 0.22986294329166412
2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:37:34,325 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:37:34,325 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.155853509902954
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Steps = 63200/278576
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - dev_loss = 0.146531 || dev_eval_scores = {'perplexity': 1.1578106880187988}
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - train_loss = 0.2286159247159958
2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:44:45,801 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:44:48,841 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=11367
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1531232595443726
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Steps = 63600/278576
2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - dev_loss = 0.142474 || dev_eval_scores = {'perplexity': 1.1531232595443726}
2020-06-12 15:44:48,865 - crisis_transformers.trainer - INFO - train_loss = 0.22760188579559326
2020-06-12 15:44:48,865 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:52:00,452 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:52:04,180 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=11767
2020-06-12 15:52:04,195 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1524842977523804
2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Steps = 64000/278576
2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - dev_loss = 0.141920 || dev_eval_scores = {'perplexity': 1.1524842977523804}
2020-06-12 15:52:04,203 - crisis_transformers.trainer - INFO - train_loss = 0.22645704448223114
2020-06-12 15:52:04,203 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 15:59:15,356 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:59:18,493 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12167
2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1517306566238403
2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Steps = 64400/278576
2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - dev_loss = 0.141266 || dev_eval_scores = {'perplexity': 1.1517306566238403}
2020-06-12 15:59:18,511 - crisis_transformers.trainer - INFO - train_loss = 0.22518599033355713
2020-06-12 15:59:18,511 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 16:06:30,098 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:06:33,674 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12567
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1506377458572388
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Steps = 64800/278576
2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - dev_loss = 0.140316 || dev_eval_scores = {'perplexity': 1.1506377458572388}
2020-06-12 16:06:33,705 - crisis_transformers.trainer - INFO - train_loss = 0.2241048812866211
2020-06-12 16:06:33,705 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 16:13:44,939 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:13:48,305 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12967
2020-06-12 16:13:48,319 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 16:13:48,319 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1486502885818481
2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Steps = 65200/278576
2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - dev_loss = 0.138588 || dev_eval_scores = {'perplexity': 1.1486502885818481}
2020-06-12 16:13:48,348 - crisis_transformers.trainer - INFO - train_loss = 0.22296540439128876
2020-06-12 16:13:48,348 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 16:21:01,926 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:21:05,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=13367
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1468722820281982
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 17s
2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Steps = 65600/278576
2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - dev_loss = 0.137039 || dev_eval_scores = {'perplexity': 1.1468722820281982}
2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - train_loss = 0.2220388650894165
2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 16:28:16,835 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:28:20,007 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=13767
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1454410552978516
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Steps = 66000/278576
2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - dev_loss = 0.135790 || dev_eval_scores = {'perplexity': 1.1454410552978516}
2020-06-12 16:28:20,010 - crisis_transformers.trainer - INFO - train_loss = 0.22106683254241943
2020-06-12 16:28:20,010 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 16:35:31,636 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:35:35,306 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=14167
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1435331106185913
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Steps = 66400/278576
2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - dev_loss = 0.134123 || dev_eval_scores = {'perplexity': 1.1435331106185913}
2020-06-12 16:35:35,324 - crisis_transformers.trainer - INFO - train_loss = 0.2200937420129776
2020-06-12 16:35:35,324 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1435331106185913
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Steps = 66800/278576
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - dev_loss = 0.139419 || dev_eval_scores = {'perplexity': 1.1496059894561768}
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - train_loss = 0.21907271444797516
2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO -
********************************************
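The "Early stop count" behaves like a standard patience counter: in the block above it ticks to 1/20 because the dev perplexity (1.1496) failed to beat the best score (1.1435), and it resets to 0 whenever a new best is reached (at which point a checkpoint is saved). A minimal sketch of that logic, assuming the trainer works as the log suggests (the class and names here are illustrative, not the trainer's actual internals):

```python
class EarlyStopper:
    """Patience-based early stopping on a score where higher is better.
    The trainer tracks negated perplexity, hence the initial best of -inf."""

    def __init__(self, patience=20):
        self.patience = patience
        self.best = float("-inf")
        self.count = 0

    def step(self, score):
        """Record one evaluation; return True if training should stop."""
        if score > self.best:
            self.best = score
            self.count = 0          # improvement: reset counter (checkpoint saved)
        else:
            self.count += 1         # no improvement: "Early stop count = count/patience"
        return self.count >= self.patience


stopper = EarlyStopper(patience=20)
# replay the negated perplexities from the three surrounding evaluations
for ppl in [1.1435331106185913, 1.1496059894561768, 1.1425509452819824]:
    stopper.step(-ppl)
# the counter goes 0 -> 1 -> 0, matching the log's sequence of reports
print(stopper.count, stopper.best)
```

Replaying the logged scores reproduces the 0 → 1 → 0 counter sequence seen in these three consecutive reports.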
2020-06-12 16:49:57,796 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:50:01,478 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=14967
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1425509452819824
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Steps = 67200/278576
2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - dev_loss = 0.133263 || dev_eval_scores = {'perplexity': 1.1425509452819824}
2020-06-12 16:50:01,500 - crisis_transformers.trainer - INFO - train_loss = 0.2180401235818863
2020-06-12 16:50:01,500 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 16:57:12,820 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:57:15,939 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=15367
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1421887874603271
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Steps = 67600/278576
2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - dev_loss = 0.132946 || dev_eval_scores = {'perplexity': 1.1421887874603271}
2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - train_loss = 0.2170739322900772
2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1421887874603271
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Steps = 68000/278576
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - dev_loss = 0.133038 || dev_eval_scores = {'perplexity': 1.1422935724258423}
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - train_loss = 0.21610437333583832
2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 17:11:38,816 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:11:41,945 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=16167
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Steps = 68400/278576
2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - dev_loss = 0.130158 || dev_eval_scores = {'perplexity': 1.1390084028244019}
2020-06-12 17:11:41,975 - crisis_transformers.trainer - INFO - train_loss = 0.21520239114761353
2020-06-12 17:11:41,975 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 17:18:54,648 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:18:54,648 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019
2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Steps = 68800/278576
2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - dev_loss = 0.130209 || dev_eval_scores = {'perplexity': 1.1390659809112549}
2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - train_loss = 0.21429920196533203
2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 17:26:06,245 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Steps = 69200/278576
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - dev_loss = 0.131858 || dev_eval_scores = {'perplexity': 1.1409462690353394}
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - train_loss = 0.2133215069770813
2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 17:33:18,361 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:33:21,478 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=17367
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:33:21,483 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 17:33:21,483 - crisis_transformers.trainer - INFO - Epoch = 4/16
2020-06-12 17:33:21,483 - crisis_transformers.trainer - INFO - Steps = 69600/278576
2020-06-12 17:33:21,483 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:33:21,483 - crisis_transformers.trainer - INFO - dev_loss = 0.127342 || dev_eval_scores = {'perplexity': 1.135805368423462}
2020-06-12 17:33:21,485 - crisis_transformers.trainer - INFO - train_loss = 0.21238122880458832
2020-06-12 17:33:21,485 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 17:33:46,659 - crisis_transformers.trainer - INFO - epoch 4 ends, 12 epochs left
2020-06-12 17:33:46,661 - crisis_transformers.trainer - INFO -
global_average_loss=0.8768848776817322,global_steps=69644 on training set
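The step accounting at this epoch boundary is internally consistent: 17411 optimizer steps per epoch over 16 epochs gives the 278576 total optimization steps, four completed epochs give the 69644 global steps reported above, and the input batch size of 4 follows from 2 examples per GPU on 2 GPUs with no gradient accumulation. A quick arithmetic check with the numbers copied from the log:

```python
steps_per_epoch = 17411
num_epochs = 16
per_gpu_batch = 2
n_gpu = 2
grad_accum = 1

# Total optimization steps = steps/epoch * epochs
assert steps_per_epoch * num_epochs == 278576
# Four completed epochs match the reported global step count
assert steps_per_epoch * 4 == 69644
# Effective input batch size per optimizer step
assert per_gpu_batch * n_gpu * grad_accum == 4
```

The `global_steps=69644` figure is two steps past 4 × 17411 = 69644 reported per-epoch boundaries only because it is read after the boundary; the per-report `Steps = N/278576` counters advance in clean multiples of the 400-step eval interval.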
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Steps = 70000/278576
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - dev_loss = 0.129323 || dev_eval_scores = {'perplexity': 1.138058066368103}
2020-06-12 17:40:33,523 - crisis_transformers.trainer - INFO - train_loss = 0.16184912621974945
2020-06-12 17:40:33,523 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Steps = 70400/278576
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - dev_loss = 0.127871 || dev_eval_scores = {'perplexity': 1.1364060640335083}
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - train_loss = 0.16288244724273682
2020-06-12 17:47:45,700 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 17:54:57,738 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:55:00,860 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1156
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1350326538085938
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Steps = 70800/278576
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - dev_loss = 0.126661 || dev_eval_scores = {'perplexity': 1.1350326538085938}
2020-06-12 17:55:00,894 - crisis_transformers.trainer - INFO - train_loss = 0.16178201138973236
2020-06-12 17:55:00,894 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 18:02:13,264 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:02:16,346 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1556
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1342060565948486
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - Steps = 71200/278576
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - dev_loss = 0.125933 || dev_eval_scores = {'perplexity': 1.1342060565948486}
2020-06-12 18:02:16,378 - crisis_transformers.trainer - INFO - train_loss = 0.16210666298866272
2020-06-12 18:02:16,378 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 18:09:28,443 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:09:31,521 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1956
2020-06-12 18:09:31,536 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:09:31,536 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.133441686630249
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Steps = 71600/278576
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:09:31,539 - crisis_transformers.trainer - INFO - dev_loss = 0.125259 || dev_eval_scores = {'perplexity': 1.133441686630249}
2020-06-12 18:09:31,556 - crisis_transformers.trainer - INFO - train_loss = 0.16150160133838654
2020-06-12 18:09:31,556 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 18:16:44,613 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:16:47,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=2356
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1324461698532104
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Steps = 72000/278576
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - dev_loss = 0.124380 || dev_eval_scores = {'perplexity': 1.1324461698532104}
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - train_loss = 0.160808727145195
2020-06-12 18:16:47,598 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 18:23:59,557 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:24:02,688 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=2756
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1319714784622192
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Steps = 72400/278576
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - dev_loss = 0.123961 || dev_eval_scores = {'perplexity': 1.1319714784622192}
2020-06-12 18:24:02,723 - crisis_transformers.trainer - INFO - train_loss = 0.16037367284297943
2020-06-12 18:24:02,723 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 18:31:14,824 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:31:17,933 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=3156
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.131042718887329
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Steps = 72800/278576
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:31:17,951 - crisis_transformers.trainer - INFO - dev_loss = 0.123140 || dev_eval_scores = {'perplexity': 1.131042718887329}
2020-06-12 18:31:17,969 - crisis_transformers.trainer - INFO - train_loss = 0.16030311584472656
2020-06-12 18:31:17,970 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 18:38:30,048 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:38:33,201 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=3556
2020-06-12 18:38:33,201 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - Steps = 73200/278576
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - dev_loss = 0.122653 || dev_eval_scores = {'perplexity': 1.130492091178894}
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - train_loss = 0.16008983552455902
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Steps = 73600/278576
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - dev_loss = 0.126517 || dev_eval_scores = {'perplexity': 1.1348693370819092}
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - train_loss = 0.15986071527004242
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Steps = 74000/278576
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - dev_loss = 0.123426 || dev_eval_scores = {'perplexity': 1.1313656568527222}
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - train_loss = 0.15936923027038574
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:00:08,504 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:00:11,867 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=4756
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1283477544784546
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Steps = 74400/278576
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - dev_loss = 0.120754 || dev_eval_scores = {'perplexity': 1.1283477544784546}
2020-06-12 19:00:11,898 - crisis_transformers.trainer - INFO - train_loss = 0.15900550782680511
2020-06-12 19:00:11,898 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1283477544784546
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Steps = 74800/278576
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - dev_loss = 0.122012 || dev_eval_scores = {'perplexity': 1.1297677755355835}
2020-06-12 19:07:24,082 - crisis_transformers.trainer - INFO - train_loss = 0.15857523679733276
2020-06-12 19:07:24,082 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:14:35,509 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:14:38,680 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=5556
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1280550956726074
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Steps = 75200/278576
2020-06-12 19:14:38,692 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:14:38,692 - crisis_transformers.trainer - INFO - dev_loss = 0.120495 || dev_eval_scores = {'perplexity': 1.1280550956726074}
2020-06-12 19:14:38,709 - crisis_transformers.trainer - INFO - train_loss = 0.15801657736301422
2020-06-12 19:14:38,709 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:21:51,621 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:21:51,621 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1280550956726074
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Steps = 75600/278576
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - dev_loss = 0.120551 || dev_eval_scores = {'perplexity': 1.1281187534332275}
2020-06-12 19:21:51,623 - crisis_transformers.trainer - INFO - train_loss = 0.15759296715259552
2020-06-12 19:21:51,623 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:29:02,897 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:29:06,248 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=6356
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.126711368560791
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Steps = 76000/278576
2020-06-12 19:29:06,259 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:29:06,259 - crisis_transformers.trainer - INFO - dev_loss = 0.119303 || dev_eval_scores = {'perplexity': 1.126711368560791}
2020-06-12 19:29:06,271 - crisis_transformers.trainer - INFO - train_loss = 0.15713545680046082
2020-06-12 19:29:06,271 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:36:18,058 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:36:21,263 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=6756
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1253490447998047
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Steps = 76400/278576
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - dev_loss = 0.118093 || dev_eval_scores = {'perplexity': 1.1253490447998047}
2020-06-12 19:36:21,285 - crisis_transformers.trainer - INFO - train_loss = 0.15685245394706726
2020-06-12 19:36:21,285 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:43:33,123 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:43:36,558 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=7156
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1248881816864014
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Steps = 76800/278576
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - dev_loss = 0.117684 || dev_eval_scores = {'perplexity': 1.1248881816864014}
2020-06-12 19:43:36,592 - crisis_transformers.trainer - INFO - train_loss = 0.15634950995445251
2020-06-12 19:43:36,592 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1248881816864014
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Steps = 77200/278576
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - dev_loss = 0.117770 || dev_eval_scores = {'perplexity': 1.1249850988388062}
2020-06-12 19:50:48,592 - crisis_transformers.trainer - INFO - train_loss = 0.1558857262134552
2020-06-12 19:50:48,592 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 19:58:00,383 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:58:03,711 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=7956
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1242866516113281
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Steps = 77600/278576
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - dev_loss = 0.117149 || dev_eval_scores = {'perplexity': 1.1242866516113281}
2020-06-12 19:58:03,742 - crisis_transformers.trainer - INFO - train_loss = 0.15545228123664856
2020-06-12 19:58:03,742 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 20:05:15,952 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:05:19,295 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=8356
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1240224838256836
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Steps = 78000/278576
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - dev_loss = 0.116914 || dev_eval_scores = {'perplexity': 1.1240224838256836}
2020-06-12 20:05:19,322 - crisis_transformers.trainer - INFO - train_loss = 0.15514682233333588
2020-06-12 20:05:19,322 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 20:12:30,593 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:12:33,718 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=8756
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1232761144638062
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Steps = 78400/278576
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - dev_loss = 0.116250 || dev_eval_scores = {'perplexity': 1.1232761144638062}
2020-06-12 20:12:33,744 - crisis_transformers.trainer - INFO - train_loss = 0.15461046993732452
2020-06-12 20:12:33,744 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 20:19:45,605 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1232761144638062
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Steps = 78800/278576
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - dev_loss = 0.117339 || dev_eval_scores = {'perplexity': 1.12450110912323}
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - train_loss = 0.15414145588874817
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 20:26:57,592 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:27:00,603 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=9556
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1222569942474365
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Steps = 79200/278576
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - dev_loss = 0.115342 || dev_eval_scores = {'perplexity': 1.1222569942474365}
2020-06-12 20:27:00,635 - crisis_transformers.trainer - INFO - train_loss = 0.15376870334148407
2020-06-12 20:27:00,635 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 20:34:13,086 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=9956
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Steps = 79600/278576
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - dev_loss = 0.114636 || dev_eval_scores = {'perplexity': 1.1214654445648193}
2020-06-12 20:34:16,745 - crisis_transformers.trainer - INFO - train_loss = 0.1534395068883896
2020-06-12 20:34:16,745 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Steps = 80000/278576
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - dev_loss = 0.114919 || dev_eval_scores = {'perplexity': 1.1217821836471558}
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - train_loss = 0.15301528573036194
2020-06-12 20:41:28,721 - crisis_transformers.trainer - INFO -
********************************************
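The "Best score (perplexity)" lines report a negative number, and the "Early stop count" resets to 0 (with a checkpoint save) on improvement but increments otherwise. A hypothetical reconstruction of that bookkeeping, assuming the trainer negates perplexity so that every monitored metric can be compared as higher-is-better, with `patience=20` matching "Early stop = 20" (function and parameter names are illustrative, not from the trainer source):

```python
def update_early_stop(best_score: float, count: int, new_score: float,
                      patience: int = 20):
    # Scores are negated perplexities, so "improved" means strictly greater.
    if new_score > best_score:
        # Improvement: reset the counter (the log shows a model save here).
        return new_score, 0, False
    # No improvement: bump the counter; stop once patience is exhausted.
    count += 1
    return best_score, count, count >= patience

# Mirror the transition from step 79600 (best = -1.1214654...) to
# step 80000, where the new score -1.1217821... is worse and the log
# shows "Early stop count = 1/20":
best, count, stop = update_early_stop(-1.1214654445648193, 0,
                                      -1.1217821836471558)
print(count, stop)  # 1 False
```

Under this reading, the run above is nowhere near stopping: the counter never exceeds 3 before a new best perplexity resets it.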
2020-06-12 20:48:40,263 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Steps = 80400/278576
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - dev_loss = 0.116439 || dev_eval_scores = {'perplexity': 1.123489499092102}
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - train_loss = 0.1526806503534317
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 20:55:51,280 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:55:54,387 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=11156
2020-06-12 20:55:54,402 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1208057403564453
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Steps = 80800/278576
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - dev_loss = 0.114048 || dev_eval_scores = {'perplexity': 1.1208057403564453}
2020-06-12 20:55:54,422 - crisis_transformers.trainer - INFO - train_loss = 0.15231111645698547
2020-06-12 20:55:54,422 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1208057403564453
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - Steps = 81200/278576
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - dev_loss = 0.114643 || dev_eval_scores = {'perplexity': 1.1214734315872192}
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - train_loss = 0.15194369852542877
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 21:10:17,767 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:10:20,890 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=11956
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1198288202285767
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Steps = 81600/278576
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - dev_loss = 0.113176 || dev_eval_scores = {'perplexity': 1.1198288202285767}
2020-06-12 21:10:20,900 - crisis_transformers.trainer - INFO - train_loss = 0.15155275166034698
2020-06-12 21:10:20,900 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 21:17:33,024 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:17:36,142 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=12356
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Steps = 82000/278576
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - dev_loss = 0.112563 || dev_eval_scores = {'perplexity': 1.1191425323486328}
2020-06-12 21:17:36,151 - crisis_transformers.trainer - INFO - train_loss = 0.15120427310466766
2020-06-12 21:17:36,151 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Steps = 82400/278576
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - dev_loss = 0.113121 || dev_eval_scores = {'perplexity': 1.1197669506072998}
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - train_loss = 0.1508565992116928
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 21:32:00,891 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:32:00,893 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:32:00,893 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:32:00,893 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 21:32:00,893 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Steps = 82800/278576
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - dev_loss = 0.114633 || dev_eval_scores = {'perplexity': 1.1214618682861328}
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - train_loss = 0.15052178502082825
2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Early stop count = 3/20
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Steps = 83200/278576
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - dev_loss = 0.112780 || dev_eval_scores = {'perplexity': 1.1193852424621582}
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - train_loss = 0.15016387403011322
2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 21:46:23,357 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:46:26,472 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=13956
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1182266473770142
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Steps = 83600/278576
2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - dev_loss = 0.111744 || dev_eval_scores = {'perplexity': 1.1182266473770142}
2020-06-12 21:46:26,507 - crisis_transformers.trainer - INFO - train_loss = 0.14978548884391785
2020-06-12 21:46:26,507 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 21:53:38,893 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:53:42,051 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=14356
2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1176480054855347
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Steps = 84000/278576
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - dev_loss = 0.111226 || dev_eval_scores = {'perplexity': 1.1176480054855347}
2020-06-12 21:53:42,056 - crisis_transformers.trainer - INFO - train_loss = 0.1494358777999878
2020-06-12 21:53:42,056 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 22:00:52,990 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:00:56,106 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=14756
2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1175894737243652
2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Steps = 84400/278576
2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - dev_loss = 0.111174 || dev_eval_scores = {'perplexity': 1.1175894737243652}
2020-06-12 22:00:56,136 - crisis_transformers.trainer - INFO - train_loss = 0.14908182621002197
2020-06-12 22:00:56,136 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1175894737243652
2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 13s
2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Steps = 84800/278576
2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - dev_loss = 0.111978 || dev_eval_scores = {'perplexity': 1.1184886693954468}
2020-06-12 22:08:09,604 - crisis_transformers.trainer - INFO - train_loss = 0.1487855762243271
2020-06-12 22:08:09,604 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 22:15:29,618 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:15:33,237 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=15556
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1161783933639526
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 23s
2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Steps = 85200/278576
2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - dev_loss = 0.109911 || dev_eval_scores = {'perplexity': 1.1161783933639526}
2020-06-12 22:15:33,242 - crisis_transformers.trainer - INFO - train_loss = 0.14848528802394867
2020-06-12 22:15:33,242 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 22:22:49,305 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1161783933639526
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Steps = 85600/278576
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - dev_loss = 0.110636 || dev_eval_scores = {'perplexity': 1.1169886589050293}
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - train_loss = 0.14812599122524261
2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 22:30:01,038 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:30:04,724 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=16356
2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1152490377426147
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Steps = 86000/278576
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - dev_loss = 0.109078 || dev_eval_scores = {'perplexity': 1.1152490377426147}
2020-06-12 22:30:04,747 - crisis_transformers.trainer - INFO - train_loss = 0.14784550666809082
2020-06-12 22:30:04,747 - crisis_transformers.trainer - INFO -
********************************************
2020-06-12 22:37:16,018 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=16756
2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1139332056045532
2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Steps = 86400/278576
2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - dev_loss = 0.107897 || dev_eval_scores = {'perplexity': 1.1139332056045532}
2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - train_loss = 0.14752434194087982
2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO -
********************************************