Junyi42 committed on
Commit
9da4252
·
verified ·
1 Parent(s): 9f328a9

Upload checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins

checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/wandb/offline-run-20260125_170309-vlm_gym_colorization_one_img_lr2e_5_mse_only_ins-run0/files/output.log CHANGED
@@ -795,49 +795,6 @@ wandb: For more information, check out the docs at: https://weave-docs.wandb.ai/
795
  [2026-01-25 21:54:23] (step=0000784) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
796
  [2026-01-25 21:54:50] (step=0000785) Train Loss mse: 0.0068, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
797
  [2026-01-25 21:55:12] (step=0000786) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
798
- [2026-01-25 21:55:36] (step=0000787) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
799
- [2026-01-25 21:55:56] (step=0000788) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
800
- [2026-01-25 21:56:14] (step=0000789) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
801
- [2026-01-25 21:56:38] (step=0000790) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
802
- [2026-01-25 21:57:00] (step=0000791) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
803
- [2026-01-25 21:57:17] (step=0000792) Train Loss mse: 0.0088, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
804
- [2026-01-25 21:57:38] (step=0000793) Train Loss mse: 0.0084, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
805
- [2026-01-25 21:58:01] (step=0000794) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
806
- [2026-01-25 21:58:19] (step=0000795) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
807
- [2026-01-25 21:58:41] (step=0000796) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
808
- [2026-01-25 21:59:01] (step=0000797) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
809
- [2026-01-25 21:59:22] (step=0000798) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
810
- [2026-01-25 21:59:44] (step=0000799) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
811
- [2026-01-25 22:00:06] (step=0000800) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
812
- [2026-01-25 22:00:28] (step=0000801) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
813
- [2026-01-25 22:00:50] (step=0000802) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
814
- [2026-01-25 22:01:06] (step=0000803) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
815
- [2026-01-25 22:01:27] (step=0000804) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
816
- [2026-01-25 22:01:52] (step=0000805) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
817
- [2026-01-25 22:02:10] (step=0000806) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
818
- [2026-01-25 22:02:32] (step=0000807) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
819
- [2026-01-25 22:02:52] (step=0000808) Train Loss mse: 0.0092, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
820
- [2026-01-25 22:03:10] (step=0000809) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
821
- [2026-01-25 22:03:31] (step=0000810) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
822
- [2026-01-25 22:03:50] (step=0000811) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
823
- [2026-01-25 22:04:17] (step=0000812) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
824
- [2026-01-25 22:04:37] (step=0000813) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
825
- [2026-01-25 22:05:01] (step=0000814) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
826
- [2026-01-25 22:05:22] (step=0000815) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
827
- [2026-01-25 22:05:42] (step=0000816) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
828
- [2026-01-25 22:06:06] (step=0000817) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
829
- [2026-01-25 22:06:27] (step=0000818) Train Loss mse: 0.0091, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
830
- [2026-01-25 22:06:48] (step=0000819) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
831
- [2026-01-25 22:07:09] (step=0000820) Train Loss mse: 0.0086, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
832
- [2026-01-25 22:07:31] (step=0000821) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
833
- [2026-01-25 22:07:49] (step=0000822) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
834
- [2026-01-25 22:08:12] (step=0000823) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
835
- [2026-01-25 22:08:30] (step=0000824) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
836
- [2026-01-25 22:08:52] (step=0000825) Train Loss mse: 0.0099, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
837
- [2026-01-25 22:09:13] (step=0000826) Train Loss mse: 0.0085, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
838
- [2026-01-25 22:09:34] (step=0000827) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
839
- [2026-01-25 22:09:58] (step=0000828) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
840
- [2026-01-25 22:10:18] (step=0000829) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
841
  FullyShardedDataParallel(
842
  (_fsdp_wrapped_module): Bagel(
843
  (language_model): Qwen2ForCausalLM(
@@ -1024,20 +981,49 @@ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorizati
1024
  fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
1025
  fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
1026
  ce_avg: 0.0, mse_avg: 0.007997258566319942
1027
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step1000
1028
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
1029
- [eval debug] first 3 batch fingerprints:
1030
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
1031
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
1032
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
1033
- ce_avg: 0.0, mse_avg: 0.007652191445231438
1034
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step1500
1035
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
1036
- [eval debug] first 3 batch fingerprints:
1037
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
1038
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
1039
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
1040
- ce_avg: 0.0, mse_avg: 0.00800316222012043
1041
  [2026-01-25 22:10:41] (step=0000830) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
1042
  [2026-01-25 22:10:58] (step=0000831) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
1043
  [2026-01-25 22:11:18] (step=0000832) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
@@ -2053,6 +2039,20 @@ ce_avg: 0.0, mse_avg: 0.00800316222012043
2053
  [2026-01-26 04:13:26] (step=0001842) Train Loss mse: 0.0062, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2054
  [2026-01-26 04:13:48] (step=0001843) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
2055
  [2026-01-26 04:14:13] (step=0001844) Train Loss mse: 0.0065, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
2056
  [2026-01-26 04:14:36] (step=0001845) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
2057
  [2026-01-26 04:14:57] (step=0001846) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2058
  [2026-01-26 04:15:20] (step=0001847) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
@@ -2136,20 +2136,6 @@ ce_avg: 0.0, mse_avg: 0.00800316222012043
2136
  [2026-01-26 04:42:39] (step=0001925) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2137
  [2026-01-26 04:43:01] (step=0001926) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
2138
  [2026-01-26 04:43:23] (step=0001927) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2139
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step2000
2140
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
2141
- [eval debug] first 3 batch fingerprints:
2142
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2143
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2144
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2145
- ce_avg: 0.0, mse_avg: 0.0081106498837471
2146
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step2500
2147
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
2148
- [eval debug] first 3 batch fingerprints:
2149
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2150
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2151
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2152
- ce_avg: 0.0, mse_avg: 0.007652428932487965
2153
  [2026-01-26 04:43:45] (step=0001928) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2154
  [2026-01-26 04:44:06] (step=0001929) Train Loss mse: 0.0069, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2155
  [2026-01-26 04:44:28] (step=0001930) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
@@ -3082,6 +3068,20 @@ ce_avg: 0.0, mse_avg: 0.007652428932487965
3082
  [2026-01-26 10:17:47] (step=0002857) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3083
  [2026-01-26 10:18:07] (step=0002858) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3084
  [2026-01-26 10:18:28] (step=0002859) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3085
  [2026-01-26 10:18:49] (step=0002860) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3086
  [2026-01-26 10:19:12] (step=0002861) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3087
  [2026-01-26 10:19:31] (step=0002862) Train Loss mse: 0.0065, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
@@ -3152,27 +3152,6 @@ ce_avg: 0.0, mse_avg: 0.007652428932487965
3152
  [2026-01-26 10:42:06] (step=0002927) Train Loss mse: 0.0056, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3153
  [2026-01-26 10:42:29] (step=0002928) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3154
  [2026-01-26 10:42:50] (step=0002929) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3155
- [2026-01-26 10:43:13
3156
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step3000
3157
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3158
- [eval debug] first 3 batch fingerprints:
3159
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3160
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3161
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3162
- ce_avg: 0.0, mse_avg: 0.007834003306925297
3163
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step3500
3164
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3165
- [eval debug] first 3 batch fingerprints:
3166
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3167
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3168
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3169
- ce_avg: 0.0, mse_avg: 0.007766008842736483
3170
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step4000
3171
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3172
- [eval debug] first 3 batch fingerprints:
3173
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3174
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3175
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3176
  [2026-01-26 10:43:13] (step=0002930) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3177
  [2026-01-26 10:43:33] (step=0002931) Train Loss mse: 0.0062, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3178
  [2026-01-26 10:43:54] (step=0002932) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
@@ -3988,6 +3967,34 @@ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorizati
3988
  [2026-01-26 15:35:43] (step=0003742) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3989
  [2026-01-26 15:36:04] (step=0003743) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3990
  [2026-01-26 15:36:28] (step=0003744) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3991
  [2026-01-26 15:36:52] (step=0003745) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3992
  [2026-01-26 15:37:15] (step=0003746) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3993
  [2026-01-26 15:37:35] (step=0003747) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
@@ -4179,20 +4186,6 @@ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorizati
4179
  [2026-01-26 16:44:11] (step=0003933) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
4180
  [2026-01-26 16:44:29] (step=0003934) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
4181
  [2026-01-26 16:44:52] (step=0003935) Train Loss mse: 0.0063, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
4182
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step4500
4183
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
4184
- [eval debug] first 3 batch fingerprints:
4185
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
4186
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
4187
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
4188
- ce_avg: 0.0, mse_avg: 0.007897508330643177
4189
- base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step5000
4190
- Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
4191
- [eval debug] first 3 batch fingerprints:
4192
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
4193
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
4194
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
4195
- ce_avg: 0.0, mse_avg: 0.007832281291484833
4196
  [2026-01-26 16:45:14] (step=0003936) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
4197
  [2026-01-26 16:45:36] (step=0003937) Train Loss mse: 0.0056, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
4198
  [2026-01-26 16:45:55] (step=0003938) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
@@ -5098,6 +5091,13 @@ ce_avg: 0.0, mse_avg: 0.007832281291484833
5098
  [2026-01-26 22:10:09] (step=0004838) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
5099
  [2026-01-26 22:10:34] (step=0004839) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
5100
  [2026-01-26 22:10:59] (step=0004840) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
5101
  [2026-01-26 22:11:23] (step=0004841) Train Loss mse: 0.0063, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
5102
  [2026-01-26 22:11:42] (step=0004842) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
5103
  [2026-01-26 22:12:06] (step=0004843) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
 
795
  [2026-01-25 21:54:23] (step=0000784) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
796
  [2026-01-25 21:54:50] (step=0000785) Train Loss mse: 0.0068, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
797
  [2026-01-25 21:55:12] (step=0000786) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
798
  FullyShardedDataParallel(
799
  (_fsdp_wrapped_module): Bagel(
800
  (language_model): Qwen2ForCausalLM(
 
981
  fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
982
  fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
983
  ce_avg: 0.0, mse_avg: 0.007997258566319942
984
+ [2026-01-25 21:55:36] (step=0000787) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
985
+ [2026-01-25 21:55:56] (step=0000788) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
986
+ [2026-01-25 21:56:14] (step=0000789) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
987
+ [2026-01-25 21:56:38] (step=0000790) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
988
+ [2026-01-25 21:57:00] (step=0000791) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
989
+ [2026-01-25 21:57:17] (step=0000792) Train Loss mse: 0.0088, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
990
+ [2026-01-25 21:57:38] (step=0000793) Train Loss mse: 0.0084, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
991
+ [2026-01-25 21:58:01] (step=0000794) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
992
+ [2026-01-25 21:58:19] (step=0000795) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
993
+ [2026-01-25 21:58:41] (step=0000796) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
994
+ [2026-01-25 21:59:01] (step=0000797) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
995
+ [2026-01-25 21:59:22] (step=0000798) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
996
+ [2026-01-25 21:59:44] (step=0000799) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
997
+ [2026-01-25 22:00:06] (step=0000800) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
998
+ [2026-01-25 22:00:28] (step=0000801) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
999
+ [2026-01-25 22:00:50] (step=0000802) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1000
+ [2026-01-25 22:01:06] (step=0000803) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
1001
+ [2026-01-25 22:01:27] (step=0000804) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1002
+ [2026-01-25 22:01:52] (step=0000805) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
1003
+ [2026-01-25 22:02:10] (step=0000806) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1004
+ [2026-01-25 22:02:32] (step=0000807) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1005
+ [2026-01-25 22:02:52] (step=0000808) Train Loss mse: 0.0092, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1006
+ [2026-01-25 22:03:10] (step=0000809) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1007
+ [2026-01-25 22:03:31] (step=0000810) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1008
+ [2026-01-25 22:03:50] (step=0000811) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1009
+ [2026-01-25 22:04:17] (step=0000812) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
1010
+ [2026-01-25 22:04:37] (step=0000813) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1011
+ [2026-01-25 22:05:01] (step=0000814) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
1012
+ [2026-01-25 22:05:22] (step=0000815) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1013
+ [2026-01-25 22:05:42] (step=0000816) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1014
+ [2026-01-25 22:06:06] (step=0000817) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
1015
+ [2026-01-25 22:06:27] (step=0000818) Train Loss mse: 0.0091, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1016
+ [2026-01-25 22:06:48] (step=0000819) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1017
+ [2026-01-25 22:07:09] (step=0000820) Train Loss mse: 0.0086, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1018
+ [2026-01-25 22:07:31] (step=0000821) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1019
+ [2026-01-25 22:07:49] (step=0000822) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
1020
+ [2026-01-25 22:08:12] (step=0000823) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
1021
+ [2026-01-25 22:08:30] (step=0000824) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
1022
+ [2026-01-25 22:08:52] (step=0000825) Train Loss mse: 0.0099, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1023
+ [2026-01-25 22:09:13] (step=0000826) Train Loss mse: 0.0085, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1024
+ [2026-01-25 22:09:34] (step=0000827) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1025
+ [2026-01-25 22:09:58] (step=0000828) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
1026
+ [2026-01-25 22:10:18] (step=0000829) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
1027
  [2026-01-25 22:10:41] (step=0000830) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
1028
  [2026-01-25 22:10:58] (step=0000831) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
1029
  [2026-01-25 22:11:18] (step=0000832) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
 
2039
  [2026-01-26 04:13:26] (step=0001842) Train Loss mse: 0.0062, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2040
  [2026-01-26 04:13:48] (step=0001843) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
2041
  [2026-01-26 04:14:13] (step=0001844) Train Loss mse: 0.0065, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
2042
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step1000
2043
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
2044
+ [eval debug] first 3 batch fingerprints:
2045
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2046
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2047
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2048
+ ce_avg: 0.0, mse_avg: 0.007652191445231438
2049
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step1500
2050
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
2051
+ [eval debug] first 3 batch fingerprints:
2052
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2053
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2054
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
2055
+ ce_avg: 0.0, mse_avg: 0.00800316222012043
2056
  [2026-01-26 04:14:36] (step=0001845) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
2057
  [2026-01-26 04:14:57] (step=0001846) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2058
  [2026-01-26 04:15:20] (step=0001847) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
 
2136
  [2026-01-26 04:42:39] (step=0001925) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2137
  [2026-01-26 04:43:01] (step=0001926) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
2138
  [2026-01-26 04:43:23] (step=0001927) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2139
  [2026-01-26 04:43:45] (step=0001928) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2140
  [2026-01-26 04:44:06] (step=0001929) Train Loss mse: 0.0069, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
2141
  [2026-01-26 04:44:28] (step=0001930) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
 
3068
  [2026-01-26 10:17:47] (step=0002857) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3069
  [2026-01-26 10:18:07] (step=0002858) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3070
  [2026-01-26 10:18:28] (step=0002859) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3071
+ [2026-01-26 10:18:49
3072
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step2000
3073
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3074
+ [eval debug] first 3 batch fingerprints:
3075
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3076
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3077
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3078
+ ce_avg: 0.0, mse_avg: 0.0081106498837471
3079
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step2500
3080
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3081
+ [eval debug] first 3 batch fingerprints:
3082
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3083
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3084
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3085
  [2026-01-26 10:18:49] (step=0002860) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3086
  [2026-01-26 10:19:12] (step=0002861) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3087
  [2026-01-26 10:19:31] (step=0002862) Train Loss mse: 0.0065, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
 
3152
  [2026-01-26 10:42:06] (step=0002927) Train Loss mse: 0.0056, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3153
  [2026-01-26 10:42:29] (step=0002928) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3154
  [2026-01-26 10:42:50] (step=0002929) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3155
  [2026-01-26 10:43:13] (step=0002930) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3156
  [2026-01-26 10:43:33] (step=0002931) Train Loss mse: 0.0062, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3157
  [2026-01-26 10:43:54] (step=0002932) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
 
3967
  [2026-01-26 15:35:43] (step=0003742) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3968
  [2026-01-26 15:36:04] (step=0003743) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
3969
  [2026-01-26 15:36:28] (step=0003744) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3970
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step3000
3971
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3972
+ [eval debug] first 3 batch fingerprints:
3973
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3974
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3975
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3976
+ ce_avg: 0.0, mse_avg: 0.007834003306925297
3977
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step3500
3978
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3979
+ [eval debug] first 3 batch fingerprints:
3980
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3981
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3982
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3983
+ ce_avg: 0.0, mse_avg: 0.007766008842736483
3984
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step4000
3985
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3986
+ [eval debug] first 3 batch fingerprints:
3987
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3988
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3989
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3990
+ ce_avg: 0.0, mse_avg: 0.007558991201221943
3991
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step4500
3992
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
3993
+ [eval debug] first 3 batch fingerprints:
3994
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3995
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3996
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
3997
+ ce_avg: 0.0, mse_avg: 0.007897508330643177
3998
  [2026-01-26 15:36:52] (step=0003745) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
3999
  [2026-01-26 15:37:15] (step=0003746) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
4000
  [2026-01-26 15:37:35] (step=0003747) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
 
4186
  [2026-01-26 16:44:11] (step=0003933) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
4187
  [2026-01-26 16:44:29] (step=0003934) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
4188
  [2026-01-26 16:44:52] (step=0003935) Train Loss mse: 0.0063, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
 
4189
  [2026-01-26 16:45:14] (step=0003936) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
4190
  [2026-01-26 16:45:36] (step=0003937) Train Loss mse: 0.0056, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
4191
  [2026-01-26 16:45:55] (step=0003938) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
 
5091
  [2026-01-26 22:10:09] (step=0004838) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
5092
  [2026-01-26 22:10:34] (step=0004839) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
5093
  [2026-01-26 22:10:59] (step=0004840) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
5094
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step5000
5095
+ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
5096
+ [eval debug] first 3 batch fingerprints:
5097
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
5098
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
5099
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
5100
+ ce_avg: 0.0, mse_avg: 0.007832281291484833
5101
  [2026-01-26 22:11:23] (step=0004841) Train Loss mse: 0.0063, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
5102
  [2026-01-26 22:11:42] (step=0004842) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
5103
  [2026-01-26 22:12:06] (step=0004843) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,