Upload checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins
checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/wandb/offline-run-20260125_170309-vlm_gym_colorization_one_img_lr2e_5_mse_only_ins-run0/files/output.log
CHANGED
|
@@ -795,49 +795,6 @@ wandb: For more information, check out the docs at: https://weave-docs.wandb.ai/
|
|
| 795 |
[[34m2026-01-25 21:54:23[39m] (step=0000784) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 796 |
[[34m2026-01-25 21:54:50[39m] (step=0000785) Train Loss mse: 0.0068, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 797 |
[[34m2026-01-25 21:55:12[39m] (step=0000786) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 798 |
-
[[34m2026-01-25 21:55:36[39m] (step=0000787) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 799 |
-
[[34m2026-01-25 21:55:56[39m] (step=0000788) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 800 |
-
[[34m2026-01-25 21:56:14[39m] (step=0000789) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 801 |
-
[[34m2026-01-25 21:56:38[39m] (step=0000790) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 802 |
-
[[34m2026-01-25 21:57:00[39m] (step=0000791) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 803 |
-
[[34m2026-01-25 21:57:17[39m] (step=0000792) Train Loss mse: 0.0088, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 804 |
-
[[34m2026-01-25 21:57:38[39m] (step=0000793) Train Loss mse: 0.0084, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 805 |
-
[[34m2026-01-25 21:58:01[39m] (step=0000794) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 806 |
-
[[34m2026-01-25 21:58:19[39m] (step=0000795) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 807 |
-
[[34m2026-01-25 21:58:41[39m] (step=0000796) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 808 |
-
[[34m2026-01-25 21:59:01[39m] (step=0000797) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 809 |
-
[[34m2026-01-25 21:59:22[39m] (step=0000798) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 810 |
-
[[34m2026-01-25 21:59:44[39m] (step=0000799) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 811 |
-
[[34m2026-01-25 22:00:06[39m] (step=0000800) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 812 |
-
[[34m2026-01-25 22:00:28[39m] (step=0000801) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 813 |
-
[[34m2026-01-25 22:00:50[39m] (step=0000802) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 814 |
-
[[34m2026-01-25 22:01:06[39m] (step=0000803) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 815 |
-
[[34m2026-01-25 22:01:27[39m] (step=0000804) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 816 |
-
[[34m2026-01-25 22:01:52[39m] (step=0000805) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 817 |
-
[[34m2026-01-25 22:02:10[39m] (step=0000806) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 818 |
-
[[34m2026-01-25 22:02:32[39m] (step=0000807) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 819 |
-
[[34m2026-01-25 22:02:52[39m] (step=0000808) Train Loss mse: 0.0092, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 820 |
-
[[34m2026-01-25 22:03:10[39m] (step=0000809) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 821 |
-
[[34m2026-01-25 22:03:31[39m] (step=0000810) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 822 |
-
[[34m2026-01-25 22:03:50[39m] (step=0000811) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 823 |
-
[[34m2026-01-25 22:04:17[39m] (step=0000812) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 824 |
-
[[34m2026-01-25 22:04:37[39m] (step=0000813) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 825 |
-
[[34m2026-01-25 22:05:01[39m] (step=0000814) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 826 |
-
[[34m2026-01-25 22:05:22[39m] (step=0000815) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 827 |
-
[[34m2026-01-25 22:05:42[39m] (step=0000816) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 828 |
-
[[34m2026-01-25 22:06:06[39m] (step=0000817) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 829 |
-
[[34m2026-01-25 22:06:27[39m] (step=0000818) Train Loss mse: 0.0091, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 830 |
-
[[34m2026-01-25 22:06:48[39m] (step=0000819) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 831 |
-
[[34m2026-01-25 22:07:09[39m] (step=0000820) Train Loss mse: 0.0086, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 832 |
-
[[34m2026-01-25 22:07:31[39m] (step=0000821) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 833 |
-
[[34m2026-01-25 22:07:49[39m] (step=0000822) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 834 |
-
[[34m2026-01-25 22:08:12[39m] (step=0000823) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 835 |
-
[[34m2026-01-25 22:08:30[39m] (step=0000824) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 836 |
-
[[34m2026-01-25 22:08:52[39m] (step=0000825) Train Loss mse: 0.0099, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 837 |
-
[[34m2026-01-25 22:09:13[39m] (step=0000826) Train Loss mse: 0.0085, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 838 |
-
[[34m2026-01-25 22:09:34[39m] (step=0000827) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 839 |
-
[[34m2026-01-25 22:09:58[39m] (step=0000828) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 840 |
-
[[34m2026-01-25 22:10:18[39m] (step=0000829) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 841 |
FullyShardedDataParallel(
|
| 842 |
(_fsdp_wrapped_module): Bagel(
|
| 843 |
(language_model): Qwen2ForCausalLM(
|
|
@@ -1024,20 +981,49 @@ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorizati
|
|
| 1024 |
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 1025 |
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 1026 |
ce_avg: 0.0, mse_avg: 0.007997258566319942
|
| 1027 |
-
|
| 1028 |
-
|
| 1029 |
-
[
|
| 1030 |
-
|
| 1031 |
-
|
| 1032 |
-
|
| 1033 |
-
|
| 1034 |
-
|
| 1035 |
-
|
| 1036 |
-
[
|
| 1037 |
-
|
| 1038 |
-
|
| 1039 |
-
|
| 1040 |
-
|
| 1041 |
[[34m2026-01-25 22:10:41[39m] (step=0000830) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 1042 |
[[34m2026-01-25 22:10:58[39m] (step=0000831) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1043 |
[[34m2026-01-25 22:11:18[39m] (step=0000832) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
@@ -2053,6 +2039,20 @@ ce_avg: 0.0, mse_avg: 0.00800316222012043
|
|
| 2053 |
[[34m2026-01-26 04:13:26[39m] (step=0001842) Train Loss mse: 0.0062, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2054 |
[[34m2026-01-26 04:13:48[39m] (step=0001843) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 2055 |
[[34m2026-01-26 04:14:13[39m] (step=0001844) Train Loss mse: 0.0065, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 2056 |
[[34m2026-01-26 04:14:36[39m] (step=0001845) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 2057 |
[[34m2026-01-26 04:14:57[39m] (step=0001846) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2058 |
[[34m2026-01-26 04:15:20[39m] (step=0001847) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
|
@@ -2136,20 +2136,6 @@ ce_avg: 0.0, mse_avg: 0.00800316222012043
|
|
| 2136 |
[[34m2026-01-26 04:42:39[39m] (step=0001925) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2137 |
[[34m2026-01-26 04:43:01[39m] (step=0001926) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 2138 |
[[34m2026-01-26 04:43:23[39m] (step=0001927) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2139 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step2000
|
| 2140 |
-
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 2141 |
-
[eval debug] first 3 batch fingerprints:
|
| 2142 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2143 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2144 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2145 |
-
ce_avg: 0.0, mse_avg: 0.0081106498837471
|
| 2146 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step2500
|
| 2147 |
-
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 2148 |
-
[eval debug] first 3 batch fingerprints:
|
| 2149 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2150 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2151 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2152 |
-
ce_avg: 0.0, mse_avg: 0.007652428932487965
|
| 2153 |
[[34m2026-01-26 04:43:45[39m] (step=0001928) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2154 |
[[34m2026-01-26 04:44:06[39m] (step=0001929) Train Loss mse: 0.0069, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2155 |
[[34m2026-01-26 04:44:28[39m] (step=0001930) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
|
@@ -3082,6 +3068,20 @@ ce_avg: 0.0, mse_avg: 0.007652428932487965
|
|
| 3082 |
[[34m2026-01-26 10:17:47[39m] (step=0002857) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3083 |
[[34m2026-01-26 10:18:07[39m] (step=0002858) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3084 |
[[34m2026-01-26 10:18:28[39m] (step=0002859) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3085 |
[[34m2026-01-26 10:18:49[39m] (step=0002860) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3086 |
[[34m2026-01-26 10:19:12[39m] (step=0002861) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3087 |
[[34m2026-01-26 10:19:31[39m] (step=0002862) Train Loss mse: 0.0065, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
@@ -3152,27 +3152,6 @@ ce_avg: 0.0, mse_avg: 0.007652428932487965
|
|
| 3152 |
[[34m2026-01-26 10:42:06[39m] (step=0002927) Train Loss mse: 0.0056, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3153 |
[[34m2026-01-26 10:42:29[39m] (step=0002928) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3154 |
[[34m2026-01-26 10:42:50[39m] (step=0002929) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3155 |
-
[[34m2026-01-26 10:43:13
|
| 3156 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step3000
|
| 3157 |
-
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3158 |
-
[eval debug] first 3 batch fingerprints:
|
| 3159 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3160 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3161 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3162 |
-
ce_avg: 0.0, mse_avg: 0.007834003306925297
|
| 3163 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step3500
|
| 3164 |
-
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3165 |
-
[eval debug] first 3 batch fingerprints:
|
| 3166 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3167 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3168 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3169 |
-
ce_avg: 0.0, mse_avg: 0.007766008842736483
|
| 3170 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step4000
|
| 3171 |
-
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3172 |
-
[eval debug] first 3 batch fingerprints:
|
| 3173 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3174 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3175 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3176 |
[[34m2026-01-26 10:43:13[39m] (step=0002930) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3177 |
[[34m2026-01-26 10:43:33[39m] (step=0002931) Train Loss mse: 0.0062, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3178 |
[[34m2026-01-26 10:43:54[39m] (step=0002932) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
@@ -3988,6 +3967,34 @@ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorizati
|
|
| 3988 |
[[34m2026-01-26 15:35:43[39m] (step=0003742) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3989 |
[[34m2026-01-26 15:36:04[39m] (step=0003743) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3990 |
[[34m2026-01-26 15:36:28[39m] (step=0003744) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3991 |
[[34m2026-01-26 15:36:52[39m] (step=0003745) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3992 |
[[34m2026-01-26 15:37:15[39m] (step=0003746) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3993 |
[[34m2026-01-26 15:37:35[39m] (step=0003747) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
@@ -4179,20 +4186,6 @@ Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorizati
|
|
| 4179 |
[[34m2026-01-26 16:44:11[39m] (step=0003933) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 4180 |
[[34m2026-01-26 16:44:29[39m] (step=0003934) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4181 |
[[34m2026-01-26 16:44:52[39m] (step=0003935) Train Loss mse: 0.0063, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 4182 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step4500
|
| 4183 |
-
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 4184 |
-
[eval debug] first 3 batch fingerprints:
|
| 4185 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 4186 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 4187 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 4188 |
-
ce_avg: 0.0, mse_avg: 0.007897508330643177
|
| 4189 |
-
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step5000
|
| 4190 |
-
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 4191 |
-
[eval debug] first 3 batch fingerprints:
|
| 4192 |
-
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 4193 |
-
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 4194 |
-
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 4195 |
-
ce_avg: 0.0, mse_avg: 0.007832281291484833
|
| 4196 |
[[34m2026-01-26 16:45:14[39m] (step=0003936) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 4197 |
[[34m2026-01-26 16:45:36[39m] (step=0003937) Train Loss mse: 0.0056, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 4198 |
[[34m2026-01-26 16:45:55[39m] (step=0003938) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
@@ -5098,6 +5091,13 @@ ce_avg: 0.0, mse_avg: 0.007832281291484833
|
|
| 5098 |
[[34m2026-01-26 22:10:09[39m] (step=0004838) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 5099 |
[[34m2026-01-26 22:10:34[39m] (step=0004839) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 5100 |
[[34m2026-01-26 22:10:59[39m] (step=0004840) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 5101 |
[[34m2026-01-26 22:11:23[39m] (step=0004841) Train Loss mse: 0.0063, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 5102 |
[[34m2026-01-26 22:11:42[39m] (step=0004842) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 5103 |
[[34m2026-01-26 22:12:06[39m] (step=0004843) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
|
|
|
| 795 |
[[34m2026-01-25 21:54:23[39m] (step=0000784) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 796 |
[[34m2026-01-25 21:54:50[39m] (step=0000785) Train Loss mse: 0.0068, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 797 |
[[34m2026-01-25 21:55:12[39m] (step=0000786) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 798 |
FullyShardedDataParallel(
|
| 799 |
(_fsdp_wrapped_module): Bagel(
|
| 800 |
(language_model): Qwen2ForCausalLM(
|
|
|
|
| 981 |
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 982 |
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 983 |
ce_avg: 0.0, mse_avg: 0.007997258566319942
|
| 984 |
+
[[34m2026-01-25 21:55:36[39m] (step=0000787) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 985 |
+
[[34m2026-01-25 21:55:56[39m] (step=0000788) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 986 |
+
[[34m2026-01-25 21:56:14[39m] (step=0000789) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 987 |
+
[[34m2026-01-25 21:56:38[39m] (step=0000790) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 988 |
+
[[34m2026-01-25 21:57:00[39m] (step=0000791) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 989 |
+
[[34m2026-01-25 21:57:17[39m] (step=0000792) Train Loss mse: 0.0088, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 990 |
+
[[34m2026-01-25 21:57:38[39m] (step=0000793) Train Loss mse: 0.0084, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 991 |
+
[[34m2026-01-25 21:58:01[39m] (step=0000794) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 992 |
+
[[34m2026-01-25 21:58:19[39m] (step=0000795) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 993 |
+
[[34m2026-01-25 21:58:41[39m] (step=0000796) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 994 |
+
[[34m2026-01-25 21:59:01[39m] (step=0000797) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 995 |
+
[[34m2026-01-25 21:59:22[39m] (step=0000798) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 996 |
+
[[34m2026-01-25 21:59:44[39m] (step=0000799) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 997 |
+
[[34m2026-01-25 22:00:06[39m] (step=0000800) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 998 |
+
[[34m2026-01-25 22:00:28[39m] (step=0000801) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 999 |
+
[[34m2026-01-25 22:00:50[39m] (step=0000802) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1000 |
+
[[34m2026-01-25 22:01:06[39m] (step=0000803) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1001 |
+
[[34m2026-01-25 22:01:27[39m] (step=0000804) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1002 |
+
[[34m2026-01-25 22:01:52[39m] (step=0000805) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 1003 |
+
[[34m2026-01-25 22:02:10[39m] (step=0000806) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1004 |
+
[[34m2026-01-25 22:02:32[39m] (step=0000807) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1005 |
+
[[34m2026-01-25 22:02:52[39m] (step=0000808) Train Loss mse: 0.0092, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1006 |
+
[[34m2026-01-25 22:03:10[39m] (step=0000809) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1007 |
+
[[34m2026-01-25 22:03:31[39m] (step=0000810) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1008 |
+
[[34m2026-01-25 22:03:50[39m] (step=0000811) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1009 |
+
[[34m2026-01-25 22:04:17[39m] (step=0000812) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 1010 |
+
[[34m2026-01-25 22:04:37[39m] (step=0000813) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1011 |
+
[[34m2026-01-25 22:05:01[39m] (step=0000814) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 1012 |
+
[[34m2026-01-25 22:05:22[39m] (step=0000815) Train Loss mse: 0.0078, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1013 |
+
[[34m2026-01-25 22:05:42[39m] (step=0000816) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1014 |
+
[[34m2026-01-25 22:06:06[39m] (step=0000817) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 1015 |
+
[[34m2026-01-25 22:06:27[39m] (step=0000818) Train Loss mse: 0.0091, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1016 |
+
[[34m2026-01-25 22:06:48[39m] (step=0000819) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1017 |
+
[[34m2026-01-25 22:07:09[39m] (step=0000820) Train Loss mse: 0.0086, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1018 |
+
[[34m2026-01-25 22:07:31[39m] (step=0000821) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1019 |
+
[[34m2026-01-25 22:07:49[39m] (step=0000822) Train Loss mse: 0.0083, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1020 |
+
[[34m2026-01-25 22:08:12[39m] (step=0000823) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 1021 |
+
[[34m2026-01-25 22:08:30[39m] (step=0000824) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1022 |
+
[[34m2026-01-25 22:08:52[39m] (step=0000825) Train Loss mse: 0.0099, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1023 |
+
[[34m2026-01-25 22:09:13[39m] (step=0000826) Train Loss mse: 0.0085, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1024 |
+
[[34m2026-01-25 22:09:34[39m] (step=0000827) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1025 |
+
[[34m2026-01-25 22:09:58[39m] (step=0000828) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 1026 |
+
[[34m2026-01-25 22:10:18[39m] (step=0000829) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 1027 |
[[34m2026-01-25 22:10:41[39m] (step=0000830) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 1028 |
[[34m2026-01-25 22:10:58[39m] (step=0000831) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 1029 |
[[34m2026-01-25 22:11:18[39m] (step=0000832) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
|
|
| 2039 |
[[34m2026-01-26 04:13:26[39m] (step=0001842) Train Loss mse: 0.0062, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2040 |
[[34m2026-01-26 04:13:48[39m] (step=0001843) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 2041 |
[[34m2026-01-26 04:14:13[39m] (step=0001844) Train Loss mse: 0.0065, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 2042 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step1000
|
| 2043 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 2044 |
+
[eval debug] first 3 batch fingerprints:
|
| 2045 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2046 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2047 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2048 |
+
ce_avg: 0.0, mse_avg: 0.007652191445231438
|
| 2049 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step1500
|
| 2050 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 2051 |
+
[eval debug] first 3 batch fingerprints:
|
| 2052 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2053 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2054 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 2055 |
+
ce_avg: 0.0, mse_avg: 0.00800316222012043
|
| 2056 |
[[34m2026-01-26 04:14:36[39m] (step=0001845) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 2057 |
[[34m2026-01-26 04:14:57[39m] (step=0001846) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2058 |
[[34m2026-01-26 04:15:20[39m] (step=0001847) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
|
|
|
| 2136 |
[[34m2026-01-26 04:42:39[39m] (step=0001925) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2137 |
[[34m2026-01-26 04:43:01[39m] (step=0001926) Train Loss mse: 0.0072, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 2138 |
[[34m2026-01-26 04:43:23[39m] (step=0001927) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2139 |
[[34m2026-01-26 04:43:45[39m] (step=0001928) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2140 |
[[34m2026-01-26 04:44:06[39m] (step=0001929) Train Loss mse: 0.0069, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 2141 |
[[34m2026-01-26 04:44:28[39m] (step=0001930) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
|
|
|
| 3068 |
[[34m2026-01-26 10:17:47[39m] (step=0002857) Train Loss mse: 0.0082, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3069 |
[[34m2026-01-26 10:18:07[39m] (step=0002858) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3070 |
[[34m2026-01-26 10:18:28[39m] (step=0002859) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3071 |
+
[[34m2026-01-26 10:18:49
|
| 3072 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step2000
|
| 3073 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3074 |
+
[eval debug] first 3 batch fingerprints:
|
| 3075 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3076 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3077 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3078 |
+
ce_avg: 0.0, mse_avg: 0.0081106498837471
|
| 3079 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step2500
|
| 3080 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3081 |
+
[eval debug] first 3 batch fingerprints:
|
| 3082 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3083 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3084 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3085 |
[[34m2026-01-26 10:18:49[39m] (step=0002860) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3086 |
[[34m2026-01-26 10:19:12[39m] (step=0002861) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3087 |
[[34m2026-01-26 10:19:31[39m] (step=0002862) Train Loss mse: 0.0065, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
|
|
| 3152 |
[[34m2026-01-26 10:42:06[39m] (step=0002927) Train Loss mse: 0.0056, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3153 |
[[34m2026-01-26 10:42:29[39m] (step=0002928) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3154 |
[[34m2026-01-26 10:42:50[39m] (step=0002929) Train Loss mse: 0.0079, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3155 |
[[34m2026-01-26 10:43:13[39m] (step=0002930) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3156 |
[[34m2026-01-26 10:43:33[39m] (step=0002931) Train Loss mse: 0.0062, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3157 |
[[34m2026-01-26 10:43:54[39m] (step=0002932) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
|
|
| 3967 |
[[34m2026-01-26 15:35:43[39m] (step=0003742) Train Loss mse: 0.0077, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3968 |
[[34m2026-01-26 15:36:04[39m] (step=0003743) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 3969 |
[[34m2026-01-26 15:36:28[39m] (step=0003744) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3970 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step3000
|
| 3971 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3972 |
+
[eval debug] first 3 batch fingerprints:
|
| 3973 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3974 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3975 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3976 |
+
ce_avg: 0.0, mse_avg: 0.007834003306925297
|
| 3977 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step3500
|
| 3978 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3979 |
+
[eval debug] first 3 batch fingerprints:
|
| 3980 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3981 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3982 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3983 |
+
ce_avg: 0.0, mse_avg: 0.007766008842736483
|
| 3984 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step4000
|
| 3985 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3986 |
+
[eval debug] first 3 batch fingerprints:
|
| 3987 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3988 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3989 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3990 |
+
ce_avg: 0.0, mse_avg: 0.007558991201221943
|
| 3991 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step4500
|
| 3992 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 3993 |
+
[eval debug] first 3 batch fingerprints:
|
| 3994 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3995 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3996 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 3997 |
+
ce_avg: 0.0, mse_avg: 0.007897508330643177
|
| 3998 |
[[34m2026-01-26 15:36:52[39m] (step=0003745) Train Loss mse: 0.0076, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 3999 |
[[34m2026-01-26 15:37:15[39m] (step=0003746) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 4000 |
[[34m2026-01-26 15:37:35[39m] (step=0003747) Train Loss mse: 0.0071, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
|
|
| 4186 |
[[34m2026-01-26 16:44:11[39m] (step=0003933) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 4187 |
[[34m2026-01-26 16:44:29[39m] (step=0003934) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.06,
|
| 4188 |
[[34m2026-01-26 16:44:52[39m] (step=0003935) Train Loss mse: 0.0063, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 4189 |
[[34m2026-01-26 16:45:14[39m] (step=0003936) Train Loss mse: 0.0067, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 4190 |
[[34m2026-01-26 16:45:36[39m] (step=0003937) Train Loss mse: 0.0056, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 4191 |
[[34m2026-01-26 16:45:55[39m] (step=0003938) Train Loss mse: 0.0073, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
|
|
|
| 5091 |
[[34m2026-01-26 22:10:09[39m] (step=0004838) Train Loss mse: 0.0080, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 5092 |
[[34m2026-01-26 22:10:34[39m] (step=0004839) Train Loss mse: 0.0075, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 5093 |
[[34m2026-01-26 22:10:59[39m] (step=0004840) Train Loss mse: 0.0074, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 5094 |
+
base_dir is /dev/shm/models/checkpoints_vlm_gym_colorization_one_image_lr2e_5_mse_only_ins/eval_used_rows, step_tag is vlm_gym_colorization_one_img_lr2e_5_mse_only_ins_step5000
|
| 5095 |
+
Preparing Dataset vlm_gym_colorization_mse_loss_only_evalonce/vlm_gym_colorization_val
|
| 5096 |
+
[eval debug] first 3 batch fingerprints:
|
| 5097 |
+
fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 5098 |
+
fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 5099 |
+
fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_colorization_mse_loss_only_evalonce'}]
|
| 5100 |
+
ce_avg: 0.0, mse_avg: 0.007832281291484833
|
| 5101 |
[[34m2026-01-26 22:11:23[39m] (step=0004841) Train Loss mse: 0.0063, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|
| 5102 |
[[34m2026-01-26 22:11:42[39m] (step=0004842) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.05,
|
| 5103 |
[[34m2026-01-26 22:12:06[39m] (step=0004843) Train Loss mse: 0.0070, Train Loss ce: 0.0000, Train Steps/Sec: 0.04,
|