Commit 1ac984d (verified), committed by Junyi42
1 Parent(s): 89f213b

Upload checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins

checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/wandb/offline-run-20260128_045524-checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins-run0/files/output.log CHANGED
@@ -1319,20 +1319,6 @@ wandb: For more information, check out the docs at: https://weave-docs.wandb.ai/
1319
  [2026-01-28 05:25:51] (step=0001131) Train Loss mse: 0.0000, Train Loss ce: 0.1913, Train Steps/Sec: 0.83,
1320
  [2026-01-28 05:25:52] (step=0001132) Train Loss mse: 0.0000, Train Loss ce: 0.2097, Train Steps/Sec: 0.69,
1321
  [2026-01-28 05:25:53] (step=0001133) Train Loss mse: 0.0000, Train Loss ce: 0.1769, Train Steps/Sec: 0.83,
1322
- base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step1500
1323
- Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
1324
- [eval debug] first 3 batch fingerprints:
1325
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1326
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1327
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1328
- ce_avg: 0.18267571926116943, mse_avg: 0.0
1329
- base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step2000
1330
- Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
1331
- [eval debug] first 3 batch fingerprints:
1332
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1333
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1334
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1335
- ce_avg: 0.1819312572479248, mse_avg: 0.0
1336
  [2026-01-28 05:25:55] (step=0001134) Train Loss mse: 0.0000, Train Loss ce: 0.1800, Train Steps/Sec: 0.83,
1337
  [2026-01-28 05:25:56] (step=0001135) Train Loss mse: 0.0000, Train Loss ce: 0.1738, Train Steps/Sec: 0.83,
1338
  [2026-01-28 05:25:57] (step=0001136) Train Loss mse: 0.0000, Train Loss ce: 0.2190, Train Steps/Sec: 0.69,
@@ -1366,6 +1352,20 @@ ce_avg: 0.1819312572479248, mse_avg: 0.0
1366
  [2026-01-28 05:26:34] (step=0001164) Train Loss mse: 0.0000, Train Loss ce: 0.1796, Train Steps/Sec: 0.69,
1367
  [2026-01-28 05:26:36] (step=0001165) Train Loss mse: 0.0000, Train Loss ce: 0.1633, Train Steps/Sec: 0.66,
1368
  [2026-01-28 05:26:37] (step=0001166) Train Loss mse: 0.0000, Train Loss ce: 0.1952, Train Steps/Sec: 0.70,
1369
  [2026-01-28 05:26:38] (step=0001167) Train Loss mse: 0.0000, Train Loss ce: 0.1607, Train Steps/Sec: 0.69,
1370
  [2026-01-28 05:26:40] (step=0001168) Train Loss mse: 0.0000, Train Loss ce: 0.1936, Train Steps/Sec: 0.83,
1371
  [2026-01-28 05:26:41] (step=0001169) Train Loss mse: 0.0000, Train Loss ce: 0.1839, Train Steps/Sec: 0.83,
@@ -2897,27 +2897,6 @@ ce_avg: 0.1819312572479248, mse_avg: 0.0
2897
  [2026-01-28 06:00:17] (step=0002695) Train Loss mse: 0.0000, Train Loss ce: 0.1835, Train Steps/Sec: 0.82,
2898
  [2026-01-28 06:00:18] (step=0002696) Train Loss mse: 0.0000, Train Loss ce: 0.1784, Train Steps/Sec: 0.82,
2899
  [2026-01-28 06:00:20] (step=0002697) Train Loss mse: 0.0000, Train Loss ce: 0.1691, Train Steps/Sec: 0.65,
2900
- base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step2500
2901
- Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
2902
- [eval debug] first 3 batch fingerprints:
2903
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2904
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2905
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2906
- ce_avg: 0.17569424211978912, mse_avg: 0.0
2907
- base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step3000
2908
- Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
2909
- [eval debug] first 3 batch fingerprints:
2910
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2911
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2912
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2913
- ce_avg: 0.17607633769512177, mse_avg: 0.0
2914
- base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step3500
2915
- Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
2916
- [eval debug] first 3 batch fingerprints:
2917
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2918
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2919
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2920
- ce_avg: 0.16597864031791687, mse_avg: 0.0
2921
  [2026-01-28 06:00:21] (step=0002698) Train Loss mse: 0.0000, Train Loss ce: 0.1520, Train Steps/Sec: 0.82,
2922
  [2026-01-28 06:00:22] (step=0002699) Train Loss mse: 0.0000, Train Loss ce: 0.1560, Train Steps/Sec: 0.82,
2923
  [2026-01-28 06:00:24] (step=0002700) Train Loss mse: 0.0000, Train Loss ce: 0.1440, Train Steps/Sec: 0.82,
@@ -3001,6 +2980,27 @@ ce_avg: 0.16597864031791687, mse_avg: 0.0
3001
  [2026-01-28 06:02:05] (step=0002778) Train Loss mse: 0.0000, Train Loss ce: 0.1529, Train Steps/Sec: 0.83,
3002
  [2026-01-28 06:02:06] (step=0002779) Train Loss mse: 0.0000, Train Loss ce: 0.1626, Train Steps/Sec: 0.82,
3003
  [2026-01-28 06:02:07] (step=0002780) Train Loss mse: 0.0000, Train Loss ce: 0.1394, Train Steps/Sec: 0.83,
3004
  [2026-01-28 06:02:08] (step=0002781) Train Loss mse: 0.0000, Train Loss ce: 0.1449, Train Steps/Sec: 0.65,
3005
  [2026-01-28 06:02:10] (step=0002782) Train Loss mse: 0.0000, Train Loss ce: 0.1453, Train Steps/Sec: 0.82,
3006
  [2026-01-28 06:02:11] (step=0002783) Train Loss mse: 0.0000, Train Loss ce: 0.1463, Train Steps/Sec: 0.68,
@@ -4017,27 +4017,6 @@ ce_avg: 0.16597864031791687, mse_avg: 0.0
4017
  [2026-01-28 06:24:35] (step=0003794) Train Loss mse: 0.0000, Train Loss ce: 0.1339, Train Steps/Sec: 0.82,
4018
  [2026-01-28 06:24:36] (step=0003795) Train Loss mse: 0.0000, Train Loss ce: 0.1515, Train Steps/Sec: 0.82,
4019
  [2026-01-28 06:24:38] (step=0003796) Train Loss mse: 0.0000, Train Loss ce: 0.1352, Train Steps/Sec: 0.65,
4020
- base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step4000
4021
- Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
4022
- [eval debug] first 3 batch fingerprints:
4023
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4024
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4025
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4026
- ce_avg: 0.16556128859519958, mse_avg: 0.0
4027
- base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step4500
4028
- Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
4029
- [eval debug] first 3 batch fingerprints:
4030
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4031
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4032
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4033
- ce_avg: 0.1494830697774887, mse_avg: 0.0
4034
- base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step5000
4035
- Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
4036
- [eval debug] first 3 batch fingerprints:
4037
- fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4038
- fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4039
- fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4040
- ce_avg: 0.14704732596874237, mse_avg: 0.0
4041
  [2026-01-28 06:24:39] (step=0003797) Train Loss mse: 0.0000, Train Loss ce: 0.1524, Train Steps/Sec: 0.82,
4042
  [2026-01-28 06:24:40] (step=0003798) Train Loss mse: 0.0000, Train Loss ce: 0.1353, Train Steps/Sec: 0.82,
4043
  [2026-01-28 06:24:42] (step=0003799) Train Loss mse: 0.0000, Train Loss ce: 0.1556, Train Steps/Sec: 0.68,
@@ -4144,6 +4123,27 @@ ce_avg: 0.14704732596874237, mse_avg: 0.0
4144
  [2026-01-28 06:26:51] (step=0003900) Train Loss mse: 0.0000, Train Loss ce: 0.1346, Train Steps/Sec: 0.82,
4145
  [2026-01-28 06:26:52] (step=0003901) Train Loss mse: 0.0000, Train Loss ce: 0.1416, Train Steps/Sec: 0.83,
4146
  [2026-01-28 06:26:53] (step=0003902) Train Loss mse: 0.0000, Train Loss ce: 0.1313, Train Steps/Sec: 0.68,
4147
  [2026-01-28 06:26:55] (step=0003903) Train Loss mse: 0.0000, Train Loss ce: 0.1574, Train Steps/Sec: 0.65,
4148
  [2026-01-28 06:26:56] (step=0003904) Train Loss mse: 0.0000, Train Loss ce: 0.1331, Train Steps/Sec: 0.82,
4149
  [2026-01-28 06:26:57] (step=0003905) Train Loss mse: 0.0000, Train Loss ce: 0.1310, Train Steps/Sec: 0.83,
 
1319
  [2026-01-28 05:25:51] (step=0001131) Train Loss mse: 0.0000, Train Loss ce: 0.1913, Train Steps/Sec: 0.83,
1320
  [2026-01-28 05:25:52] (step=0001132) Train Loss mse: 0.0000, Train Loss ce: 0.2097, Train Steps/Sec: 0.69,
1321
  [2026-01-28 05:25:53] (step=0001133) Train Loss mse: 0.0000, Train Loss ce: 0.1769, Train Steps/Sec: 0.83,
1322
  [2026-01-28 05:25:55] (step=0001134) Train Loss mse: 0.0000, Train Loss ce: 0.1800, Train Steps/Sec: 0.83,
1323
  [2026-01-28 05:25:56] (step=0001135) Train Loss mse: 0.0000, Train Loss ce: 0.1738, Train Steps/Sec: 0.83,
1324
  [2026-01-28 05:25:57] (step=0001136) Train Loss mse: 0.0000, Train Loss ce: 0.2190, Train Steps/Sec: 0.69,
 
1352
  [2026-01-28 05:26:34] (step=0001164) Train Loss mse: 0.0000, Train Loss ce: 0.1796, Train Steps/Sec: 0.69,
1353
  [2026-01-28 05:26:36] (step=0001165) Train Loss mse: 0.0000, Train Loss ce: 0.1633, Train Steps/Sec: 0.66,
1354
  [2026-01-28 05:26:37] (step=0001166) Train Loss mse: 0.0000, Train Loss ce: 0.1952, Train Steps/Sec: 0.70,
1355
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step1500
1356
+ Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
1357
+ [eval debug] first 3 batch fingerprints:
1358
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1359
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1360
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1361
+ ce_avg: 0.18267571926116943, mse_avg: 0.0
1362
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step2000
1363
+ Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
1364
+ [eval debug] first 3 batch fingerprints:
1365
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1366
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1367
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
1368
+ ce_avg: 0.1819312572479248, mse_avg: 0.0
1369
  [2026-01-28 05:26:38] (step=0001167) Train Loss mse: 0.0000, Train Loss ce: 0.1607, Train Steps/Sec: 0.69,
1370
  [2026-01-28 05:26:40] (step=0001168) Train Loss mse: 0.0000, Train Loss ce: 0.1936, Train Steps/Sec: 0.83,
1371
  [2026-01-28 05:26:41] (step=0001169) Train Loss mse: 0.0000, Train Loss ce: 0.1839, Train Steps/Sec: 0.83,
 
2897
  [2026-01-28 06:00:17] (step=0002695) Train Loss mse: 0.0000, Train Loss ce: 0.1835, Train Steps/Sec: 0.82,
2898
  [2026-01-28 06:00:18] (step=0002696) Train Loss mse: 0.0000, Train Loss ce: 0.1784, Train Steps/Sec: 0.82,
2899
  [2026-01-28 06:00:20] (step=0002697) Train Loss mse: 0.0000, Train Loss ce: 0.1691, Train Steps/Sec: 0.65,
2900
  [2026-01-28 06:00:21] (step=0002698) Train Loss mse: 0.0000, Train Loss ce: 0.1520, Train Steps/Sec: 0.82,
2901
  [2026-01-28 06:00:22] (step=0002699) Train Loss mse: 0.0000, Train Loss ce: 0.1560, Train Steps/Sec: 0.82,
2902
  [2026-01-28 06:00:24] (step=0002700) Train Loss mse: 0.0000, Train Loss ce: 0.1440, Train Steps/Sec: 0.82,
 
2980
  [2026-01-28 06:02:05] (step=0002778) Train Loss mse: 0.0000, Train Loss ce: 0.1529, Train Steps/Sec: 0.83,
2981
  [2026-01-28 06:02:06] (step=0002779) Train Loss mse: 0.0000, Train Loss ce: 0.1626, Train Steps/Sec: 0.82,
2982
  [2026-01-28 06:02:07] (step=0002780) Train Loss mse: 0.0000, Train Loss ce: 0.1394, Train Steps/Sec: 0.83,
2983
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step2500
2984
+ Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
2985
+ [eval debug] first 3 batch fingerprints:
2986
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2987
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2988
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2989
+ ce_avg: 0.17569424211978912, mse_avg: 0.0
2990
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step3000
2991
+ Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
2992
+ [eval debug] first 3 batch fingerprints:
2993
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2994
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2995
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
2996
+ ce_avg: 0.17607633769512177, mse_avg: 0.0
2997
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step3500
2998
+ Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
2999
+ [eval debug] first 3 batch fingerprints:
3000
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
3001
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
3002
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
3003
+ ce_avg: 0.16597864031791687, mse_avg: 0.0
3004
  [2026-01-28 06:02:08] (step=0002781) Train Loss mse: 0.0000, Train Loss ce: 0.1449, Train Steps/Sec: 0.65,
3005
  [2026-01-28 06:02:10] (step=0002782) Train Loss mse: 0.0000, Train Loss ce: 0.1453, Train Steps/Sec: 0.82,
3006
  [2026-01-28 06:02:11] (step=0002783) Train Loss mse: 0.0000, Train Loss ce: 0.1463, Train Steps/Sec: 0.68,
 
4017
  [2026-01-28 06:24:35] (step=0003794) Train Loss mse: 0.0000, Train Loss ce: 0.1339, Train Steps/Sec: 0.82,
4018
  [2026-01-28 06:24:36] (step=0003795) Train Loss mse: 0.0000, Train Loss ce: 0.1515, Train Steps/Sec: 0.82,
4019
  [2026-01-28 06:24:38] (step=0003796) Train Loss mse: 0.0000, Train Loss ce: 0.1352, Train Steps/Sec: 0.65,
4020
  [2026-01-28 06:24:39] (step=0003797) Train Loss mse: 0.0000, Train Loss ce: 0.1524, Train Steps/Sec: 0.82,
4021
  [2026-01-28 06:24:40] (step=0003798) Train Loss mse: 0.0000, Train Loss ce: 0.1353, Train Steps/Sec: 0.82,
4022
  [2026-01-28 06:24:42] (step=0003799) Train Loss mse: 0.0000, Train Loss ce: 0.1556, Train Steps/Sec: 0.68,
 
4123
  [2026-01-28 06:26:51] (step=0003900) Train Loss mse: 0.0000, Train Loss ce: 0.1346, Train Steps/Sec: 0.82,
4124
  [2026-01-28 06:26:52] (step=0003901) Train Loss mse: 0.0000, Train Loss ce: 0.1416, Train Steps/Sec: 0.83,
4125
  [2026-01-28 06:26:53] (step=0003902) Train Loss mse: 0.0000, Train Loss ce: 0.1313, Train Steps/Sec: 0.68,
4126
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step4000
4127
+ Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
4128
+ [eval debug] first 3 batch fingerprints:
4129
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4130
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4131
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4132
+ ce_avg: 0.16556128859519958, mse_avg: 0.0
4133
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step4500
4134
+ Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
4135
+ [eval debug] first 3 batch fingerprints:
4136
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4137
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4138
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4139
+ ce_avg: 0.1494830697774887, mse_avg: 0.0
4140
+ base_dir is /dev/shm/models/checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins/eval_used_rows, step_tag is checkpoints_vlm_gym_match_move_fix3_unit_one_image_lr2e_5_ce_no_mse_ins_step5000
4141
+ Preparing Dataset vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce/vlm_gym_match_move_fix3_unit_val
4142
+ [eval debug] first 3 batch fingerprints:
4143
+ fp[0]: [{'data_indexes': [0], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4144
+ fp[1]: [{'data_indexes': [8], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4145
+ fp[2]: [{'data_indexes': [16], 'worker_id': 0, 'dataset_name': 'vlm_gym_match_move_fix3_unit_celoss_no_mse_evalonce'}]
4146
+ ce_avg: 0.14704732596874237, mse_avg: 0.0
4147
  [2026-01-28 06:26:55] (step=0003903) Train Loss mse: 0.0000, Train Loss ce: 0.1574, Train Steps/Sec: 0.65,
4148
  [2026-01-28 06:26:56] (step=0003904) Train Loss mse: 0.0000, Train Loss ce: 0.1331, Train Steps/Sec: 0.82,
4149
  [2026-01-28 06:26:57] (step=0003905) Train Loss mse: 0.0000, Train Loss ce: 0.1310, Train Steps/Sec: 0.83,