Junyi42 committed on
Commit 9afadba · verified · 1 Parent(s): 9503ef9

Upload checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins

checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins/wandb/offline-run-20260125_041958-checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins-run0/files/config.yaml CHANGED
@@ -0,0 +1,185 @@
+ wandb_version: 1
+
+ _wandb:
+   desc: null
+   value:
+     python_version: 3.11.10
+     cli_version: 0.23.1
+     framework: huggingface
+     huggingface_version: 4.49.0
+     is_jupyter_run: false
+     is_kaggle_kernel: false
+     start_time: 1769314798
+     t:
+       1:
+       - 1
+       - 5
+       - 11
+       - 41
+       - 49
+       - 53
+       - 71
+       - 105
+       2:
+       - 1
+       - 5
+       - 11
+       - 41
+       - 49
+       - 53
+       - 71
+       - 105
+       3:
+       - 4
+       - 13
+       - 14
+       - 37
+       - 42
+       4: 3.11.10
+       5: 0.23.1
+       6: 4.49.0
+       13: linux-x86_64
+     e:
+       ksvyazoze3bbxmja1hzf6tq6tcla6i9q:
+         os: Linux-6.6.93+-x86_64-with-glibc2.35
+         python: CPython 3.11.10
+         started_at: '2026-01-25T04:19:58.060907Z'
+         args:
+         - --dataset_config_file
+         - ./data/configs/vlm_gym_counting_mark_all_train_celoss.yaml
+         - --eval_dataset_config_file
+         - ./data/configs/vlm_gym_counting_mark_all_eval_celoss.yaml
+         - --viz_dataset_config_file
+         - ./data/configs/vlm_gym_counting_mark_all_eval_celoss.yaml
+         - --inference_hash_file
+         - /home/clouduser/Code/Github/launch_new/hashes_test_set_v10.json
+         - --task_name
+         - counting-mark_all_v5
+         - --instructions_dir
+         - ./data/instructions
+         - --train_data_dir
+         - /home/clouduser/Code/data/gym/counting-mark_all_v5/train/
+         - --train_jsonl_path
+         - /home/clouduser/Code/data/gym/counting-mark_all_v5/train/
+         - --eval_data_dir
+         - /home/clouduser/Code/data/gym/counting-mark_all_v5/val/
+         - --eval_jsonl_path
+         - /home/clouduser/Code/data/gym/counting-mark_all_v5/val/
+         - --model_path
+         - /home/clouduser/Code/Models/BAGEL-7B-MoT
+         - --layer_module
+         - Qwen2MoTDecoderLayer
+         - --max_latent_size
+         - '64'
+         - --resume-from
+         - /home/clouduser/Code/Models/BAGEL-7B-MoT
+         - --finetune_from_hf
+         - 'True'
+         - --auto_resume
+         - 'False'
+         - --resume-model-only
+         - 'True'
+         - --finetune-from-ema
+         - 'True'
+         - --log_every
+         - '1'
+         - --lr
+         - 2e-5
+         - --warmup_steps
+         - '300'
+         - --lr_scheduler
+         - cosine
+         - --num_worker
+         - '1'
+         - --expected_num_tokens
+         - '20000'
+         - --max_num_tokens
+         - '20000'
+         - --max_num_tokens_per_sample
+         - '20000'
+         - --visual_und
+         - 'True'
+         - --save_every
+         - '2500'
+         - --total_steps
+         - '5000'
+         - --text_cond_dropout_prob
+         - '0.0'
+         - --vae_cond_dropout_prob
+         - '0.3'
+         - --vit_cond_dropout_prob
+         - '0.0'
+         - --ema
+         - '0.993'
+         - --checkpoint_dir
+         - /dev/shm/models/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins
+         - --wandb_project
+         - bagel
+         - --wandb_name
+         - checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins
+         - --wandb_dir
+         - /dev/shm/models/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins
+         - --wandb_offline
+         - 'True'
+         program: /home/clouduser/Code/Github/unified_world_model/train/pretrain_unified_navit.py
+         code_path: train/pretrain_unified_navit.py
+         code_path_local: train/pretrain_unified_navit.py
+         git:
+           remote_url: https://github.com/para-lost/unified_world_model
+           commit: d317a65e0c29654563243718688a08f3ca47dcfb
+         root: /dev/shm/models/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins
+         host: junyizhang-launch-new-225770736-1-0
+         executable: /opt/conda/bin/python3.11
+         cpu_count: 48
+         cpu_count_logical: 96
+         gpu_type: NVIDIA A100-SXM4-80GB
+         gpu_count: 8
+         disk:
+           /:
+             total: '1052461830144'
+             used: '171656708096'
+         memory:
+           total: '1437332606976'
+         gpu_nvidia:
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-7f42c3e6-3e00-16d9-8050-778cdc5844ff
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-fe3cf526-4d02-9998-20ad-638818e980dc
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-179211fd-ed39-e90f-1966-78e4b0992577
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-dab97901-feff-9eb9-0ceb-2ea55cfee6bb
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-9454f07a-fddd-f326-972c-b57894185709
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-d7f547d0-5fa6-90d9-5f6d-3b8f6afd71b5
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-71281f15-3809-23cd-7f39-52a70b28e4c4
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-0388b0d9-935a-e627-0892-121f7bafd6ef
+         cuda_version: '12.2'
+         writer_id: ksvyazoze3bbxmja1hzf6tq6tcla6i9q
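
The config above stores `start_time` as epoch seconds and `memory_total` as a byte count in a quoted string. A minimal Python sketch for reading those two fields follows; the helper names `epoch_to_iso` and `bytes_to_gib` are hypothetical and not part of wandb.

```python
from datetime import datetime, timezone


def epoch_to_iso(seconds: int) -> str:
    """Format epoch seconds as a UTC ISO-8601 timestamp (hypothetical helper)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def bytes_to_gib(value) -> float:
    """Convert a byte count (int or numeric string) to GiB (hypothetical helper)."""
    return int(value) / 2**30


# Values copied from the first config.yaml.
print(epoch_to_iso(1769314798))     # 2026-01-25T04:19:58Z
print(bytes_to_gib("85899345920"))  # 80.0
```

The converted timestamp agrees with the `started_at` field to the second, and the per-GPU memory works out to the 80 GiB implied by the `NVIDIA A100-SXM4-80GB` name.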
checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins/wandb/offline-run-20260125_205640-checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins-run0/files/config.yaml CHANGED
@@ -0,0 +1,185 @@
+ wandb_version: 1
+
+ _wandb:
+   desc: null
+   value:
+     python_version: 3.11.10
+     cli_version: 0.23.1
+     framework: huggingface
+     huggingface_version: 4.49.0
+     is_jupyter_run: false
+     is_kaggle_kernel: false
+     start_time: 1769374601
+     t:
+       1:
+       - 1
+       - 5
+       - 11
+       - 41
+       - 49
+       - 53
+       - 71
+       - 105
+       2:
+       - 1
+       - 5
+       - 11
+       - 41
+       - 49
+       - 53
+       - 71
+       - 105
+       3:
+       - 4
+       - 13
+       - 14
+       - 37
+       - 42
+       4: 3.11.10
+       5: 0.23.1
+       6: 4.49.0
+       13: linux-x86_64
+     e:
+       qivwap4wkbv1rrpyy3tc8ok85jcfa4gf:
+         os: Linux-6.6.93+-x86_64-with-glibc2.35
+         python: CPython 3.11.10
+         started_at: '2026-01-25T20:56:40.641636Z'
+         args:
+         - --dataset_config_file
+         - ./data/configs/vlm_gym_counting_mark_all_train_celoss.yaml
+         - --eval_dataset_config_file
+         - ./data/configs/vlm_gym_counting_mark_all_eval_celoss.yaml
+         - --viz_dataset_config_file
+         - ./data/configs/vlm_gym_counting_mark_all_eval_celoss.yaml
+         - --inference_hash_file
+         - /home/clouduser/Code/Github/launch_new/hashes_test_set_v10.json
+         - --task_name
+         - counting-mark_all_v5
+         - --instructions_dir
+         - ./data/instructions
+         - --train_data_dir
+         - /home/clouduser/Code/data/gym/counting-mark_all_v5/train/
+         - --train_jsonl_path
+         - /home/clouduser/Code/data/gym/counting-mark_all_v5/train/
+         - --eval_data_dir
+         - /home/clouduser/Code/data/gym/counting-mark_all_v5/val/
+         - --eval_jsonl_path
+         - /home/clouduser/Code/data/gym/counting-mark_all_v5/val/
+         - --model_path
+         - /home/clouduser/Code/Models/BAGEL-7B-MoT
+         - --layer_module
+         - Qwen2MoTDecoderLayer
+         - --max_latent_size
+         - '64'
+         - --resume-from
+         - /home/clouduser/Code/Models/BAGEL-7B-MoT
+         - --finetune_from_hf
+         - 'True'
+         - --auto_resume
+         - 'False'
+         - --resume-model-only
+         - 'True'
+         - --finetune-from-ema
+         - 'True'
+         - --log_every
+         - '1'
+         - --lr
+         - 2e-5
+         - --warmup_steps
+         - '300'
+         - --lr_scheduler
+         - cosine
+         - --num_worker
+         - '1'
+         - --expected_num_tokens
+         - '30000'
+         - --max_num_tokens
+         - '30000'
+         - --max_num_tokens_per_sample
+         - '30000'
+         - --visual_und
+         - 'True'
+         - --save_every
+         - '2500'
+         - --total_steps
+         - '5000'
+         - --text_cond_dropout_prob
+         - '0.0'
+         - --vae_cond_dropout_prob
+         - '0.3'
+         - --vit_cond_dropout_prob
+         - '0.0'
+         - --ema
+         - '0.993'
+         - --checkpoint_dir
+         - /dev/shm/models/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins
+         - --wandb_project
+         - bagel
+         - --wandb_name
+         - checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins
+         - --wandb_dir
+         - /dev/shm/models/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins
+         - --wandb_offline
+         - 'True'
+         program: /home/clouduser/Code/Github/unified_world_model/train/pretrain_unified_navit.py
+         code_path: train/pretrain_unified_navit.py
+         code_path_local: train/pretrain_unified_navit.py
+         git:
+           remote_url: https://github.com/para-lost/unified_world_model
+           commit: 45495bf06d28509bc54cbbda532f4b97404a7d66
+         root: /dev/shm/models/checkpoints_vlm_gym_counting_mark_all_one_image_lr2e_5_ce_ins
+         host: junyizhang-launch-new-225900672-1-0
+         executable: /opt/conda/bin/python3.11
+         cpu_count: 48
+         cpu_count_logical: 96
+         gpu_type: NVIDIA A100-SXM4-80GB
+         gpu_count: 8
+         disk:
+           /:
+             total: '1052461830144'
+             used: '354507616256'
+         memory:
+           total: '1437332611072'
+         gpu_nvidia:
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-27013fed-9784-d445-a1eb-01629cf403cc
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-c4922cf6-bc87-9458-c12f-23210cb43686
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-1af9405a-c062-486e-383f-7ea6c6ef5158
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-793b7211-7436-7429-8bd7-cc05be70cc75
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-5eb44009-8d7d-911d-0730-f219cb50498c
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-62c85054-47c8-b915-18e9-e4433fc0f9bb
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-c3b59f2c-b6b6-7730-54ff-8cf5fee4ea9c
+         - name: NVIDIA A100-SXM4-80GB
+           memory_total: '85899345920'
+           cuda_cores: 6912
+           architecture: Ampere
+           uuid: GPU-e988baaf-6bc5-3bb9-91fb-ab2cb214233d
+         cuda_version: '12.2'
+         writer_id: qivwap4wkbv1rrpyy3tc8ok85jcfa4gf
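
The `args` lists of the two runs can be compared mechanically. The sketch below assumes every flag is followed by exactly one value, which holds for both lists in this commit. `args_to_dict` and `diff_args` are hypothetical helpers, and the two lists are abbreviated to a handful of flags copied from the configs.

```python
def args_to_dict(args):
    """Pair each --flag with the value that follows it (assumes flag/value pairs)."""
    pairs = {}
    items = iter(args)
    for flag in items:
        pairs[flag.lstrip("-")] = next(items)
    return pairs


def diff_args(first, second):
    """Return {flag: (value_in_first, value_in_second)} for flags whose values differ."""
    a, b = args_to_dict(first), args_to_dict(second)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


# Abbreviated arg lists from the 04:19:58 run and the 20:56:40 run.
run_a = ["--lr", "2e-5", "--total_steps", "5000",
         "--expected_num_tokens", "20000",
         "--max_num_tokens", "20000",
         "--max_num_tokens_per_sample", "20000"]
run_b = ["--lr", "2e-5", "--total_steps", "5000",
         "--expected_num_tokens", "30000",
         "--max_num_tokens", "30000",
         "--max_num_tokens_per_sample", "30000"]

for flag, (old, new) in diff_args(run_a, run_b).items():
    print(f"{flag}: {old} -> {new}")
```

For these abbreviated lists only the three token-budget flags are reported, each moving from 20000 to 30000; `--lr` and `--total_steps` are unchanged.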