zelinzang committed on
Commit
718f0f0
·
verified ·
1 Parent(s): 8ea60ae

Upload logs/limb_sweep_i10_bs1000_ex0.5_nu0.3_gpu6.log

logs/limb_sweep_i10_bs1000_ex0.5_nu0.3_gpu6.log ADDED
@@ -0,0 +1,205 @@
+ COMMAND: CUDA_VISIBLE_DEVICES=6 WANDB_MODE=offline python main.py fit -c conf/difftree/C_Limb_1gpu.yaml --data.K=10 --data.batch_size=1000 --model.ec_ce_weight=0.5 --model.exaggeration_lat=0.5 --model.nu_lat=0.3 --model.weightrout=0.5 --trainer.logger.init_args.name=limb_sweep_i10_bs1000_ex0.5_nu0.3_gpu6 --trainer.enable_progress_bar=False
+ Seed set to 42
+ GPU available: True (cuda), used: True
+ TPU available: False, using: 0 TPU cores
+ HPU available: False, using: 0 HPUs
+ wandb: Tracking run with wandb version 0.20.1
+ wandb: W&B syncing is set to `offline` in this directory. Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
+ LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [6]
+
+   | Name                | Type               | Params | Mode
+ -------------------------------------------------------------------
+ 0 | enc                 | TransformerEncoder | 1.6 M  | train
+ 1 | UNet_model          | AE_layer2          | 64.7 M | train
+ 2 | UNet_ema            | AE_layer2          | 64.7 M | train
+ 3 | tree_node_embedding | ModuleList         | 4.1 K  | train
+ 4 | vis                 | Sequential         | 258 K  | train
+ 5 | diffusion           | GaussianDiffusion  | 0      | train
+   | other params        | n/a                | 4      | n/a
+ -------------------------------------------------------------------
+ 131 M Trainable params
+ 0 Non-trainable params
+ 131 M Total params
+ 525.041 Total estimated model params size (MB)
+ 121 Modules in train mode
+ 0 Modules in eval mode
+ Using fully connected network
+ data.shape (66633, 500)
+ label (66633,)
+ load index trainval_index_train_0.8_0.1_Limb.npy trainval_index_val_0.8_0.1_Limb.npy trainval_index_test_0.8_0.1_Limb.npy from data/trainval_index_train_0.8_0.1_Limb.npy
+ train_data.shape (53306, 500) train_label.shape (53306,) val_data.shape (6663, 500) val_label.shape (6663,) test_data.shape (6664, 500) test_label.shape (6664,)
+ train_val train
+ load data from save_near_index/data_nameLimbK10uselabelFalsepcadim64train_val0.8split_ratio0.1.pkl
+ data.shape (66633, 500)
+ label (66633,)
+ load index trainval_index_train_0.8_0.1_Limb.npy trainval_index_val_0.8_0.1_Limb.npy trainval_index_test_0.8_0.1_Limb.npy from data/trainval_index_train_0.8_0.1_Limb.npy
+ train_data.shape (53306, 500) train_label.shape (53306,) val_data.shape (6663, 500) val_label.shape (6663,) test_data.shape (6664, 500) test_label.shape (6664,)
+ train_val train
+ load data from save_near_index/data_nameLimbK10uselabelFalsepcadim64train_val0.8split_ratio0.1.pkl
+ data.shape (66633, 500)
+ label (66633,)
+ load index trainval_index_train_0.8_0.1_Limb.npy trainval_index_val_0.8_0.1_Limb.npy trainval_index_test_0.8_0.1_Limb.npy from data/trainval_index_train_0.8_0.1_Limb.npy
+ train_data.shape (53306, 500) train_label.shape (53306,) val_data.shape (6663, 500) val_label.shape (6663,) test_data.shape (6664, 500) test_label.shape (6664,)
+ train_val val
+ load data from save_near_index/data_nameLimbK10uselabelFalsepcadim64train_val0.8split_ratio0.1.pkl
+ data.shape (66633, 500)
+ label (66633,)
+ load index trainval_index_train_0.8_0.1_Limb.npy trainval_index_val_0.8_0.1_Limb.npy trainval_index_test_0.8_0.1_Limb.npy from data/trainval_index_train_0.8_0.1_Limb.npy
+ train_data.shape (53306, 500) train_label.shape (53306,) val_data.shape (6663, 500) val_label.shape (6663,) test_data.shape (6664, 500) test_label.shape (6664,)
+ train_val test
+ load data from save_near_index/data_nameLimbK10uselabelFalsepcadim64train_val0.8split_ratio0.1.pkl
+ self.training_str step1, epoch 0
+ self.training_str step1, epoch 19
+ self.training_str step1, epoch 39
+ /opt/miniforge3/envs/benchmark/lib/python3.9/site-packages/torch/nn/modules/instancenorm.py:80: UserWarning: input's size at dim=0 does not match num_features. You can silence this warning by not passing in num_features, which is not used because affine=False
+ warnings.warn(f"input's size at dim={feature_dim} does not match num_features. "
+ self.training_str step2_s, epoch 59
+ leaf L2/7 has 1068 samples
+ leaf L3/13 has 508 samples
+ leaf L3/12 has 1111 samples
+ leaf L1/2 has 1495 samples
+ leaf L2/3 has 1446 samples
+ leaf L4/11 has 772 samples
+ leaf L4/10 has 809 samples
+ leaf L3/4 has 1057 samples
+ leaf L2/1 has 675 samples
+ leaf L2/0 has 1059 samples
+
  0%| | 0/100 [00:00<?, ?it/s]
  57%|█████▋ | 57/100 [00:00<00:00, 567.84it/s]
+ /opt/miniforge3/envs/benchmark/lib/python3.9/site-packages/torch/nn/modules/instancenorm.py:80: UserWarning: input's size at dim=0 does not match num_features. You can silence this warning by not passing in num_features, which is not used because affine=False
+ warnings.warn(f"input's size at dim={feature_dim} does not match num_features. "
+ self.training_str step2_s, epoch 79
+ leaf L2/7 has 1048 samples
+ leaf L2/6 has 1675 samples
+ leaf L1/2 has 1522 samples
+ leaf L2/3 has 1410 samples
+ leaf L4/11 has 781 samples
+ leaf L4/10 has 765 samples
+ leaf L3/4 has 1052 samples
+ leaf L2/1 has 638 samples
+ leaf L3/1 has 561 samples
+ leaf L3/0 has 548 samples
+
  0%| | 0/100 [00:00<?, ?it/s]
  51%|█████ | 51/100 [00:00<00:00, 509.10it/s]
+ /opt/miniforge3/envs/benchmark/lib/python3.9/site-packages/torch/nn/modules/instancenorm.py:80: UserWarning: input's size at dim=0 does not match num_features. You can silence this warning by not passing in num_features, which is not used because affine=False
+ warnings.warn(f"input's size at dim={feature_dim} does not match num_features. "
+ self.training_str step2_s, epoch 99
+ leaf L2/7 has 1077 samples
+ leaf L3/13 has 493 samples
+ leaf L3/12 has 1119 samples
+ leaf L1/2 has 1492 samples
+ leaf L2/3 has 1516 samples
+ leaf L4/11 has 787 samples
+ leaf L4/10 has 787 samples
+ leaf L3/4 has 986 samples
+ leaf L2/1 has 700 samples
+ leaf L2/0 has 1043 samples
+
  0%| | 0/100 [00:00<?, ?it/s]
  91%|█████████ | 91/100 [00:00<00:00, 909.50it/s]
+ /opt/miniforge3/envs/benchmark/lib/python3.9/site-packages/torch/nn/modules/instancenorm.py:80: UserWarning: input's size at dim=0 does not match num_features. You can silence this warning by not passing in num_features, which is not used because affine=False
+ warnings.warn(f"input's size at dim={feature_dim} does not match num_features. "
+ self.training_str step2_r, epoch 119
+ leaf L2/7 has 1040 samples
+ leaf L3/13 has 524 samples
+ leaf L3/12 has 1155 samples
+ leaf L1/2 has 1507 samples
+ leaf L2/3 has 1468 samples
+ leaf L4/11 has 778 samples
+ leaf L4/10 has 765 samples
+ leaf L3/4 has 1022 samples
+ leaf L2/1 has 718 samples
+ leaf L2/0 has 1023 samples
+
  0%| | 0/100 [00:00<?, ?it/s]
  94%|█████████▍| 94/100 [00:00<00:00, 935.86it/s]
+ /opt/miniforge3/envs/benchmark/lib/python3.9/site-packages/torch/nn/modules/instancenorm.py:80: UserWarning: input's size at dim=0 does not match num_features. You can silence this warning by not passing in num_features, which is not used because affine=False
+ warnings.warn(f"input's size at dim={feature_dim} does not match num_features. "
+ self.training_str step2_r, epoch 139
+ leaf L2/7 has 996 samples
+ leaf L3/13 has 510 samples
+ leaf L3/12 has 1161 samples
+ leaf L1/2 has 1456 samples
+ leaf L2/3 has 1451 samples
+ leaf L4/11 has 763 samples
+ leaf L4/10 has 798 samples
+ leaf L3/4 has 1036 samples
+ leaf L2/1 has 730 samples
+ leaf L2/0 has 1099 samples
+
  0%| | 0/100 [00:00<?, ?it/s]
+ /opt/miniforge3/envs/benchmark/lib/python3.9/site-packages/torch/nn/modules/instancenorm.py:80: UserWarning: input's size at dim=0 does not match num_features. You can silence this warning by not passing in num_features, which is not used because affine=False
+ warnings.warn(f"input's size at dim={feature_dim} does not match num_features. "
+ /root/project/hdtree/code_HDTree_review/call_backs/util.py:363: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`). Consider using `matplotlib.pyplot.close()`.
+ fig = plt.figure(figsize=(10, 10))
+ self.training_str step2_r, epoch 159
+ /root/project/hdtree/code_HDTree_review/call_backs/util.py:747: RuntimeWarning: More than 20 figures have been opened. Figures created through the pyplot interface (`matplotlib.pyplot.figure`) are retained until explicitly closed and may consume too much memory. (To control this warning, see the rcParam `figure.max_open_warning`). Consider using `matplotlib.pyplot.close()`.
+ fig, ax = plt.subplots(figsize=(12, 8)) # Adjust according to your needs
+ leaf L2/7 has 1045 samples
+ leaf L3/13 has 533 samples
+ leaf L3/12 has 1053 samples
+ leaf L1/2 has 1484 samples
+ leaf L2/3 has 1436 samples
+ leaf L4/11 has 830 samples
+ leaf L4/10 has 755 samples
+ leaf L3/4 has 1052 samples
+ leaf L2/1 has 736 samples
+ leaf L2/0 has 1076 samples
+
  0%| | 0/100 [00:00<?, ?it/s]
  48%|████▊ | 48/100 [00:00<00:00, 353.82it/s]
  84%|████████▍ | 84/100 [00:00<00:00, 102.94it/s]
+ /opt/miniforge3/envs/benchmark/lib/python3.9/site-packages/torch/nn/modules/instancenorm.py:80: UserWarning: input's size at dim=0 does not match num_features. You can silence this warning by not passing in num_features, which is not used because affine=False
+ warnings.warn(f"input's size at dim={feature_dim} does not match num_features. "
+ self.training_str step2_r, epoch 179
+ leaf L2/7 has 998 samples
+ leaf L3/13 has 523 samples
+ leaf L3/12 has 1114 samples
+ leaf L1/2 has 1554 samples
+ leaf L2/3 has 1435 samples
+ leaf L4/11 has 802 samples
+ leaf L4/10 has 802 samples
+ leaf L3/4 has 997 samples
+ leaf L2/1 has 665 samples
+ leaf L2/0 has 1110 samples
+
  0%| | 0/100 [00:00<?, ?it/s]
  80%|████████ | 80/100 [00:00<00:00, 790.24it/s]
+ `Trainer.fit` stopped: `max_epochs=200` reached.
+ wandb:
+ wandb:
+ wandb: Run history:
+ wandb: epoch ▁▁▁▁▁▂▂▂▃▃▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▆▆▆▆▇▇▇▇▇▇▇▇▇▇█
+ wandb: loss_all █▅▅▅▅▅▅▅▅▅▅▅▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
+ wandb: loss_diff ▅▃▁█████▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
+ wandb: loss_emb █▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
+ wandb: loss_lat █▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁
+ wandb: loss_rute ▇▃▂▄▆▄▄▁▁▇▆▁▃▃▄▄▃▂▃▂▃▃▁▅▂▂▄▂▁▅▂▁▁█▁▃█▂▅▂
+ wandb: lr ▆██████▇▇▇▆▆▆▆▆▅▅▄▄▄▄▄▄▃▃▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁
+ wandb: orthogonal_loss ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
+ wandb: rout/svc_acc ▁▁▁▁▁▁▁
+ wandb: train_svc ▁████████▇█
+ wandb: train_svc_rbf ▁██████████
+ wandb: trainer/global_step ▁▂▂▂▂▃▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇▇▇▇███
+ wandb: tree/ari_0 ▇▁▆▆▇▆█
+ wandb: tree/cluster_acc_0 ▇▁▇▇▇▆█
+ wandb: tree/dp_0 ▆▁▅▆▇▇█
+ wandb: tree/log_likelihood_0 ▁▁▁▁▁▁▁
+ wandb: tree/lp_0 ▄▁▅█▆▅█
+ wandb: tree/nmi_0 ▆▁▆▇▆▆█
+ wandb: tree/reconstruction_loss_0 ▁▁▁▁▁▁▁
+ wandb:
+ wandb: Run summary:
+ wandb: epoch 199
+ wandb: loss_all 2e-05
+ wandb: loss_diff 0.24769
+ wandb: loss_emb 2.01691
+ wandb: loss_lat 2.00958
+ wandb: loss_rute 0.28223
+ wandb: lr 0.0
+ wandb: orthogonal_loss 0
+ wandb: rout/svc_acc 1
+ wandb: train_svc 0.53921
+ wandb: train_svc_rbf 0.71421
+ wandb: trainer/global_step 532999
+ wandb: tree/ari_0 0.3861
+ wandb: tree/cluster_acc_0 0.5286
+ wandb: tree/dp_0 0.41029
+ wandb: tree/log_likelihood_0 0
+ wandb: tree/lp_0 0.5837
+ wandb: tree/nmi_0 0.49042
+ wandb: tree/reconstruction_loss_0 0
+ wandb:
+ wandb: You can sync this run to the cloud by running:
+ wandb: wandb sync wandb/wandb/offline-run-20260518_092347-segtfz2w
+ wandb: Find logs at: wandb/wandb/offline-run-20260518_092347-segtfz2w/logs
+ self.training_str step2_r, epoch 199
+ ==== END 2026-05-18_12:11:30 limb_sweep_i10_bs1000_ex0.5_nu0.3_gpu6 status=0 worker=extra_g6 ====