speedinghzl committed
Commit b270c94 · verified
1 Parent(s): 93f8673

Upload folder using huggingface_hub

clipcls_vit_l16_s512m_bs16k_mix0_0/checkpoints/epoch_4.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ead037a7f61443a64d269f53ccfc379473b3ad88514c586de16da34d8e3b8bc7
+ size 5741584280
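Note: the three added lines are a Git LFS pointer, not the checkpoint itself. They record the pointer spec version, the SHA-256 of the real file, and its size in bytes (5741584280, about 5.7 GB). A minimal sketch of how such a pointer could be parsed and checked against a downloaded file is shown here; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or huggingface_hub.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:ead037a7f61443a64d269f53ccfc379473b3ad88514c586de16da34d8e3b8bc7
size 5741584280
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    if path.stat().st_size != pointer["size"]:
        return False
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in chunks so a multi-gigabyte checkpoint never sits in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Checking the size first is a cheap way to reject a truncated download before hashing the whole file.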
clipcls_vit_l16_s512m_bs16k_mix0_0/out.log ADDED
@@ -0,0 +1,504 @@
+ 2025-05-06,21:30:00 | INFO | No latest resume checkpoint found in ./logs-lr1e-3-datacomp/clipcls_vit_l16_s512m_bs16k_mix0_0/checkpoints.
+ 2025-05-06,21:30:02 | INFO | Running in distributed mode with multiple processes. Device: cuda:0.Process (global: 0, local 0), total 32.
+ 2025-05-06,21:30:02 | INFO | Loaded CLIPCLS-ViT-L-16 model config.
+ 2025-05-06,21:30:07 | INFO | Model:
+ 2025-05-06,21:30:07 | INFO | CLIPCLS(
+   (visual): VisionTransformer(
+     (conv1): Conv2d(3, 1024, kernel_size=(16, 16), stride=(16, 16), bias=False)
+     (patch_dropout): Identity()
+     (ln_pre): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+     (transformer): Transformer(
+       (resblocks): ModuleList(
+         (0-23): 24 x ResidualAttentionBlock(
+           (ln_1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+           (attn): MultiheadAttention(
+             (out_proj): NonDynamicallyQuantizableLinear(in_features=1024, out_features=1024, bias=True)
+           )
+           (ls_1): Identity()
+           (ln_2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+           (mlp): Sequential(
+             (c_fc): Linear(in_features=1024, out_features=4096, bias=True)
+             (gelu): GELU(approximate='none')
+             (c_proj): Linear(in_features=4096, out_features=1024, bias=True)
+           )
+           (ls_2): Identity()
+         )
+       )
+     )
+     (ln_post): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+   )
+   (text): TextTransformer(
+     (token_embedding): Embedding(49408, 768)
+     (transformer): Transformer(
+       (resblocks): ModuleList(
+         (0-11): 12 x ResidualAttentionBlock(
+           (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+           (attn): MultiheadAttention(
+             (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
+           )
+           (ls_1): Identity()
+           (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+           (mlp): Sequential(
+             (c_fc): Linear(in_features=768, out_features=3072, bias=True)
+             (gelu): GELU(approximate='none')
+             (c_proj): Linear(in_features=3072, out_features=768, bias=True)
+           )
+           (ls_2): Identity()
+         )
+       )
+     )
+     (ln_final): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
+   )
+   (text_decoder): MixClsHead(
+     (mlps): ModuleList()
+     (ln_mlp): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
+     (text_projection): Linear(in_features=1024, out_features=49408, bias=True)
+   )
+ )
+ 2025-05-06,21:30:07 | INFO | Params:
+ 2025-05-06,21:30:07 | INFO | NDR_patch_size: 16
+ 2025-05-06,21:30:07 | INFO | accum_freq: 1
+ 2025-05-06,21:30:07 | INFO | aug_cfg: {}
+ 2025-05-06,21:30:07 | INFO | batch_size: 512
+ 2025-05-06,21:30:07 | INFO | beta1: 0.9
+ 2025-05-06,21:30:07 | INFO | beta2: 0.98
+ 2025-05-06,21:30:07 | INFO | checkpoint_path: ./logs-lr1e-3-datacomp/clipcls_vit_l16_s512m_bs16k_mix0_0/checkpoints
+ 2025-05-06,21:30:07 | INFO | coca_caption_loss_weight: 2.0
+ 2025-05-06,21:30:07 | INFO | coca_contrastive_loss_weight: 1.0
+ 2025-05-06,21:30:07 | INFO | copy_codebase: False
+ 2025-05-06,21:30:07 | INFO | csv_caption_key: title
+ 2025-05-06,21:30:07 | INFO | csv_img_key: filepath
+ 2025-05-06,21:30:07 | INFO | csv_separator:
+ 2025-05-06,21:30:07 | INFO | dataset_resampled: False
+ 2025-05-06,21:30:07 | INFO | dataset_type: webdataset
+ 2025-05-06,21:30:07 | INFO | ddp_static_graph: True
+ 2025-05-06,21:30:07 | INFO | debug: False
+ 2025-05-06,21:30:07 | INFO | delete_prev_step_ckpt: True
+ 2025-05-06,21:30:07 | INFO | delete_previous_checkpoint: False
+ 2025-05-06,21:30:07 | INFO | device: cuda:0
+ 2025-05-06,21:30:07 | INFO | dist_backend: nccl
+ 2025-05-06,21:30:07 | INFO | dist_url: env://
+ 2025-05-06,21:30:07 | INFO | distill: False
+ 2025-05-06,21:30:07 | INFO | distill_model: None
+ 2025-05-06,21:30:07 | INFO | distill_pretrained: None
+ 2025-05-06,21:30:07 | INFO | distributed: True
+ 2025-05-06,21:30:07 | INFO | epochs: 4
+ 2025-05-06,21:30:07 | INFO | epochs_cooldown: None
+ 2025-05-06,21:30:07 | INFO | eps: 1e-06
+ 2025-05-06,21:30:07 | INFO | force_custom_text: False
+ 2025-05-06,21:30:07 | INFO | force_image_size: 224
+ 2025-05-06,21:30:07 | INFO | force_patch_dropout: None
+ 2025-05-06,21:30:07 | INFO | force_quick_gelu: False
+ 2025-05-06,21:30:07 | INFO | gather_with_grad: True
+ 2025-05-06,21:30:07 | INFO | global_batch_size: 16384
+ 2025-05-06,21:30:07 | INFO | grad_checkpointing: True
+ 2025-05-06,21:30:07 | INFO | grad_clip_norm: None
+ 2025-05-06,21:30:07 | INFO | horovod: False
+ 2025-05-06,21:30:07 | INFO | image_interpolation: None
+ 2025-05-06,21:30:07 | INFO | image_mean: None
+ 2025-05-06,21:30:07 | INFO | image_resize_mode: None
+ 2025-05-06,21:30:07 | INFO | image_std: None
+ 2025-05-06,21:30:07 | INFO | imagenet_v2: None
+ 2025-05-06,21:30:07 | INFO | imagenet_val: /mnt/bn/zilongdata-hl/dataset/imagenet/val
+ 2025-05-06,21:30:07 | INFO | is_cls_token: True
+ 2025-05-06,21:30:07 | INFO | local_loss: True
+ 2025-05-06,21:30:07 | INFO | local_rank: 0
+ 2025-05-06,21:30:07 | INFO | lock_image: False
+ 2025-05-06,21:30:07 | INFO | lock_image_freeze_bn_stats: False
+ 2025-05-06,21:30:07 | INFO | lock_image_unlocked_groups: 0
+ 2025-05-06,21:30:07 | INFO | lock_text: False
+ 2025-05-06,21:30:07 | INFO | lock_text_freeze_layer_norm: False
+ 2025-05-06,21:30:07 | INFO | lock_text_unlocked_layers: 0
+ 2025-05-06,21:30:07 | INFO | log_every_n_steps: 128
+ 2025-05-06,21:30:07 | INFO | log_level: 20
+ 2025-05-06,21:30:07 | INFO | log_local: False
+ 2025-05-06,21:30:07 | INFO | log_path: ./logs-lr1e-3-datacomp/clipcls_vit_l16_s512m_bs16k_mix0_0/out.log
+ 2025-05-06,21:30:07 | INFO | logs: ./logs-lr1e-3-datacomp
+ 2025-05-06,21:30:07 | INFO | lr: 0.001
+ 2025-05-06,21:30:07 | INFO | lr_cooldown_end: 0.0
+ 2025-05-06,21:30:07 | INFO | lr_cooldown_power: 1.0
+ 2025-05-06,21:30:07 | INFO | lr_scheduler: cosine
+ 2025-05-06,21:30:07 | INFO | max_seq_len: 15000
+ 2025-05-06,21:30:07 | INFO | model: CLIPCLS-ViT-L-16
+ 2025-05-06,21:30:07 | INFO | name: clipcls_vit_l16_s512m_bs16k_mix0_0
+ 2025-05-06,21:30:07 | INFO | native_dynamic_resolution: False
+ 2025-05-06,21:30:07 | INFO | no_set_device_rank: False
+ 2025-05-06,21:30:07 | INFO | only_packing: False
+ 2025-05-06,21:30:07 | INFO | precision: amp
+ 2025-05-06,21:30:07 | INFO | pretrained:
+ 2025-05-06,21:30:07 | INFO | pretrained_image:
+ 2025-05-06,21:30:07 | INFO | pretrained_text:
+ 2025-05-06,21:30:07 | INFO | rank: 0
+ 2025-05-06,21:30:07 | INFO | remote_sync: None
+ 2025-05-06,21:30:07 | INFO | remote_sync_frequency: 300
+ 2025-05-06,21:30:07 | INFO | remote_sync_protocol: s3
+ 2025-05-06,21:30:07 | INFO | report_to: wandb
+ 2025-05-06,21:30:07 | INFO | resume: None
+ 2025-05-06,21:30:07 | INFO | rope_attn_num_heads: 12
+ 2025-05-06,21:30:07 | INFO | rope_model_width: 768
+ 2025-05-06,21:30:07 | INFO | save_every_n_steps: 6104
+ 2025-05-06,21:30:07 | INFO | save_frequency: 1
+ 2025-05-06,21:30:07 | INFO | save_most_recent: False
+ 2025-05-06,21:30:07 | INFO | seed: 0
+ 2025-05-06,21:30:07 | INFO | siglip: False
+ 2025-05-06,21:30:07 | INFO | skip_scheduler: False
+ 2025-05-06,21:30:07 | INFO | tensorboard: False
+ 2025-05-06,21:30:07 | INFO | tensorboard_path:
+ 2025-05-06,21:30:07 | INFO | torchcompile: False
+ 2025-05-06,21:30:07 | INFO | torchscript: False
+ 2025-05-06,21:30:07 | INFO | trace: False
+ 2025-05-06,21:30:07 | INFO | train_data: /mnt/bn/zilongdata-hl/dataset/Recap-DataComp-1B-Dataset/{000000..140146}.tar
+ 2025-05-06,21:30:07 | INFO | train_data_upsampling_factors: None
+ 2025-05-06,21:30:07 | INFO | train_num_samples: 128000000
+ 2025-05-06,21:30:07 | INFO | use_bn_sync: False
+ 2025-05-06,21:30:07 | INFO | use_bnb_linear: None
+ 2025-05-06,21:30:07 | INFO | val_data: None
+ 2025-05-06,21:30:07 | INFO | val_frequency: 1
+ 2025-05-06,21:30:07 | INFO | val_num_samples: None
+ 2025-05-06,21:30:07 | INFO | val_steps: 0
+ 2025-05-06,21:30:07 | INFO | wandb: True
+ 2025-05-06,21:30:07 | INFO | wandb_notes:
+ 2025-05-06,21:30:07 | INFO | wandb_project_name: cls-clip-NDR
+ 2025-05-06,21:30:07 | INFO | warmup: 500
+ 2025-05-06,21:30:07 | INFO | wd: 0.2
+ 2025-05-06,21:30:07 | INFO | workers: 1
+ 2025-05-06,21:30:07 | INFO | world_size: 32
+ 2025-05-06,21:30:07 | INFO | zeroshot_frequency: 4
+ 2025-05-06,21:30:07 | INFO | zeroshot_steps: 0
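Note: several of the logged parameters are tied together by simple arithmetic. The sketch below assumes the global batch is the per-GPU `batch_size` times `world_size` (with `accum_freq` equal to 1 it does not change the result) and that an epoch is rounded up to a whole number of global batches. Under those assumptions the numbers reproduce the `global_batch_size` of 16384 and the per-epoch sample count of 128008192 that appears in the training lines. The variable names mirror the logged keys; the derived names are illustrative only.

```python
import math

# Values copied from the parameter dump above.
batch_size = 512                 # per-GPU batch size
world_size = 32                  # number of processes (one per GPU)
train_num_samples = 128_000_000  # nominal samples per epoch
epochs = 4
log_every_n_steps = 128

# Assumed relationships (not taken from the training code itself).
global_batch_size = batch_size * world_size
steps_per_epoch = math.ceil(train_num_samples / global_batch_size)
samples_per_epoch = steps_per_epoch * global_batch_size
total_steps = steps_per_epoch * epochs
samples_between_logs = log_every_n_steps * global_batch_size

print("global_batch_size   ", global_batch_size)
print("steps_per_epoch     ", steps_per_epoch)
print("samples_per_epoch   ", samples_per_epoch)
print("total_steps         ", total_steps)
print("samples_between_logs", samples_between_logs)
```

Four epochs of roughly 128M samples come to about 512M samples seen, which appears to be what the `s512m` and `bs16k` parts of the run name refer to.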
+ 2025-05-06,21:30:25 | INFO | Start epoch 0
+ 2025-05-06,21:30:38 | INFO | Train Epoch: 0 [ 16384/128008192 (0%)] Data (t): 4.617 Batch (t): 13.540, 1210.07/s, 37.8148/s/gpu LR: 0.000002 Logit Scale: 14.286 Class_loss: 11.283 (11.283) Contrastive_loss: 9.7358 (9.7358) Loss: 21.018 (21.018)
+ 2025-05-06,21:44:19 | INFO | Train Epoch: 0 [ 2113536/128008192 (2%)] Data (t): 0.168 Batch (t): 6.415, 2563.25/s, 80.1017/s/gpu LR: 0.000258 Logit Scale: 14.284 Class_loss: 7.6540 (9.4683) Contrastive_loss: 7.8739 (8.8048) Loss: 15.528 (18.273)
+ 2025-05-06,21:57:59 | INFO | Train Epoch: 0 [ 4210688/128008192 (3%)] Data (t): 0.169 Batch (t): 6.404, 2559.81/s, 79.9940/s/gpu LR: 0.000514 Logit Scale: 14.368 Class_loss: 7.4017 (8.7795) Contrastive_loss: 7.3922 (8.3339) Loss: 14.794 (17.113)
+ 2025-05-06,22:01:45 | WARNING | Handling webdataset error (OSError('image file is truncated (25 bytes not processed)')). Ignoring.
+ 2025-05-06,22:11:38 | INFO | Train Epoch: 0 [ 6307840/128008192 (5%)] Data (t): 0.170 Batch (t): 6.402, 2557.76/s, 79.9301/s/gpu LR: 0.000770 Logit Scale: 14.702 Class_loss: 7.3171 (8.4139) Contrastive_loss: 7.1230 (8.0312) Loss: 14.440 (16.445)
+ 2025-05-06,22:25:18 | INFO | Train Epoch: 0 [ 8404992/128008192 (7%)] Data (t): 0.172 Batch (t): 6.404, 2561.98/s, 80.0620/s/gpu LR: 0.001000 Logit Scale: 15.531 Class_loss: 7.3381 (8.1987) Contrastive_loss: 6.6865 (7.7623) Loss: 14.025 (15.961)
+ 2025-05-06,22:38:59 | INFO | Train Epoch: 0 [ 10502144/128008192 (8%)] Data (t): 0.178 Batch (t): 6.412, 2559.02/s, 79.9695/s/gpu LR: 0.001000 Logit Scale: 17.445 Class_loss: 7.0915 (8.0142) Contrastive_loss: 6.0500 (7.4769) Loss: 13.142 (15.491)
+ 2025-05-06,22:43:54 | WARNING | Handling webdataset error (OSError('image file is truncated (104 bytes not processed)')). Ignoring.
+ 2025-05-06,22:44:21 | WARNING | Handling webdataset error (OSError('image file is truncated (21 bytes not processed)')). Ignoring.
+ 2025-05-06,22:52:39 | INFO | Train Epoch: 0 [ 12599296/128008192 (10%)] Data (t): 0.173 Batch (t): 6.408, 2558.86/s, 79.9644/s/gpu LR: 0.001000 Logit Scale: 18.763 Class_loss: 7.0032 (7.8698) Contrastive_loss: 5.4699 (7.1902) Loss: 12.473 (15.060)
+ 2025-05-06,23:06:19 | INFO | Train Epoch: 0 [ 14696448/128008192 (11%)] Data (t): 0.173 Batch (t): 6.409, 2555.35/s, 79.8547/s/gpu LR: 0.001000 Logit Scale: 20.932 Class_loss: 6.8086 (7.7371) Contrastive_loss: 4.8668 (6.8998) Loss: 11.675 (14.637)
+ 2025-05-06,23:19:47 | WARNING | Handling webdataset error (OSError('image file is truncated (23 bytes not processed)')). Ignoring.
+ 2025-05-06,23:20:00 | INFO | Train Epoch: 0 [ 16793600/128008192 (13%)] Data (t): 0.174 Batch (t): 6.410, 2560.88/s, 80.0276/s/gpu LR: 0.000999 Logit Scale: 23.645 Class_loss: 6.7696 (7.6296) Contrastive_loss: 4.3629 (6.6179) Loss: 11.132 (14.247)
+ 2025-05-06,23:33:40 | INFO | Train Epoch: 0 [ 18890752/128008192 (15%)] Data (t): 0.174 Batch (t): 6.411, 2556.88/s, 79.9025/s/gpu LR: 0.000999 Logit Scale: 26.511 Class_loss: 6.5251 (7.5192) Contrastive_loss: 3.6718 (6.3233) Loss: 10.197 (13.842)
+ 2025-05-06,23:47:21 | INFO | Train Epoch: 0 [ 20987904/128008192 (16%)] Data (t): 0.174 Batch (t): 6.410, 2558.77/s, 79.9614/s/gpu LR: 0.000998 Logit Scale: 28.397 Class_loss: 6.5955 (7.4352) Contrastive_loss: 3.4982 (6.0664) Loss: 10.094 (13.502)
+ 2025-05-07,00:01:02 | INFO | Train Epoch: 0 [ 23085056/128008192 (18%)] Data (t): 0.174 Batch (t): 6.415, 2550.42/s, 79.7006/s/gpu LR: 0.000998 Logit Scale: 31.633 Class_loss: 6.5085 (7.3580) Contrastive_loss: 2.8319 (5.7969) Loss: 9.3405 (13.155)
+ 2025-05-07,00:14:43 | INFO | Train Epoch: 0 [ 25182208/128008192 (20%)] Data (t): 0.176 Batch (t): 6.414, 2545.04/s, 79.5326/s/gpu LR: 0.000997 Logit Scale: 33.228 Class_loss: 6.4519 (7.2883) Contrastive_loss: 3.2656 (5.6022) Loss: 9.7175 (12.890)
+ 2025-05-07,00:28:24 | INFO | Train Epoch: 0 [ 27279360/128008192 (21%)] Data (t): 0.175 Batch (t): 6.413, 2554.77/s, 79.8367/s/gpu LR: 0.000996 Logit Scale: 36.421 Class_loss: 6.2528 (7.2143) Contrastive_loss: 2.2822 (5.3650) Loss: 8.5350 (12.579)
+ 2025-05-07,00:42:05 | INFO | Train Epoch: 0 [ 29376512/128008192 (23%)] Data (t): 0.175 Batch (t): 6.417, 2559.14/s, 79.9732/s/gpu LR: 0.000996 Logit Scale: 39.739 Class_loss: 6.1449 (7.1430) Contrastive_loss: 1.9924 (5.1402) Loss: 8.1373 (12.283)
+ 2025-05-07,00:46:54 | WARNING | Handling webdataset error (OSError('image file is truncated (1 bytes not processed)')). Ignoring.
+ 2025-05-07,00:55:48 | INFO | Train Epoch: 0 [ 31473664/128008192 (25%)] Data (t): 0.174 Batch (t): 6.428, 2548.37/s, 79.6365/s/gpu LR: 0.000995 Logit Scale: 42.149 Class_loss: 6.3067 (7.0907) Contrastive_loss: 2.0258 (4.9455) Loss: 8.3325 (12.036)
+ 2025-05-07,01:09:31 | INFO | Train Epoch: 0 [ 33570816/128008192 (26%)] Data (t): 0.171 Batch (t): 6.427, 2552.84/s, 79.7761/s/gpu LR: 0.000994 Logit Scale: 45.225 Class_loss: 6.0822 (7.0314) Contrastive_loss: 2.3211 (4.7912) Loss: 8.4032 (11.823)
+ 2025-05-07,01:17:41 | WARNING | Handling webdataset error (OSError('image file is truncated (0 bytes not processed)')). Ignoring.
+ 2025-05-07,01:23:13 | INFO | Train Epoch: 0 [ 35667968/128008192 (28%)] Data (t): 0.171 Batch (t): 6.426, 2552.57/s, 79.7679/s/gpu LR: 0.000993 Logit Scale: 45.897 Class_loss: 6.0675 (6.9779) Contrastive_loss: 1.6202 (4.6150) Loss: 7.6877 (11.593)
+ 2025-05-07,01:36:54 | INFO | Train Epoch: 0 [ 37765120/128008192 (30%)] Data (t): 0.171 Batch (t): 6.415, 2557.00/s, 79.9062/s/gpu LR: 0.000992 Logit Scale: 47.862 Class_loss: 5.9346 (6.9230) Contrastive_loss: 1.5128 (4.4517) Loss: 7.4474 (11.375)
+ 2025-05-07,01:38:31 | WARNING | Handling webdataset error (OSError('image file is truncated (18 bytes not processed)')). Ignoring.
+ 2025-05-07,01:50:37 | INFO | Train Epoch: 0 [ 39862272/128008192 (31%)] Data (t): 0.170 Batch (t): 6.426, 2531.06/s, 79.0957/s/gpu LR: 0.000990 Logit Scale: 49.578 Class_loss: 6.0066 (6.8771) Contrastive_loss: 1.5477 (4.3065) Loss: 7.5543 (11.184)
+ 2025-05-07,02:04:25 | INFO | Train Epoch: 0 [ 41959424/128008192 (33%)] Data (t): 0.170 Batch (t): 6.467, 2534.72/s, 79.2101/s/gpu LR: 0.000989 Logit Scale: 51.565 Class_loss: 5.8907 (6.8302) Contrastive_loss: 1.5005 (4.1729) Loss: 7.3912 (11.003)
+ 2025-05-07,02:18:12 | INFO | Train Epoch: 0 [ 44056576/128008192 (34%)] Data (t): 0.168 Batch (t): 6.463, 2536.16/s, 79.2551/s/gpu LR: 0.000988 Logit Scale: 52.893 Class_loss: 5.8330 (6.7848) Contrastive_loss: 1.2683 (4.0409) Loss: 7.1013 (10.826)
+ 2025-05-07,02:32:01 | INFO | Train Epoch: 0 [ 46153728/128008192 (36%)] Data (t): 0.175 Batch (t): 6.475, 2523.44/s, 78.8575/s/gpu LR: 0.000986 Logit Scale: 53.981 Class_loss: 5.8545 (6.7444) Contrastive_loss: 1.2319 (3.9188) Loss: 7.0864 (10.663)
+ 2025-05-07,02:45:49 | INFO | Train Epoch: 0 [ 48250880/128008192 (38%)] Data (t): 0.170 Batch (t): 6.470, 2537.43/s, 79.2948/s/gpu LR: 0.000984 Logit Scale: 55.158 Class_loss: 5.8767 (6.7082) Contrastive_loss: 1.5127 (3.8185) Loss: 7.3894 (10.527)
+ 2025-05-07,02:59:34 | INFO | Train Epoch: 0 [ 50348032/128008192 (39%)] Data (t): 0.169 Batch (t): 6.445, 2553.30/s, 79.7905/s/gpu LR: 0.000983 Logit Scale: 56.333 Class_loss: 5.8127 (6.6724) Contrastive_loss: 1.3460 (3.7196) Loss: 7.1587 (10.392)
+ 2025-05-07,03:13:15 | INFO | Train Epoch: 0 [ 52445184/128008192 (41%)] Data (t): 0.171 Batch (t): 6.412, 2558.80/s, 79.9624/s/gpu LR: 0.000981 Logit Scale: 57.331 Class_loss: 5.8839 (6.6421) Contrastive_loss: 1.1207 (3.6196) Loss: 7.0046 (10.262)
+ 2025-05-07,03:26:55 | INFO | Train Epoch: 0 [ 54542336/128008192 (43%)] Data (t): 0.173 Batch (t): 6.414, 2548.40/s, 79.6375/s/gpu LR: 0.000979 Logit Scale: 58.344 Class_loss: 5.7300 (6.6083) Contrastive_loss: 1.1547 (3.5283) Loss: 6.8847 (10.137)
+ 2025-05-07,03:40:36 | INFO | Train Epoch: 0 [ 56639488/128008192 (44%)] Data (t): 0.175 Batch (t): 6.413, 2549.15/s, 79.6608/s/gpu LR: 0.000977 Logit Scale: 59.108 Class_loss: 5.8010 (6.5795) Contrastive_loss: 0.88206 (3.4338) Loss: 6.6831 (10.013)
+ 2025-05-07,03:54:19 | INFO | Train Epoch: 0 [ 58736640/128008192 (46%)] Data (t): 0.176 Batch (t): 6.424, 2556.82/s, 79.9005/s/gpu LR: 0.000975 Logit Scale: 59.792 Class_loss: 5.6812 (6.5485) Contrastive_loss: 1.0938 (3.3531) Loss: 6.7750 (9.9016)
+ 2025-05-07,03:57:38 | WARNING | Handling webdataset error (OSError('image file is truncated (82 bytes not processed)')). Ignoring.
+ 2025-05-07,04:08:03 | INFO | Train Epoch: 0 [ 60833792/128008192 (48%)] Data (t): 0.171 Batch (t): 6.438, 2540.88/s, 79.4026/s/gpu LR: 0.000973 Logit Scale: 60.547 Class_loss: 5.6634 (6.5190) Contrastive_loss: 1.0677 (3.2770) Loss: 6.7311 (9.7960)
+ 2025-05-07,04:21:48 | INFO | Train Epoch: 0 [ 62930944/128008192 (49%)] Data (t): 0.175 Batch (t): 6.450, 2531.63/s, 79.1136/s/gpu LR: 0.000971 Logit Scale: 61.238 Class_loss: 5.6795 (6.4919) Contrastive_loss: 0.97064 (3.2026) Loss: 6.6502 (9.6945)
+ 2025-05-07,04:35:36 | INFO | Train Epoch: 0 [ 65028096/128008192 (51%)] Data (t): 0.171 Batch (t): 6.466, 2537.37/s, 79.2927/s/gpu LR: 0.000969 Logit Scale: 61.741 Class_loss: 5.5270 (6.4618) Contrastive_loss: 1.1925 (3.1398) Loss: 6.7194 (9.6015)
+ 2025-05-07,04:49:23 | INFO | Train Epoch: 0 [ 67125248/128008192 (52%)] Data (t): 0.171 Batch (t): 6.464, 2530.83/s, 79.0885/s/gpu LR: 0.000967 Logit Scale: 62.384 Class_loss: 5.6778 (6.4380) Contrastive_loss: 1.0487 (3.0764) Loss: 6.7265 (9.5144)
+ 2025-05-07,05:00:36 | WARNING | Handling webdataset error (OSError('image file is truncated (46 bytes not processed)')). Ignoring.
+ 2025-05-07,05:03:10 | INFO | Train Epoch: 0 [ 69222400/128008192 (54%)] Data (t): 0.171 Batch (t): 6.460, 2529.57/s, 79.0492/s/gpu LR: 0.000964 Logit Scale: 63.007 Class_loss: 5.6411 (6.4146) Contrastive_loss: 1.0051 (3.0155) Loss: 6.6462 (9.4300)
+ 2025-05-07,05:12:41 | WARNING | Handling webdataset error (OSError('image file is truncated (31 bytes not processed)')). Ignoring.
+ 2025-05-07,05:16:59 | INFO | Train Epoch: 0 [ 71319552/128008192 (56%)] Data (t): 0.171 Batch (t): 6.471, 2549.71/s, 79.6783/s/gpu LR: 0.000962 Logit Scale: 63.601 Class_loss: 5.5717 (6.3905) Contrastive_loss: 0.94291 (2.9563) Loss: 6.5146 (9.3467)
+ 2025-05-07,05:30:43 | INFO | Train Epoch: 0 [ 73416704/128008192 (57%)] Data (t): 0.170 Batch (t): 6.445, 2529.93/s, 79.0605/s/gpu LR: 0.000959 Logit Scale: 64.087 Class_loss: 5.5575 (6.3673) Contrastive_loss: 0.90484 (2.8993) Loss: 6.4623 (9.2666)
+ 2025-05-07,05:44:35 | INFO | Train Epoch: 0 [ 75513856/128008192 (59%)] Data (t): 0.171 Batch (t): 6.496, 2518.28/s, 78.6962/s/gpu LR: 0.000957 Logit Scale: 64.703 Class_loss: 5.5236 (6.3445) Contrastive_loss: 0.98342 (2.8475) Loss: 6.5071 (9.1920)
+ 2025-05-07,05:56:16 | WARNING | Handling webdataset error (OSError('image file is truncated (64 bytes not processed)')). Ignoring.
+ 2025-05-07,05:58:24 | INFO | Train Epoch: 0 [ 77611008/128008192 (61%)] Data (t): 0.171 Batch (t): 6.474, 2535.13/s, 79.2228/s/gpu LR: 0.000954 Logit Scale: 65.116 Class_loss: 5.5745 (6.3243) Contrastive_loss: 0.86935 (2.7954) Loss: 6.4439 (9.1197)
+ 2025-05-07,06:12:11 | INFO | Train Epoch: 0 [ 79708160/128008192 (62%)] Data (t): 0.171 Batch (t): 6.465, 2542.96/s, 79.4675/s/gpu LR: 0.000951 Logit Scale: 65.517 Class_loss: 5.5001 (6.3031) Contrastive_loss: 0.99385 (2.7492) Loss: 6.4939 (9.0524)
+ 2025-05-07,06:25:56 | INFO | Train Epoch: 0 [ 81805312/128008192 (64%)] Data (t): 0.170 Batch (t): 6.446, 2553.37/s, 79.7929/s/gpu LR: 0.000948 Logit Scale: 65.960 Class_loss: 5.4410 (6.2816) Contrastive_loss: 1.1386 (2.7090) Loss: 6.5796 (8.9906)
+ 2025-05-07,06:39:40 | INFO | Train Epoch: 0 [ 83902464/128008192 (66%)] Data (t): 0.171 Batch (t): 6.432, 2541.27/s, 79.4148/s/gpu LR: 0.000945 Logit Scale: 66.229 Class_loss: 5.6173 (6.2654) Contrastive_loss: 0.92943 (2.6656) Loss: 6.5467 (8.9310)
+ 2025-05-07,06:53:23 | INFO | Train Epoch: 0 [ 85999616/128008192 (67%)] Data (t): 0.170 Batch (t): 6.433, 2553.92/s, 79.8101/s/gpu LR: 0.000942 Logit Scale: 66.642 Class_loss: 5.4679 (6.2464) Contrastive_loss: 0.93466 (2.6244) Loss: 6.4025 (8.8708)
+ 2025-05-07,07:07:06 | INFO | Train Epoch: 0 [ 88096768/128008192 (69%)] Data (t): 0.171 Batch (t): 6.430, 2547.34/s, 79.6044/s/gpu LR: 0.000939 Logit Scale: 66.968 Class_loss: 5.5053 (6.2292) Contrastive_loss: 0.85269 (2.5832) Loss: 6.3580 (8.8123)
+ 2025-05-07,07:20:48 | INFO | Train Epoch: 0 [ 90193920/128008192 (70%)] Data (t): 0.172 Batch (t): 6.426, 2549.25/s, 79.6640/s/gpu LR: 0.000936 Logit Scale: 67.297 Class_loss: 5.5426 (6.2136) Contrastive_loss: 0.86290 (2.5441) Loss: 6.4055 (8.7576)
+ 2025-05-07,07:34:30 | INFO | Train Epoch: 0 [ 92291072/128008192 (72%)] Data (t): 0.170 Batch (t): 6.415, 2547.37/s, 79.6054/s/gpu LR: 0.000933 Logit Scale: 67.644 Class_loss: 5.5627 (6.1991) Contrastive_loss: 0.96505 (2.5090) Loss: 6.5278 (8.7081)
+ 2025-05-07,07:48:13 | INFO | Train Epoch: 0 [ 94388224/128008192 (74%)] Data (t): 0.169 Batch (t): 6.432, 2553.87/s, 79.8085/s/gpu LR: 0.000930 Logit Scale: 67.863 Class_loss: 5.5047 (6.1840) Contrastive_loss: 0.79742 (2.4718) Loss: 6.3022 (8.6558)
+ 2025-05-07,08:01:59 | INFO | Train Epoch: 0 [ 96485376/128008192 (75%)] Data (t): 0.169 Batch (t): 6.453, 2557.12/s, 79.9100/s/gpu LR: 0.000926 Logit Scale: 68.319 Class_loss: 5.4690 (6.1688) Contrastive_loss: 0.96227 (2.4396) Loss: 6.4313 (8.6084)
+ 2025-05-07,08:15:43 | INFO | Train Epoch: 0 [ 98582528/128008192 (77%)] Data (t): 0.170 Batch (t): 6.442, 2550.43/s, 79.7010/s/gpu LR: 0.000923 Logit Scale: 68.614 Class_loss: 5.4740 (6.1543) Contrastive_loss: 0.75768 (2.4046) Loss: 6.2317 (8.5589)
+ 2025-05-07,08:29:41 | INFO | Train Epoch: 0 [100679680/128008192 (79%)] Data (t): 0.276 Batch (t): 6.546, 2551.26/s, 79.7270/s/gpu LR: 0.000919 Logit Scale: 69.045 Class_loss: 5.4281 (6.1395) Contrastive_loss: 0.96915 (2.3753) Loss: 6.3972 (8.5148)
+ 2025-05-07,08:43:24 | INFO | Train Epoch: 0 [102776832/128008192 (80%)] Data (t): 0.172 Batch (t): 6.425, 2556.52/s, 79.8913/s/gpu LR: 0.000916 Logit Scale: 69.405 Class_loss: 5.4282 (6.1253) Contrastive_loss: 0.90899 (2.3460) Loss: 6.3372 (8.4712)
+ 2025-05-07,08:57:05 | INFO | Train Epoch: 0 [104873984/128008192 (82%)] Data (t): 0.170 Batch (t): 6.419, 2549.98/s, 79.6869/s/gpu LR: 0.000912 Logit Scale: 69.788 Class_loss: 5.4340 (6.1117) Contrastive_loss: 0.90198 (2.3177) Loss: 6.3360 (8.4294)
+ 2025-05-07,09:10:51 | INFO | Train Epoch: 0 [106971136/128008192 (84%)] Data (t): 0.170 Batch (t): 6.448, 2558.03/s, 79.9384/s/gpu LR: 0.000908 Logit Scale: 69.426 Class_loss: 5.3624 (6.0973) Contrastive_loss: 1.1085 (2.2944) Loss: 6.4709 (8.3917)
+ 2025-05-07,09:24:34 | INFO | Train Epoch: 0 [109068288/128008192 (85%)] Data (t): 0.170 Batch (t): 6.430, 2559.01/s, 79.9689/s/gpu LR: 0.000904 Logit Scale: 70.077 Class_loss: 5.4336 (6.0848) Contrastive_loss: 0.79758 (2.2662) Loss: 6.2312 (8.3510)
+ 2025-05-07,09:30:16 | WARNING | Handling webdataset error (OSError('image file is truncated (59 bytes not processed)')). Ignoring.
+ 2025-05-07,09:38:20 | INFO | Train Epoch: 0 [111165440/128008192 (87%)] Data (t): 0.171 Batch (t): 6.459, 2532.59/s, 79.1434/s/gpu LR: 0.000900 Logit Scale: 70.358 Class_loss: 5.5177 (6.0743) Contrastive_loss: 0.72663 (2.2377) Loss: 6.2443 (8.3119)
+ 2025-05-07,09:52:09 | INFO | Train Epoch: 0 [113262592/128008192 (88%)] Data (t): 0.169 Batch (t): 6.476, 2533.48/s, 79.1713/s/gpu LR: 0.000897 Logit Scale: 70.749 Class_loss: 5.5183 (6.0642) Contrastive_loss: 0.79213 (2.2114) Loss: 6.3104 (8.2755)
+ 2025-05-07,10:05:54 | INFO | Train Epoch: 0 [115359744/128008192 (90%)] Data (t): 0.169 Batch (t): 6.445, 2553.13/s, 79.7852/s/gpu LR: 0.000892 Logit Scale: 70.885 Class_loss: 5.3711 (6.0518) Contrastive_loss: 0.78592 (2.1859) Loss: 6.1570 (8.2377)
+ 2025-05-07,10:19:36 | INFO | Train Epoch: 0 [117456896/128008192 (92%)] Data (t): 0.170 Batch (t): 6.417, 2560.39/s, 80.0121/s/gpu LR: 0.000888 Logit Scale: 71.062 Class_loss: 5.4419 (6.0411) Contrastive_loss: 0.80356 (2.1617) Loss: 6.2455 (8.2028)
+ 2025-05-07,10:33:18 | INFO | Train Epoch: 0 [119554048/128008192 (93%)] Data (t): 0.171 Batch (t): 6.425, 2538.08/s, 79.3150/s/gpu LR: 0.000884 Logit Scale: 71.253 Class_loss: 5.3982 (6.0300) Contrastive_loss: 0.92355 (2.1403) Loss: 6.3218 (8.1703)
+ 2025-05-07,10:42:15 | WARNING | Handling webdataset error (OSError('image file is truncated (82 bytes not processed)')). Ignoring.
+ 2025-05-07,10:47:05 | INFO | Train Epoch: 0 [121651200/128008192 (95%)] Data (t): 0.171 Batch (t): 6.458, 2537.06/s, 79.2830/s/gpu LR: 0.000880 Logit Scale: 71.296 Class_loss: 5.4093 (6.0195) Contrastive_loss: 0.79288 (2.1175) Loss: 6.2022 (8.1370)
+ 2025-05-07,10:52:03 | WARNING | Handling webdataset error (OSError('image file is truncated (37 bytes not processed)')). Ignoring.
+ 2025-05-07,10:53:21 | WARNING | Handling webdataset error (OSError('image file is truncated (4 bytes not processed)')). Ignoring.
+ 2025-05-07,11:00:49 | INFO | Train Epoch: 0 [123748352/128008192 (97%)] Data (t): 0.171 Batch (t): 6.443, 2555.52/s, 79.8602/s/gpu LR: 0.000876 Logit Scale: 71.658 Class_loss: 5.4100 (6.0093) Contrastive_loss: 0.94656 (2.0980) Loss: 6.3565 (8.1073)
+ 2025-05-07,11:01:28 | WARNING | Handling webdataset error (OSError('image file is truncated (88 bytes not processed)')). Ignoring.
+ 2025-05-07,11:14:30 | INFO | Train Epoch: 0 [125845504/128008192 (98%)] Data (t): 0.171 Batch (t): 6.409, 2552.86/s, 79.7769/s/gpu LR: 0.000871 Logit Scale: 71.968 Class_loss: 5.3976 (5.9993) Contrastive_loss: 0.70334 (2.0751) Loss: 6.1010 (8.0744)
+ 2025-05-07,11:28:10 | INFO | Train Epoch: 0 [127942656/128008192 (100%)] Data (t): 0.170 Batch (t): 6.412, 2556.95/s, 79.9046/s/gpu LR: 0.000867 Logit Scale: 72.110 Class_loss: 5.3941 (5.9895) Contrastive_loss: 0.70302 (2.0530) Loss: 6.0971 (8.0425)
+ 2025-05-07,11:28:36 | INFO | Train Epoch: 0 [128008192/128008192 (100%)] Data (t): 0.176 Batch (t): 6.404, 2562.83/s, 80.0885/s/gpu LR: 0.000867 Logit Scale: 72.112 Class_loss: 5.4650 (5.9812) Contrastive_loss: 0.74691 (2.0322) Loss: 6.2119 (8.0135)
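Note: the LR column for epoch 0 is consistent with a linear warmup over the first 500 steps (`warmup: 500`) followed by cosine decay (`lr_scheduler: cosine`) from the base `lr` of 0.001 towards zero over the whole run. The sketch below is an assumed form of that schedule, not code taken from the training repository. `lr_at` is a hypothetical name, and `total_steps = 31252` assumes 7813 steps per epoch times 4 epochs.

```python
import math


def lr_at(step: int, base_lr: float = 1e-3, warmup: int = 500,
          total_steps: int = 31252) -> float:
    """Learning rate at a global step: linear warmup, then cosine decay to zero (assumed form)."""
    if step < warmup:
        # Warmup ramps linearly and reaches base_lr at the last warmup step.
        return base_lr * (step + 1) / warmup
    progress = (step - warmup) / (total_steps - warmup)
    return 0.5 * (1.0 + math.cos(math.pi * progress)) * base_lr


# Steps 0, 128, 256, 384 and 512 correspond to the first five logged lines of
# epoch 0 (one line every 128 steps); step 7812 is the last step of epoch 0.
for step in (0, 128, 256, 384, 512, 7812):
    print(step, f"{lr_at(step):.6f}")
```

With these assumptions the printed values agree with the logged ones: 0.000002, 0.000258, 0.000514 and 0.000770 during warmup, 0.001000 just after it, and 0.000867 at the end of epoch 0.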
+ 2025-05-07,11:28:53 | INFO | Start epoch 1
+ 2025-05-07,11:29:04 | INFO | Train Epoch: 1 [ 16384/128008192 (0%)] Data (t): 4.182 Batch (t): 10.377, 1578.81/s, 49.3379/s/gpu LR: 0.000867 Logit Scale: 72.117 Class_loss: 5.4527 (5.4527) Contrastive_loss: 0.70227 (0.70227) Loss: 6.1549 (6.1549)
+ 2025-05-07,11:42:50 | INFO | Train Epoch: 1 [ 2113536/128008192 (2%)] Data (t): 0.172 Batch (t): 6.451, 2536.20/s, 79.2563/s/gpu LR: 0.000862 Logit Scale: 72.464 Class_loss: 5.3906 (5.4217) Contrastive_loss: 0.76697 (0.73462) Loss: 6.1576 (6.1563)
+ 2025-05-07,11:46:25 | WARNING | Handling webdataset error (OSError('image file is truncated (53 bytes not processed)')). Ignoring.
+ 2025-05-07,11:56:38 | INFO | Train Epoch: 1 [ 4210688/128008192 (3%)] Data (t): 0.171 Batch (t): 6.474, 2536.43/s, 79.2635/s/gpu LR: 0.000858 Logit Scale: 72.528 Class_loss: 5.4592 (5.4342) Contrastive_loss: 0.79644 (0.75522) Loss: 6.2557 (6.1894)
+ 2025-05-07,12:10:28 | INFO | Train Epoch: 1 [ 6307840/128008192 (5%)] Data (t): 0.171 Batch (t): 6.481, 2521.07/s, 78.7834/s/gpu LR: 0.000853 Logit Scale: 72.804 Class_loss: 5.4305 (5.4333) Contrastive_loss: 0.69016 (0.73896) Loss: 6.1206 (6.1722)
+ 2025-05-07,12:24:15 | INFO | Train Epoch: 1 [ 8404992/128008192 (7%)] Data (t): 0.171 Batch (t): 6.458, 2561.33/s, 80.0416/s/gpu LR: 0.000849 Logit Scale: 73.023 Class_loss: 5.4240 (5.4314) Contrastive_loss: 0.84396 (0.75996) Loss: 6.2680 (6.1914)
+ 2025-05-07,12:38:00 | INFO | Train Epoch: 1 [ 10502144/128008192 (8%)] Data (t): 0.169 Batch (t): 6.447, 2551.70/s, 79.7408/s/gpu LR: 0.000844 Logit Scale: 73.300 Class_loss: 5.4526 (5.4349) Contrastive_loss: 0.80875 (0.76809) Loss: 6.2613 (6.2030)
+ 2025-05-07,12:51:41 | INFO | Train Epoch: 1 [ 12599296/128008192 (10%)] Data (t): 0.171 Batch (t): 6.417, 2532.37/s, 79.1367/s/gpu LR: 0.000839 Logit Scale: 73.450 Class_loss: 5.3449 (5.4221) Contrastive_loss: 0.73168 (0.76289) Loss: 6.0766 (6.1850)
+ 2025-05-07,13:05:32 | INFO | Train Epoch: 1 [ 14696448/128008192 (11%)] Data (t): 0.171 Batch (t): 6.492, 2520.19/s, 78.7559/s/gpu LR: 0.000834 Logit Scale: 73.660 Class_loss: 5.3311 (5.4107) Contrastive_loss: 0.80633 (0.76832) Loss: 6.1374 (6.1790)
+ 2025-05-07,13:19:23 | INFO | Train Epoch: 1 [ 16793600/128008192 (13%)] Data (t): 0.170 Batch (t): 6.493, 2520.85/s, 78.7765/s/gpu LR: 0.000829 Logit Scale: 73.821 Class_loss: 5.3311 (5.4019) Contrastive_loss: 0.74357 (0.76557) Loss: 6.0746 (6.1674)
+ 2025-05-07,13:33:14 | INFO | Train Epoch: 1 [ 18890752/128008192 (15%)] Data (t): 0.171 Batch (t): 6.494, 2520.16/s, 78.7549/s/gpu LR: 0.000824 Logit Scale: 74.055 Class_loss: 5.3733 (5.3990) Contrastive_loss: 0.90668 (0.77968) Loss: 6.2800 (6.1787)
+ 2025-05-07,13:47:06 | INFO | Train Epoch: 1 [ 20987904/128008192 (16%)] Data (t): 0.171 Batch (t): 6.500, 2519.04/s, 78.7199/s/gpu LR: 0.000819 Logit Scale: 74.242 Class_loss: 5.4243 (5.4013) Contrastive_loss: 0.80682 (0.78215) Loss: 6.2311 (6.1834)
+ 2025-05-07,13:52:13 | WARNING | Handling webdataset error (OSError('image file is truncated (32 bytes not processed)')). Ignoring.
+ 2025-05-07,14:00:58 | INFO | Train Epoch: 1 [ 23085056/128008192 (18%)] Data (t): 0.171 Batch (t): 6.500, 2526.02/s, 78.9380/s/gpu LR: 0.000814 Logit Scale: 74.380 Class_loss: 5.3023 (5.3930) Contrastive_loss: 0.77919 (0.78190) Loss: 6.0814 (6.1749)
+ 2025-05-07,14:14:50 | INFO | Train Epoch: 1 [ 25182208/128008192 (20%)] Data (t): 0.172 Batch (t): 6.494, 2521.62/s, 78.8006/s/gpu LR: 0.000809 Logit Scale: 74.448 Class_loss: 5.2319 (5.3807) Contrastive_loss: 0.86989 (0.78867) Loss: 6.1018 (6.1693)
+ 2025-05-07,14:25:21 | WARNING | Handling webdataset error (OSError('image file is truncated (5 bytes not processed)')). Ignoring.
+ 2025-05-07,14:28:09 | WARNING | Handling webdataset error (OSError('image file is truncated (5 bytes not processed)')). Ignoring.
+ 2025-05-07,14:28:41 | INFO | Train Epoch: 1 [ 27279360/128008192 (21%)] Data (t): 0.172 Batch (t): 6.497, 2524.63/s, 78.8947/s/gpu LR: 0.000804 Logit Scale: 74.595 Class_loss: 5.3581 (5.3790) Contrastive_loss: 0.80242 (0.78965) Loss: 6.1606 (6.1687)
+ 2025-05-07,14:33:54 | WARNING | Handling webdataset error (OSError('image file is truncated (131 bytes not processed)')). Ignoring.
+ 2025-05-07,14:42:33 | INFO | Train Epoch: 1 [ 29376512/128008192 (23%)] Data (t): 0.172 Batch (t): 6.497, 2520.54/s, 78.7669/s/gpu LR: 0.000799 Logit Scale: 74.898 Class_loss: 5.3345 (5.3761) Contrastive_loss: 0.75083 (0.78706) Loss: 6.0854 (6.1631)
+ 2025-05-07,14:44:38 | WARNING | Handling webdataset error (OSError('image file is truncated (15 bytes not processed)')). Ignoring.
+ 2025-05-07,14:56:25 | INFO | Train Epoch: 1 [ 31473664/128008192 (25%)] Data (t): 0.172 Batch (t): 6.504, 2524.33/s, 78.8853/s/gpu LR: 0.000794 Logit Scale: 74.884 Class_loss: 5.2639 (5.3691) Contrastive_loss: 0.66739 (0.77958) Loss: 5.9313 (6.1487)
+ 2025-05-07,15:10:18 | INFO | Train Epoch: 1 [ 33570816/128008192 (26%)] Data (t): 0.171 Batch (t): 6.506, 2518.80/s, 78.7126/s/gpu LR: 0.000788 Logit Scale: 75.170 Class_loss: 5.3147 (5.3659) Contrastive_loss: 0.74993 (0.77784) Loss: 6.0646 (6.1437)
+ 2025-05-07,15:24:11 | INFO | Train Epoch: 1 [ 35667968/128008192 (28%)] Data (t): 0.172 Batch (t): 6.504, 2517.42/s, 78.6694/s/gpu LR: 0.000783 Logit Scale: 75.264 Class_loss: 5.2810 (5.3612) Contrastive_loss: 0.82874 (0.78067) Loss: 6.1097 (6.1418)
+ 2025-05-07,15:36:16 | WARNING | Handling webdataset error (OSError('image file is truncated (186 bytes not processed)')). Ignoring.
+ 2025-05-07,15:37:58 | INFO | Train Epoch: 1 [ 37765120/128008192 (30%)] Data (t): 0.172 Batch (t): 6.466, 2555.61/s, 79.8627/s/gpu LR: 0.000777 Logit Scale: 75.417 Class_loss: 5.3711 (5.3617) Contrastive_loss: 0.70644 (0.77676) Loss: 6.0775 (6.1384)
+ 2025-05-07,15:51:38 | INFO | Train Epoch: 1 [ 39862272/128008192 (31%)] Data (t): 0.172 Batch (t): 6.403, 2562.03/s, 80.0636/s/gpu LR: 0.000772 Logit Scale: 75.603 Class_loss: 5.2983 (5.3585) Contrastive_loss: 0.81128 (0.77849) Loss: 6.1096 (6.1370)
+ 2025-05-07,16:05:18 | INFO | Train Epoch: 1 [ 41959424/128008192 (33%)] Data (t): 0.173 Batch (t): 6.403, 2555.07/s, 79.8460/s/gpu LR: 0.000767 Logit Scale: 75.631 Class_loss: 5.3232 (5.3568) Contrastive_loss: 0.71996 (0.77570) Loss: 6.0432 (6.1325)
+ 2025-05-07,16:18:59 | INFO | Train Epoch: 1 [ 44056576/128008192 (34%)] Data (t): 0.171 Batch (t): 6.419, 2537.69/s, 79.3028/s/gpu LR: 0.000761 Logit Scale: 75.851 Class_loss: 5.2917 (5.3539) Contrastive_loss: 0.58290 (0.76694) Loss: 5.8746 (6.1208)
+ 2025-05-07,16:32:47 | INFO | Train Epoch: 1 [ 46153728/128008192 (36%)] Data (t): 0.171 Batch (t): 6.468, 2525.02/s, 78.9070/s/gpu LR: 0.000755 Logit Scale: 75.990 Class_loss: 5.2756 (5.3505) Contrastive_loss: 0.63273 (0.76110) Loss: 5.9084 (6.1116)
+ 2025-05-07,16:46:35 | INFO | Train Epoch: 1 [ 48250880/128008192 (38%)] Data (t): 0.172 Batch (t): 6.463, 2536.86/s, 79.2768/s/gpu LR: 0.000750 Logit Scale: 76.267 Class_loss: 5.2910 (5.3480) Contrastive_loss: 0.74985 (0.76063) Loss: 6.0408 (6.1086)
+ 2025-05-07,17:00:22 | INFO | Train Epoch: 1 [ 50348032/128008192 (39%)] Data (t): 0.172 Batch (t): 6.461, 2532.84/s, 79.1514/s/gpu LR: 0.000744 Logit Scale: 76.314 Class_loss: 5.2338 (5.3434) Contrastive_loss: 0.77127 (0.76106) Loss: 6.0051 (6.1045)
+ 2025-05-07,17:03:44 | WARNING | Handling webdataset error (OSError('image file is truncated (5 bytes not processed)')). Ignoring.
+ 2025-05-07,17:06:11 | WARNING | Handling webdataset error (OSError('image file is truncated (14 bytes not processed)')). Ignoring.
+ 2025-05-07,17:14:09 | INFO | Train Epoch: 1 [ 52445184/128008192 (41%)] Data (t): 0.171 Batch (t): 6.461, 2538.59/s, 79.3310/s/gpu LR: 0.000738 Logit Scale: 76.456 Class_loss: 5.2442 (5.3396) Contrastive_loss: 0.75734 (0.76091) Loss: 6.0015 (6.1005)
+ 2025-05-07,17:27:55 | INFO | Train Epoch: 1 [ 54542336/128008192 (43%)] Data (t): 0.171 Batch (t): 6.457, 2540.21/s, 79.3814/s/gpu LR: 0.000733 Logit Scale: 76.654 Class_loss: 5.1882 (5.3340) Contrastive_loss: 0.77816 (0.76155) Loss: 5.9664 (6.0955)
+ 2025-05-07,17:41:42 | INFO | Train Epoch: 1 [ 56639488/128008192 (44%)] Data (t): 0.171 Batch (t): 6.461, 2536.90/s, 79.2780/s/gpu LR: 0.000727 Logit Scale: 76.813 Class_loss: 5.2369 (5.3305) Contrastive_loss: 0.88486 (0.76596) Loss: 6.1217 (6.0965)
+ 2025-05-07,17:46:21 | WARNING | Handling webdataset error (OSError('image file is truncated (1 bytes not processed)')). Ignoring.
+ 2025-05-07,17:55:29 | INFO | Train Epoch: 1 [ 58736640/128008192 (46%)] Data (t): 0.171 Batch (t): 6.463, 2526.74/s, 78.9605/s/gpu LR: 0.000721 Logit Scale: 76.894 Class_loss: 5.2474 (5.3277) Contrastive_loss: 0.54476 (0.75833) Loss: 5.7922 (6.0860)
+ 2025-05-07,17:59:17 | WARNING | Handling webdataset error (OSError('image file is truncated (7 bytes not processed)')). Ignoring.
+ 2025-05-07,18:09:19 | INFO | Train Epoch: 1 [ 60833792/128008192 (48%)] Data (t): 0.171 Batch (t): 6.480, 2537.76/s, 79.3049/s/gpu LR: 0.000715 Logit Scale: 76.924 Class_loss: 5.1746 (5.3226) Contrastive_loss: 0.84639 (0.76127) Loss: 6.0209 (6.0838)
+ 2025-05-07,18:23:06 | INFO | Train Epoch: 1 [ 62930944/128008192 (49%)] Data (t): 0.172 Batch (t): 6.463, 2532.90/s, 79.1531/s/gpu LR: 0.000709 Logit Scale: 77.108 Class_loss: 5.1803 (5.3180) Contrastive_loss: 0.84714 (0.76404) Loss: 6.0274 (6.0820)
+ 2025-05-07,18:33:54 | WARNING | Handling webdataset error (OSError('image file is truncated (28 bytes not processed)')). Ignoring.
+ 2025-05-07,18:36:53 | INFO | Train Epoch: 1 [ 65028096/128008192 (51%)] Data (t): 0.172 Batch (t): 6.462, 2538.71/s, 79.3348/s/gpu LR: 0.000703 Logit Scale: 77.309 Class_loss: 5.2403 (5.3155) Contrastive_loss: 0.73576 (0.76315) Loss: 5.9760 (6.0787)
+ 2025-05-07,18:43:08 | WARNING | Handling webdataset error (OSError('image file is truncated (12 bytes not processed)')). Ignoring.
+ 2025-05-07,18:50:35 | INFO | Train Epoch: 1 [ 67125248/128008192 (52%)] Data (t): 0.172 Batch (t): 6.423, 2560.38/s, 80.0119/s/gpu LR: 0.000697 Logit Scale: 77.440 Class_loss: 5.1126 (5.3094) Contrastive_loss: 0.76704 (0.76327) Loss: 5.8796 (6.0727)
+ 2025-05-07,19:04:19 | INFO | Train Epoch: 1 [ 69222400/128008192 (54%)] Data (t): 0.172 Batch (t): 6.439, 2538.18/s, 79.3182/s/gpu LR: 0.000691 Logit Scale: 77.439 Class_loss: 5.2186 (5.3067) Contrastive_loss: 0.68818 (0.76106) Loss: 5.9068 (6.0678)
+ 2025-05-07,19:18:04 | INFO | Train Epoch: 1 [ 71319552/128008192 (56%)] Data (t): 0.172 Batch (t): 6.440, 2539.49/s, 79.3590/s/gpu LR: 0.000685 Logit Scale: 77.644 Class_loss: 5.1885 (5.3033) Contrastive_loss: 0.74999 (0.76074) Loss: 5.9385 (6.0641)
+ 2025-05-07,19:32:03 | INFO | Train Epoch: 1 [ 73416704/128008192 (57%)] Data (t): 0.271 Batch (t): 6.557, 2538.14/s, 79.3169/s/gpu LR: 0.000679 Logit Scale: 77.869 Class_loss: 5.3701 (5.3052) Contrastive_loss: 0.84273 (0.76302) Loss: 6.2129 (6.0682)
+ 2025-05-07,19:45:50 | INFO | Train Epoch: 1 [ 75513856/128008192 (59%)] Data (t): 0.172 Batch (t): 6.462, 2537.03/s, 79.2821/s/gpu LR: 0.000673 Logit Scale: 77.917 Class_loss: 5.1323 (5.3005) Contrastive_loss: 0.68233 (0.76084) Loss: 5.8146 (6.0614)
+ 2025-05-07,19:59:38 | INFO | Train Epoch: 1 [ 77611008/128008192 (61%)] Data (t): 0.172 Batch (t): 6.464, 2536.11/s, 79.2533/s/gpu LR: 0.000667 Logit Scale: 78.050 Class_loss: 5.2440 (5.2990) Contrastive_loss: 0.73132 (0.76006) Loss: 5.9753 (6.0591)
+ 2025-05-07,20:13:25 | INFO | Train Epoch: 1 [ 79708160/128008192 (62%)] Data (t): 0.172 Batch (t): 6.462, 2539.36/s, 79.3550/s/gpu LR: 0.000661 Logit Scale: 78.141 Class_loss: 5.2653 (5.2982) Contrastive_loss: 0.68553 (0.75815) Loss: 5.9508 (6.0563)
+ 2025-05-07,20:27:12 | INFO | Train Epoch: 1 [ 81805312/128008192 (64%)] Data (t): 0.172 Batch (t): 6.462, 2531.82/s, 79.1195/s/gpu LR: 0.000654 Logit Scale: 78.414 Class_loss: 5.2368 (5.2966) Contrastive_loss: 0.74116 (0.75773) Loss: 5.9780 (6.0544)
+ 2025-05-07,20:40:59 | INFO | Train Epoch: 1 [ 83902464/128008192 (66%)] Data (t): 0.173 Batch (t): 6.463, 2539.54/s, 79.3606/s/gpu LR: 0.000648 Logit Scale: 78.498 Class_loss: 5.1707 (5.2936) Contrastive_loss: 0.87737 (0.76065) Loss: 6.0481 (6.0542)
+ 2025-05-07,20:54:46 | INFO | Train Epoch: 1 [ 85999616/128008192 (67%)] Data (t): 0.172 Batch (t): 6.461, 2539.19/s, 79.3496/s/gpu LR: 0.000642 Logit Scale: 78.586 Class_loss: 5.2591 (5.2927) Contrastive_loss: 0.65969 (0.75824) Loss: 5.9188 (6.0510)
+ 2025-05-07,21:08:33 | INFO | Train Epoch: 1 [ 88096768/128008192 (69%)] Data (t): 0.171 Batch (t): 6.461, 2536.57/s, 79.2678/s/gpu LR: 0.000636 Logit Scale: 78.771 Class_loss: 5.2367 (5.2914) Contrastive_loss: 0.64548 (0.75562) Loss: 5.8822 (6.0471)
+ 2025-05-07,21:22:20 | INFO | Train Epoch: 1 [ 90193920/128008192 (70%)] Data (t): 0.170 Batch (t): 6.462, 2537.03/s, 79.2822/s/gpu LR: 0.000629 Logit Scale: 78.941 Class_loss: 5.1770 (5.2888) Contrastive_loss: 0.68262 (0.75396) Loss: 5.8596 (6.0428)
+ 2025-05-07,21:36:07 | INFO | Train Epoch: 1 [ 92291072/128008192 (72%)] Data (t): 0.172 Batch (t): 6.463, 2539.13/s, 79.3477/s/gpu LR: 0.000623 Logit Scale: 79.006 Class_loss: 5.1583 (5.2859) Contrastive_loss: 0.77768 (0.75449) Loss: 5.9360 (6.0404)
+ 2025-05-07,21:41:52 | WARNING | Handling webdataset error (OSError('image file is truncated (99 bytes not processed)')). Ignoring.
+ 2025-05-07,21:44:25 | WARNING | Handling webdataset error (OSError('image file is truncated (45 bytes not processed)')). Ignoring.
+ 2025-05-07,21:49:54 | INFO | Train Epoch: 1 [ 94388224/128008192 (74%)] Data (t): 0.172 Batch (t): 6.461, 2537.29/s, 79.2903/s/gpu LR: 0.000617 Logit Scale: 79.085 Class_loss: 5.0787 (5.2814) Contrastive_loss: 0.78018 (0.75505) Loss: 5.8589 (6.0365)
+ 2025-05-07,22:03:41 | INFO | Train Epoch: 1 [ 96485376/128008192 (75%)] Data (t): 0.172 Batch (t): 6.461, 2533.64/s, 79.1762/s/gpu LR: 0.000610 Logit Scale: 79.269 Class_loss: 5.1999 (5.2797) Contrastive_loss: 0.65201 (0.75285) Loss: 5.8519 (6.0326)
+ 2025-05-07,22:04:40 | WARNING | Handling webdataset error (OSError('image file is truncated (33 bytes not processed)')). Ignoring.
+ 2025-05-07,22:17:30 | INFO | Train Epoch: 1 [ 98582528/128008192 (77%)] Data (t): 0.173 Batch (t): 6.471, 2438.86/s, 76.2144/s/gpu LR: 0.000604 Logit Scale: 79.331 Class_loss: 5.0733 (5.2754) Contrastive_loss: 0.82116 (0.75428) Loss: 5.8945 (6.0297)
+ 2025-05-07,22:31:18 | INFO | Train Epoch: 1 [100679680/128008192 (79%)] Data (t): 0.172 Batch (t): 6.471, 2537.90/s, 79.3095/s/gpu LR: 0.000597 Logit Scale: 79.439 Class_loss: 5.0821 (5.2715) Contrastive_loss: 0.90169 (0.75729) Loss: 5.9838 (6.0287)
+ 2025-05-07,22:40:42 | WARNING | Handling webdataset error (OSError('image file is truncated (108 bytes not processed)')). Ignoring.
+ 2025-05-07,22:45:05 | INFO | Train Epoch: 1 [102776832/128008192 (80%)] Data (t): 0.173 Batch (t): 6.463, 2527.35/s, 78.9796/s/gpu LR: 0.000591 Logit Scale: 79.606 Class_loss: 5.1500 (5.2690) Contrastive_loss: 0.79917 (0.75812) Loss: 5.9491 (6.0272)
+ 2025-05-07,22:47:03 | WARNING | Handling webdataset error (OSError('image file is truncated (85 bytes not processed)')). Ignoring.
+ 2025-05-07,22:54:29 | WARNING | Handling webdataset error (OSError('image file is truncated (25 bytes not processed)')). Ignoring.
+ 2025-05-07,22:58:53 | INFO | Train Epoch: 1 [104873984/128008192 (82%)] Data (t): 0.172 Batch (t): 6.464, 2535.10/s, 79.2218/s/gpu LR: 0.000585 Logit Scale: 79.711 Class_loss: 5.1387 (5.2665) Contrastive_loss: 0.68205 (0.75663) Loss: 5.8207 (6.0231)
+ 2025-05-07,23:12:40 | INFO | Train Epoch: 1 [106971136/128008192 (84%)] Data (t): 0.172 Batch (t): 6.464, 2534.32/s, 79.1973/s/gpu LR: 0.000578 Logit Scale: 79.906 Class_loss: 5.2474 (5.2661) Contrastive_loss: 0.64959 (0.75457) Loss: 5.8970 (6.0207)
+ 2025-05-07,23:24:47 | WARNING | Handling webdataset error (OSError('image file is truncated (67 bytes not processed)')). Ignoring.
+ 2025-05-07,23:26:21 | INFO | Train Epoch: 1 [109068288/128008192 (85%)] Data (t): 0.173 Batch (t): 6.415, 2556.89/s, 79.9029/s/gpu LR: 0.000572 Logit Scale: 79.976 Class_loss: 5.1409 (5.2637) Contrastive_loss: 0.71844 (0.75389) Loss: 5.8593 (6.0176)
+ 2025-05-07,23:40:02 | INFO | Train Epoch: 1 [111165440/128008192 (87%)] Data (t): 0.172 Batch (t): 6.411, 2548.78/s, 79.6495/s/gpu LR: 0.000565 Logit Scale: 80.203 Class_loss: 5.1224 (5.2611) Contrastive_loss: 0.72358 (0.75333) Loss: 5.8459 (6.0145)
+ 2025-05-07,23:53:48 | INFO | Train Epoch: 1 [113262592/128008192 (88%)] Data (t): 0.173 Batch (t): 6.455, 2537.11/s, 79.2847/s/gpu LR: 0.000559 Logit Scale: 80.328 Class_loss: 5.2091 (5.2602) Contrastive_loss: 0.61200 (0.75076) Loss: 5.8211 (6.0109)
+ 2025-05-08,00:07:34 | INFO | Train Epoch: 1 [115359744/128008192 (90%)] Data (t): 0.171 Batch (t): 6.456, 2537.23/s, 79.2886/s/gpu LR: 0.000552 Logit Scale: 80.494 Class_loss: 5.1017 (5.2573) Contrastive_loss: 0.72948 (0.75038) Loss: 5.8311 (6.0077)
+ 2025-05-08,00:21:21 | INFO | Train Epoch: 1 [117456896/128008192 (92%)] Data (t): 0.170 Batch (t): 6.460, 2536.59/s, 79.2684/s/gpu LR: 0.000546 Logit Scale: 80.692 Class_loss: 5.1223 (5.2550) Contrastive_loss: 0.70415 (0.74957) Loss: 5.8264 (6.0045)
+ 2025-05-08,00:31:05 | WARNING | Handling webdataset error (OSError('image file is truncated (38 bytes not processed)')). Ignoring.
+ 2025-05-08,00:35:08 | INFO | Train Epoch: 1 [119554048/128008192 (93%)] Data (t): 0.171 Batch (t): 6.461, 2536.35/s, 79.2610/s/gpu LR: 0.000539 Logit Scale: 80.750 Class_loss: 5.0796 (5.2520) Contrastive_loss: 0.75295 (0.74963) Loss: 5.8325 (6.0016)
+ 2025-05-08,00:48:55 | INFO | Train Epoch: 1 [121651200/128008192 (95%)] Data (t): 0.172 Batch (t): 6.459, 2533.98/s, 79.1869/s/gpu LR: 0.000533 Logit Scale: 80.819 Class_loss: 5.0765 (5.2490) Contrastive_loss: 0.69245 (0.74866) Loss: 5.7690 (5.9976)
+ 2025-05-08,00:53:46 | WARNING | Handling webdataset error (OSError('image file is truncated (89 bytes not processed)')). Ignoring.
+ 2025-05-08,01:02:42 | INFO | Train Epoch: 1 [123748352/128008192 (97%)] Data (t): 0.173 Batch (t): 6.460, 2530.80/s, 79.0874/s/gpu LR: 0.000526 Logit Scale: 80.842 Class_loss: 5.1227 (5.2469) Contrastive_loss: 0.73018 (0.74835) Loss: 5.8529 (5.9952)
+ 2025-05-08,01:16:29 | INFO | Train Epoch: 1 [125845504/128008192 (98%)] Data (t): 0.171 Batch (t): 6.459, 2535.05/s, 79.2202/s/gpu LR: 0.000520 Logit Scale: 81.000 Class_loss: 5.0571 (5.2438) Contrastive_loss: 0.64980 (0.74674) Loss: 5.7069 (5.9905)
+ 2025-05-08,01:28:26 | WARNING | Handling webdataset error (OSError('image file is truncated (46 bytes not processed)')). Ignoring.
+ 2025-05-08,01:30:15 | INFO | Train Epoch: 1 [127942656/128008192 (100%)] Data (t): 0.173 Batch (t): 6.454, 2537.80/s, 79.3062/s/gpu LR: 0.000513 Logit Scale: 81.081 Class_loss: 5.1080 (5.2416) Contrastive_loss: 0.66928 (0.74549) Loss: 5.7773 (5.9871)
+ 2025-05-08,01:30:41 | INFO | Train Epoch: 1 [128008192/128008192 (100%)] Data (t): 0.177 Batch (t): 6.462, 2540.80/s, 79.4000/s/gpu LR: 0.000513 Logit Scale: 81.086 Class_loss: 4.9845 (5.2375) Contrastive_loss: 0.67522 (0.74437) Loss: 5.6597 (5.9819)
+ 2025-05-08,01:31:01 | INFO | Start epoch 2
+ 2025-05-08,01:31:11 | INFO | Train Epoch: 2 [ 16384/128008192 (0%)] Data (t): 4.255 Batch (t): 10.437, 1569.77/s, 49.0552/s/gpu LR: 0.000513 Logit Scale: 81.087 Class_loss: 5.0489 (5.0489) Contrastive_loss: 0.74249 (0.74249) Loss: 5.7914 (5.7914)
+ 2025-05-08,01:44:58 | INFO | Train Epoch: 2 [ 2113536/128008192 (2%)] Data (t): 0.173 Batch (t): 6.462, 2532.16/s, 79.1299/s/gpu LR: 0.000506 Logit Scale: 81.401 Class_loss: 5.0962 (5.0725) Contrastive_loss: 0.74688 (0.74468) Loss: 5.8430 (5.8172)
+ 2025-05-08,01:58:46 | INFO | Train Epoch: 2 [ 4210688/128008192 (3%)] Data (t): 0.173 Batch (t): 6.465, 2536.00/s, 79.2500/s/gpu LR: 0.000500 Logit Scale: 81.587 Class_loss: 5.2349 (5.1266) Contrastive_loss: 0.60020 (0.69652) Loss: 5.8351 (5.8231)
+ 2025-05-08,02:12:33 | INFO | Train Epoch: 2 [ 6307840/128008192 (5%)] Data (t): 0.172 Batch (t): 6.458, 2538.02/s, 79.3132/s/gpu LR: 0.000493 Logit Scale: 81.837 Class_loss: 5.0443 (5.1060) Contrastive_loss: 0.69330 (0.69572) Loss: 5.7376 (5.8018)
+ 2025-05-08,02:21:10 | WARNING | Handling webdataset error (OSError('image file is truncated (3 bytes not processed)')). Ignoring.
+ 2025-05-08,02:26:19 | INFO | Train Epoch: 2 [ 8404992/128008192 (7%)] Data (t): 0.171 Batch (t): 6.457, 2534.96/s, 79.2174/s/gpu LR: 0.000487 Logit Scale: 81.845 Class_loss: 5.0779 (5.1004) Contrastive_loss: 0.56305 (0.66918) Loss: 5.6409 (5.7696)
+ 2025-05-08,02:40:07 | INFO | Train Epoch: 2 [ 10502144/128008192 (8%)] Data (t): 0.172 Batch (t): 6.468, 2536.65/s, 79.2703/s/gpu LR: 0.000480 Logit Scale: 81.949 Class_loss: 5.0235 (5.0876) Contrastive_loss: 0.67432 (0.67004) Loss: 5.6978 (5.7576)
+ 2025-05-08,02:53:55 | INFO | Train Epoch: 2 [ 12599296/128008192 (10%)] Data (t): 0.172 Batch (t): 6.466, 2536.52/s, 79.2661/s/gpu LR: 0.000474 Logit Scale: 82.155 Class_loss: 5.0724 (5.0854) Contrastive_loss: 0.57893 (0.65702) Loss: 5.6513 (5.7424)
+ 2025-05-08,03:07:41 | INFO | Train Epoch: 2 [ 14696448/128008192 (11%)] Data (t): 0.174 Batch (t): 6.460, 2531.43/s, 79.1073/s/gpu LR: 0.000467 Logit Scale: 82.325 Class_loss: 5.1322 (5.0913) Contrastive_loss: 0.73022 (0.66617) Loss: 5.8624 (5.7574)
+ 2025-05-08,03:21:29 | INFO | Train Epoch: 2 [ 16793600/128008192 (13%)] Data (t): 0.173 Batch (t): 6.463, 2537.96/s, 79.3111/s/gpu LR: 0.000461 Logit Scale: 82.318 Class_loss: 5.0125 (5.0825) Contrastive_loss: 0.81009 (0.68216) Loss: 5.8226 (5.7647)
+ 2025-05-08,03:30:06 | WARNING | Handling webdataset error (OSError('image file is truncated (7 bytes not processed)')). Ignoring.
+ 2025-05-08,03:35:15 | INFO | Train Epoch: 2 [ 18890752/128008192 (15%)] Data (t): 0.173 Batch (t): 6.458, 2536.59/s, 79.2685/s/gpu LR: 0.000454 Logit Scale: 82.531 Class_loss: 5.1086 (5.0851) Contrastive_loss: 0.65788 (0.67974) Loss: 5.7665 (5.7649)
+ 2025-05-08,03:49:02 | INFO | Train Epoch: 2 [ 20987904/128008192 (16%)] Data (t): 0.173 Batch (t): 6.460, 2535.58/s, 79.2368/s/gpu LR: 0.000447 Logit Scale: 82.637 Class_loss: 5.1280 (5.0890) Contrastive_loss: 0.73041 (0.68434) Loss: 5.8584 (5.7734)
+ 2025-05-08,04:02:49 | INFO | Train Epoch: 2 [ 23085056/128008192 (18%)] Data (t): 0.174 Batch (t): 6.458, 2534.21/s, 79.1942/s/gpu LR: 0.000441 Logit Scale: 82.640 Class_loss: 5.1238 (5.0919) Contrastive_loss: 0.62751 (0.67961) Loss: 5.7513 (5.7715)
+ 2025-05-08,04:05:18 | WARNING | Handling webdataset error (OSError('image file is truncated (82 bytes not processed)')). Ignoring.
+ 2025-05-08,04:16:35 | INFO | Train Epoch: 2 [ 25182208/128008192 (20%)] Data (t): 0.174 Batch (t): 6.458, 2537.17/s, 79.2865/s/gpu LR: 0.000435 Logit Scale: 82.883 Class_loss: 4.9793 (5.0833) Contrastive_loss: 0.89966 (0.69653) Loss: 5.8790 (5.7798)
+ 2025-05-08,04:30:22 | INFO | Train Epoch: 2 [ 27279360/128008192 (21%)] Data (t): 0.175 Batch (t): 6.460, 2541.16/s, 79.4111/s/gpu LR: 0.000428 Logit Scale: 82.977 Class_loss: 5.1451 (5.0877) Contrastive_loss: 0.67111 (0.69472) Loss: 5.8162 (5.7824)
+ 2025-05-08,04:44:09 | INFO | Train Epoch: 2 [ 29376512/128008192 (23%)] Data (t): 0.174 Batch (t): 6.455, 2535.26/s, 79.2270/s/gpu LR: 0.000422 Logit Scale: 83.060 Class_loss: 4.9808 (5.0806) Contrastive_loss: 0.78755 (0.70091) Loss: 5.7684 (5.7815)
+ 2025-05-08,04:57:55 | INFO | Train Epoch: 2 [ 31473664/128008192 (25%)] Data (t): 0.171 Batch (t): 6.456, 2540.54/s, 79.3918/s/gpu LR: 0.000415 Logit Scale: 83.335 Class_loss: 5.0677 (5.0797) Contrastive_loss: 0.71934 (0.70206) Loss: 5.7870 (5.7818)
+ 2025-05-08,05:11:41 | INFO | Train Epoch: 2 [ 33570816/128008192 (26%)] Data (t): 0.170 Batch (t): 6.456, 2537.05/s, 79.2828/s/gpu LR: 0.000409 Logit Scale: 83.305 Class_loss: 5.0477 (5.0779) Contrastive_loss: 0.73805 (0.70418) Loss: 5.7857 (5.7820)
+ 2025-05-08,05:25:28 | INFO | Train Epoch: 2 [ 35667968/128008192 (28%)] Data (t): 0.171 Batch (t): 6.457, 2535.14/s, 79.2230/s/gpu LR: 0.000402 Logit Scale: 83.460 Class_loss: 5.0747 (5.0777) Contrastive_loss: 0.56946 (0.69669) Loss: 5.6441 (5.7744)
+ 2025-05-08,05:39:12 | INFO | Train Epoch: 2 [ 37765120/128008192 (30%)] Data (t): 0.172 Batch (t): 6.439, 2542.42/s, 79.4505/s/gpu LR: 0.000396 Logit Scale: 83.640 Class_loss: 5.0313 (5.0752) Contrastive_loss: 0.78446 (0.70131) Loss: 5.8158 (5.7766)
+ 2025-05-08,05:52:58 | INFO | Train Epoch: 2 [ 39862272/128008192 (31%)] Data (t): 0.172 Batch (t): 6.454, 2538.04/s, 79.3137/s/gpu LR: 0.000389 Logit Scale: 83.800 Class_loss: 5.0229 (5.0726) Contrastive_loss: 0.81314 (0.70690) Loss: 5.8361 (5.7795)
+ 2025-05-08,05:59:08 | WARNING | Handling webdataset error (OSError('image file is truncated (230 bytes not processed)')). Ignoring.
+ 2025-05-08,05:59:33 | WARNING | Handling webdataset error (OSError('image file is truncated (76 bytes not processed)')). Ignoring.
+ 2025-05-08,06:06:44 | INFO | Train Epoch: 2 [ 41959424/128008192 (33%)] Data (t): 0.171 Batch (t): 6.453, 2535.80/s, 79.2438/s/gpu LR: 0.000383 Logit Scale: 84.047 Class_loss: 5.0348 (5.0708) Contrastive_loss: 0.78737 (0.71073) Loss: 5.8222 (5.7816)
+ 2025-05-08,06:09:59 | WARNING | Handling webdataset error (OSError('image file is truncated (12 bytes not processed)')). Ignoring.
+ 2025-05-08,06:20:45 | INFO | Train Epoch: 2 [ 44056576/128008192 (34%)] Data (t): 0.281 Batch (t): 6.567, 2539.98/s, 79.3742/s/gpu LR: 0.000377 Logit Scale: 84.225 Class_loss: 5.0350 (5.0692) Contrastive_loss: 0.53128 (0.70258) Loss: 5.5663 (5.7718)
+ 2025-05-08,06:34:31 | INFO | Train Epoch: 2 [ 46153728/128008192 (36%)] Data (t): 0.172 Batch (t): 6.457, 2538.28/s, 79.3214/s/gpu LR: 0.000370 Logit Scale: 84.313 Class_loss: 5.0768 (5.0695) Contrastive_loss: 0.51690 (0.69450) Loss: 5.5937 (5.7640)
+ 2025-05-08,06:48:18 | INFO | Train Epoch: 2 [ 48250880/128008192 (38%)] Data (t): 0.173 Batch (t): 6.458, 2534.75/s, 79.2111/s/gpu LR: 0.000364 Logit Scale: 84.459 Class_loss: 5.0405 (5.0683) Contrastive_loss: 0.60544 (0.69079) Loss: 5.6459 (5.7591)
+ 2025-05-08,07:02:06 | INFO | Train Epoch: 2 [ 50348032/128008192 (39%)] Data (t): 0.173 Batch (t): 6.472, 2538.27/s, 79.3210/s/gpu LR: 0.000358 Logit Scale: 84.406 Class_loss: 5.0669 (5.0683) Contrastive_loss: 0.57160 (0.68603) Loss: 5.6386 (5.7543)
+ 2025-05-08,07:15:54 | INFO | Train Epoch: 2 [ 52445184/128008192 (41%)] Data (t): 0.172 Batch (t): 6.464, 2538.93/s, 79.3416/s/gpu LR: 0.000352 Logit Scale: 84.603 Class_loss: 4.9465 (5.0636) Contrastive_loss: 0.67371 (0.68555) Loss: 5.6202 (5.7491)
+ 2025-05-08,07:29:40 | INFO | Train Epoch: 2 [ 54542336/128008192 (43%)] Data (t): 0.172 Batch (t): 6.456, 2531.04/s, 79.0948/s/gpu LR: 0.000345 Logit Scale: 84.782 Class_loss: 4.9565 (5.0596) Contrastive_loss: 0.62398 (0.68327) Loss: 5.5805 (5.7429)
+ 2025-05-08,07:43:26 | INFO | Train Epoch: 2 [ 56639488/128008192 (44%)] Data (t): 0.173 Batch (t): 6.455, 2539.80/s, 79.3687/s/gpu LR: 0.000339 Logit Scale: 84.876 Class_loss: 5.0103 (5.0579) Contrastive_loss: 0.68378 (0.68329) Loss: 5.6941 (5.7411)
+ 2025-05-08,07:57:13 | INFO | Train Epoch: 2 [ 58736640/128008192 (46%)] Data (t): 0.173 Batch (t): 6.460, 2535.78/s, 79.2432/s/gpu LR: 0.000333 Logit Scale: 85.054 Class_loss: 5.0557 (5.0578) Contrastive_loss: 0.49891 (0.67693) Loss: 5.5546 (5.7347)
+ 2025-05-08,08:11:01 | INFO | Train Epoch: 2 [ 60833792/128008192 (48%)] Data (t): 0.173 Batch (t): 6.464, 2534.34/s, 79.1982/s/gpu LR: 0.000327 Logit Scale: 85.278 Class_loss: 5.0352 (5.0570) Contrastive_loss: 0.64086 (0.67573) Loss: 5.6761 (5.7328)
+ 2025-05-08,08:24:49 | INFO | Train Epoch: 2 [ 62930944/128008192 (49%)] Data (t): 0.172 Batch (t): 6.474, 2537.84/s, 79.3075/s/gpu LR: 0.000321 Logit Scale: 85.461 Class_loss: 5.0420 (5.0565) Contrastive_loss: 0.64237 (0.67465) Loss: 5.6844 (5.7312)
+ 2025-05-08,08:30:41 | WARNING | Handling webdataset error (OSError('image file is truncated (8 bytes not processed)')). Ignoring.
+ 2025-05-08,08:38:41 | INFO | Train Epoch: 2 [ 65028096/128008192 (51%)] Data (t): 0.172 Batch (t): 6.500, 2521.83/s, 78.8071/s/gpu LR: 0.000315 Logit Scale: 85.535 Class_loss: 4.9550 (5.0534) Contrastive_loss: 0.65615 (0.67407) Loss: 5.6111 (5.7274)
+ 2025-05-08,08:43:54 | WARNING | Handling webdataset error (OSError('image file is truncated (54 bytes not processed)')). Ignoring.
+ 2025-05-08,08:52:33 | INFO | Train Epoch: 2 [ 67125248/128008192 (52%)] Data (t): 0.172 Batch (t): 6.500, 2518.41/s, 78.7004/s/gpu LR: 0.000309 Logit Scale: 85.633 Class_loss: 5.0019 (5.0518) Contrastive_loss: 0.52979 (0.66970) Loss: 5.5317 (5.7215)
+ 2025-05-08,09:06:25 | INFO | Train Epoch: 2 [ 69222400/128008192 (54%)] Data (t): 0.172 Batch (t): 6.500, 2521.03/s, 78.7822/s/gpu LR: 0.000303 Logit Scale: 85.695 Class_loss: 5.0871 (5.0528) Contrastive_loss: 0.65699 (0.66933) Loss: 5.7441 (5.7222)
+ 2025-05-08,09:20:17 | INFO | Train Epoch: 2 [ 71319552/128008192 (56%)] Data (t): 0.171 Batch (t): 6.500, 2516.69/s, 78.6464/s/gpu LR: 0.000297 Logit Scale: 85.889 Class_loss: 4.9729 (5.0506) Contrastive_loss: 0.60391 (0.66746) Loss: 5.5768 (5.7180)
+ 2025-05-08,09:34:09 | INFO | Train Epoch: 2 [ 73416704/128008192 (57%)] Data (t): 0.172 Batch (t): 6.500, 2521.57/s, 78.7991/s/gpu LR: 0.000291 Logit Scale: 85.933 Class_loss: 5.0524 (5.0506) Contrastive_loss: 0.54011 (0.66392) Loss: 5.5925 (5.7145)
+ 2025-05-08,09:41:00 | WARNING | Handling webdataset error (OSError('image file is truncated (34 bytes not processed)')). Ignoring.
+ 2025-05-08,09:48:01 | INFO | Train Epoch: 2 [ 75513856/128008192 (59%)] Data (t): 0.172 Batch (t): 6.499, 2520.94/s, 78.7795/s/gpu LR: 0.000285 Logit Scale: 86.066 Class_loss: 5.0166 (5.0497) Contrastive_loss: 0.58110 (0.66168) Loss: 5.5977 (5.7114)
+ 2025-05-08,10:01:54 | INFO | Train Epoch: 2 [ 77611008/128008192 (61%)] Data (t): 0.171 Batch (t): 6.503, 2514.25/s, 78.5704/s/gpu LR: 0.000279 Logit Scale: 86.173 Class_loss: 5.0296 (5.0492) Contrastive_loss: 0.50575 (0.65758) Loss: 5.5353 (5.7067)
+ 2025-05-08,10:02:01 | WARNING | Handling webdataset error (OSError('image file is truncated (37 bytes not processed)')). Ignoring.
+ 2025-05-08,10:07:13 | WARNING | Handling webdataset error (OSError('image file is truncated (80 bytes not processed)')). Ignoring.
+ 2025-05-08,10:15:46 | INFO | Train Epoch: 2 [ 79708160/128008192 (62%)] Data (t): 0.170 Batch (t): 6.501, 2521.74/s, 78.8042/s/gpu LR: 0.000273 Logit Scale: 86.310 Class_loss: 5.0238 (5.0485) Contrastive_loss: 0.62608 (0.65677) Loss: 5.6498 (5.7053)
+ 2025-05-08,10:28:34 | WARNING | Handling webdataset error (OSError('image file is truncated (60 bytes not processed)')). Ignoring.
+ 2025-05-08,10:29:38 | INFO | Train Epoch: 2 [ 81805312/128008192 (64%)] Data (t): 0.171 Batch (t): 6.499, 2522.69/s, 78.8340/s/gpu LR: 0.000267 Logit Scale: 86.460 Class_loss: 4.9463 (5.0460) Contrastive_loss: 0.86148 (0.66189) Loss: 5.8078 (5.7079)
+ 2025-05-08,10:43:30 | INFO | Train Epoch: 2 [ 83902464/128008192 (66%)] Data (t): 0.172 Batch (t): 6.500, 2522.71/s, 78.8347/s/gpu LR: 0.000261 Logit Scale: 86.624 Class_loss: 5.0320 (5.0456) Contrastive_loss: 0.58933 (0.66012) Loss: 5.6213 (5.7057)
+ 2025-05-08,10:57:22 | INFO | Train Epoch: 2 [ 85999616/128008192 (67%)] Data (t): 0.172 Batch (t): 6.501, 2515.15/s, 78.5983/s/gpu LR: 0.000256 Logit Scale: 86.771 Class_loss: 4.9565 (5.0435) Contrastive_loss: 0.64005 (0.65964) Loss: 5.5966 (5.7031)
+ 2025-05-08,11:11:15 | INFO | Train Epoch: 2 [ 88096768/128008192 (69%)] Data (t): 0.172 Batch (t): 6.510, 2517.77/s, 78.6803/s/gpu LR: 0.000250 Logit Scale: 86.974 Class_loss: 4.9269 (5.0408) Contrastive_loss: 0.71927 (0.66103) Loss: 5.6462 (5.7018)
+ 2025-05-08,11:25:08 | INFO | Train Epoch: 2 [ 90193920/128008192 (70%)] Data (t): 0.174 Batch (t): 6.507, 2521.93/s, 78.8104/s/gpu LR: 0.000244 Logit Scale: 87.147 Class_loss: 4.9681 (5.0391) Contrastive_loss: 0.65707 (0.66094) Loss: 5.6252 (5.7001)
+ 2025-05-08,11:31:45 | WARNING | Handling webdataset error (OSError('image file is truncated (16 bytes not processed)')). Ignoring.
+ 2025-05-08,11:39:00 | INFO | Train Epoch: 2 [ 92291072/128008192 (72%)] Data (t): 0.172 Batch (t): 6.504, 2521.02/s, 78.7818/s/gpu LR: 0.000239 Logit Scale: 87.338 Class_loss: 4.8998 (5.0360) Contrastive_loss: 0.78526 (0.66370) Loss: 5.6851 (5.6997)
+ 2025-05-08,11:52:52 | INFO | Train Epoch: 2 [ 94388224/128008192 (74%)] Data (t): 0.173 Batch (t): 6.501, 2522.01/s, 78.8128/s/gpu LR: 0.000233 Logit Scale: 87.378 Class_loss: 4.9545 (5.0343) Contrastive_loss: 0.67425 (0.66393) Loss: 5.6287 (5.6982)
+ 2025-05-08,12:06:45 | INFO | Train Epoch: 2 [ 96485376/128008192 (75%)] Data (t): 0.173 Batch (t): 6.500, 2522.05/s, 78.8142/s/gpu LR: 0.000228 Logit Scale: 87.604 Class_loss: 4.9647 (5.0328) Contrastive_loss: 0.58921 (0.66234) Loss: 5.5539 (5.6951)
+ 2025-05-08,12:20:37 | INFO | Train Epoch: 2 [ 98582528/128008192 (77%)] Data (t): 0.173 Batch (t): 6.500, 2515.79/s, 78.6185/s/gpu LR: 0.000222 Logit Scale: 87.685 Class_loss: 4.9953 (5.0320) Contrastive_loss: 0.66281 (0.66235) Loss: 5.6581 (5.6944)
+ 2025-05-08,12:34:29 | INFO | Train Epoch: 2 [100679680/128008192 (79%)] Data (t): 0.173 Batch (t): 6.505, 2520.38/s, 78.7618/s/gpu LR: 0.000217 Logit Scale: 87.842 Class_loss: 4.9383 (5.0301) Contrastive_loss: 0.68552 (0.66282) Loss: 5.6238 (5.6929)
+ 2025-05-08,12:48:22 | INFO | Train Epoch: 2 [102776832/128008192 (80%)] Data (t): 0.173 Batch (t): 6.503, 2516.28/s, 78.6339/s/gpu LR: 0.000211 Logit Scale: 87.976 Class_loss: 4.9951 (5.0294) Contrastive_loss: 0.53852 (0.66034) Loss: 5.5336 (5.6897)
+ 2025-05-08,13:02:14 | INFO | Train Epoch: 2 [104873984/128008192 (82%)] Data (t): 0.173 Batch (t): 6.502, 2522.11/s, 78.8160/s/gpu LR: 0.000206 Logit Scale: 88.178 Class_loss: 5.0601 (5.0300) Contrastive_loss: 0.60612 (0.65927) Loss: 5.6662 (5.6893)
+ 2025-05-08,13:16:06 | INFO | Train Epoch: 2 [106971136/128008192 (84%)] Data (t): 0.174 Batch (t): 6.504, 2520.28/s, 78.7586/s/gpu LR: 0.000201 Logit Scale: 88.339 Class_loss: 4.9656 (5.0288) Contrastive_loss: 0.77804 (0.66156) Loss: 5.7436 (5.6903)
+ 2025-05-08,13:29:59 | INFO | Train Epoch: 2 [109068288/128008192 (85%)] Data (t): 0.173 Batch (t): 6.502, 2517.84/s, 78.6824/s/gpu LR: 0.000196 Logit Scale: 88.460 Class_loss: 4.8914 (5.0262) Contrastive_loss: 0.75832 (0.66338) Loss: 5.6497 (5.6895)
+ 2025-05-08,13:40:37 | WARNING | Handling webdataset error (OSError('image file is truncated (80 bytes not processed)')). Ignoring.
+ 2025-05-08,13:43:51 | INFO | Train Epoch: 2 [111165440/128008192 (87%)] Data (t): 0.172 Batch (t): 6.502, 2520.77/s, 78.7741/s/gpu LR: 0.000190 Logit Scale: 88.642 Class_loss: 4.9289 (5.0244) Contrastive_loss: 0.61847 (0.66255) Loss: 5.5474 (5.6869)
+ 2025-05-08,13:53:06 | WARNING | Handling webdataset error (OSError('image file is truncated (19 bytes not processed)')). Ignoring.
+ 2025-05-08,13:57:43 | INFO | Train Epoch: 2 [113262592/128008192 (88%)] Data (t): 0.173 Batch (t): 6.503, 2521.84/s, 78.8074/s/gpu LR: 0.000185 Logit Scale: 88.778 Class_loss: 4.9495 (5.0230) Contrastive_loss: 0.62458 (0.66186) Loss: 5.5741 (5.6849)
+ 2025-05-08,14:11:36 | INFO | Train Epoch: 2 [115359744/128008192 (90%)] Data (t): 0.172 Batch (t): 6.502, 2522.52/s, 78.8287/s/gpu LR: 0.000180 Logit Scale: 88.838 Class_loss: 4.8618 (5.0201) Contrastive_loss: 0.59096 (0.66060) Loss: 5.4528 (5.6807)
+ 2025-05-08,14:16:35 | WARNING | Handling webdataset error (OSError('image file is truncated (8 bytes not processed)')). Ignoring.
+ 2025-05-08,14:25:28 | INFO | Train Epoch: 2 [117456896/128008192 (92%)] Data (t): 0.173 Batch (t): 6.504, 2522.05/s, 78.8142/s/gpu LR: 0.000175 Logit Scale: 89.047 Class_loss: 5.0108 (5.0200) Contrastive_loss: 0.53683 (0.65842) Loss: 5.5476 (5.6784)
+ 2025-05-08,14:39:21 | INFO | Train Epoch: 2 [119554048/128008192 (93%)] Data (t): 0.172 Batch (t): 6.505, 2514.61/s, 78.5817/s/gpu LR: 0.000170 Logit Scale: 89.107 Class_loss: 4.9372 (5.0185) Contrastive_loss: 0.81388 (0.66110) Loss: 5.7510 (5.6796)
+ 2025-05-08,14:53:14 | INFO | Train Epoch: 2 [121651200/128008192 (95%)] Data (t): 0.173 Batch (t): 6.505, 2522.43/s, 78.8261/s/gpu LR: 0.000165 Logit Scale: 89.221 Class_loss: 4.9516 (5.0174) Contrastive_loss: 0.62872 (0.66056) Loss: 5.5803 (5.6780)
+ 2025-05-08,15:00:17 | WARNING | Handling webdataset error (OSError('image file is truncated (96 bytes not processed)')). Ignoring.
+ 2025-05-08,15:07:06 | INFO | Train Epoch: 2 [123748352/128008192 (97%)] Data (t): 0.173 Batch (t): 6.506, 2519.84/s, 78.7451/s/gpu LR: 0.000161 Logit Scale: 89.409 Class_loss: 4.8393 (5.0144) Contrastive_loss: 0.81049 (0.66305) Loss: 5.6498 (5.6775)
+ 2025-05-08,15:20:59 | INFO | Train Epoch: 2 [125845504/128008192 (98%)] Data (t): 0.173 Batch (t): 6.504, 2518.75/s, 78.7110/s/gpu LR: 0.000156 Logit Scale: 89.480 Class_loss: 4.9603 (5.0135) Contrastive_loss: 0.66749 (0.66313) Loss: 5.6278 (5.6767)
+ 2025-05-08,15:25:01 | WARNING | Handling webdataset error (OSError('image file is truncated (29 bytes not processed)')). Ignoring.
+ 2025-05-08,15:34:53 | INFO | Train Epoch: 2 [127942656/128008192 (100%)] Data (t): 0.172 Batch (t): 6.514, 2519.92/s, 78.7475/s/gpu LR: 0.000151 Logit Scale: 89.604 Class_loss: 4.9684 (5.0128) Contrastive_loss: 0.60321 (0.66216) Loss: 5.5716 (5.6750)
+ 2025-05-08,15:35:19 | INFO | Train Epoch: 2 [128008192/128008192 (100%)] Data (t): 0.185 Batch (t): 6.510, 2518.22/s, 78.6942/s/gpu LR: 0.000151 Logit Scale: 89.606 Class_loss: 5.0344 (5.0132) Contrastive_loss: 0.57158 (0.66072) Loss: 5.6060 (5.6739)
+ 2025-05-08,15:35:37 | INFO | Start epoch 3
+ 2025-05-08,15:35:48 | INFO | Train Epoch: 3 [ 16384/128008192 (0%)] Data (t): 4.694 Batch (t): 10.881, 1505.78/s, 47.0555/s/gpu LR: 0.000151 Logit Scale: 89.606 Class_loss: 4.9431 (4.9431) Contrastive_loss: 0.66665 (0.66665) Loss: 5.6097 (5.6097)
+ 2025-05-08,15:49:43 | INFO | Train Epoch: 3 [ 2113536/128008192 (2%)] Data (t): 0.174 Batch (t): 6.517, 2512.54/s, 78.5168/s/gpu LR: 0.000146 Logit Scale: 89.970 Class_loss: 4.8583 (4.9007) Contrastive_loss: 0.61650 (0.64158) Loss: 5.4748 (5.5423)
+ 2025-05-08,16:03:35 | INFO | Train Epoch: 3 [ 4210688/128008192 (3%)] Data (t): 0.173 Batch (t): 6.506, 2523.73/s, 78.8667/s/gpu LR: 0.000142 Logit Scale: 90.250 Class_loss: 4.8939 (4.8984) Contrastive_loss: 0.66637 (0.64984) Loss: 5.5603 (5.5483)
+ 2025-05-08,16:17:28 | INFO | Train Epoch: 3 [ 6307840/128008192 (5%)] Data (t): 0.172 Batch (t): 6.503, 2520.04/s, 78.7514/s/gpu LR: 0.000137 Logit Scale: 90.482 Class_loss: 4.8564 (4.8879) Contrastive_loss: 0.71684 (0.66659) Loss: 5.5733 (5.5545)
+ 2025-05-08,16:28:44 | WARNING | Handling webdataset error (OSError('image file is truncated (26 bytes not processed)')). Ignoring.
+ 2025-05-08,16:31:20 | INFO | Train Epoch: 3 [ 8404992/128008192 (7%)] Data (t): 0.172 Batch (t): 6.500, 2511.10/s, 78.4718/s/gpu LR: 0.000133 Logit Scale: 90.567 Class_loss: 4.8821 (4.8868) Contrastive_loss: 0.52018 (0.63731) Loss: 5.4023 (5.5241)
+ 2025-05-08,16:45:11 | INFO | Train Epoch: 3 [ 10502144/128008192 (8%)] Data (t): 0.173 Batch (t): 6.496, 2555.92/s, 79.8724/s/gpu LR: 0.000128 Logit Scale: 90.706 Class_loss: 4.7122 (4.8577) Contrastive_loss: 0.65534 (0.64031) Loss: 5.3675 (5.4980)
+ 2025-05-08,16:58:51 | INFO | Train Epoch: 3 [ 12599296/128008192 (10%)] Data (t): 0.173 Batch (t): 6.405, 2558.85/s, 79.9640/s/gpu LR: 0.000124 Logit Scale: 90.908 Class_loss: 4.8733 (4.8599) Contrastive_loss: 0.59795 (0.63426) Loss: 5.4712 (5.4941)
+ 2025-05-08,17:11:16 | WARNING | Handling webdataset error (OSError('broken data stream when reading image file')). Ignoring.
+ 2025-05-08,17:12:31 | INFO | Train Epoch: 3 [ 14696448/128008192 (11%)] Data (t): 0.173 Batch (t): 6.404, 2557.92/s, 79.9351/s/gpu LR: 0.000120 Logit Scale: 91.139 Class_loss: 4.8800 (4.8624) Contrastive_loss: 0.51033 (0.61877) Loss: 5.3904 (5.4812)
+ 2025-05-08,17:26:24 | INFO | Train Epoch: 3 [ 16793600/128008192 (13%)] Data (t): 0.276 Batch (t): 6.511, 2547.53/s, 79.6102/s/gpu LR: 0.000116 Logit Scale: 91.324 Class_loss: 4.7195 (4.8465) Contrastive_loss: 0.74766 (0.63309) Loss: 5.4672 (5.4796)
+ 2025-05-08,17:40:04 | INFO | Train Epoch: 3 [ 18890752/128008192 (15%)] Data (t): 0.174 Batch (t): 6.407, 2561.53/s, 80.0478/s/gpu LR: 0.000111 Logit Scale: 91.449 Class_loss: 4.8508 (4.8470) Contrastive_loss: 0.65068 (0.63485) Loss: 5.5015 (5.4818)
+ 2025-05-08,17:53:44 | INFO | Train Epoch: 3 [ 20987904/128008192 (16%)] Data (t): 0.173 Batch (t): 6.405, 2559.23/s, 79.9759/s/gpu LR: 0.000107 Logit Scale: 91.633 Class_loss: 4.8085 (4.8435) Contrastive_loss: 0.62785 (0.63421) Loss: 5.4364 (5.4777)
+ 2025-05-08,18:07:24 | INFO | Train Epoch: 3 [ 23085056/128008192 (18%)] Data (t): 0.173 Batch (t): 6.407, 2558.08/s, 79.9401/s/gpu LR: 0.000103 Logit Scale: 91.748 Class_loss: 4.9189 (4.8498) Contrastive_loss: 0.55661 (0.62775) Loss: 5.4755 (5.4775)
+ 2025-05-08,18:21:04 | INFO | Train Epoch: 3 [ 25182208/128008192 (20%)] Data (t): 0.172 Batch (t): 6.406, 2559.72/s, 79.9913/s/gpu LR: 0.000099 Logit Scale: 91.910 Class_loss: 4.8678 (4.8511) Contrastive_loss: 0.65234 (0.62964) Loss: 5.5201 (5.4808)
+ 2025-05-08,18:34:44 | INFO | Train Epoch: 3 [ 27279360/128008192 (21%)] Data (t): 0.173 Batch (t): 6.407, 2555.69/s, 79.8652/s/gpu LR: 0.000095 Logit Scale: 92.029 Class_loss: 4.9491 (4.8581) Contrastive_loss: 0.54874 (0.62386) Loss: 5.4979 (5.4820)
+ 2025-05-08,18:45:19 | WARNING | Handling webdataset error (OSError('image file is truncated (12 bytes not processed)')). Ignoring.
+ 2025-05-08,18:48:24 | INFO | Train Epoch: 3 [ 29376512/128008192 (23%)] Data (t): 0.173 Batch (t): 6.405, 2561.47/s, 80.0460/s/gpu LR: 0.000092 Logit Scale: 92.214 Class_loss: 4.8274 (4.8561) Contrastive_loss: 0.77421 (0.63388) Loss: 5.6017 (5.4900)
+ 2025-05-08,18:52:22 | WARNING | Handling webdataset error (OSError('image file is truncated (9 bytes not processed)')). Ignoring.
+ 2025-05-08,18:59:11 | WARNING | Handling webdataset error (OSError('image file is truncated (17 bytes not processed)')). Ignoring.
+ 2025-05-08,19:01:34 | WARNING | Handling webdataset error (OSError('image file is truncated (4 bytes not processed)')). Ignoring.
+ 2025-05-08,19:02:04 | INFO | Train Epoch: 3 [ 31473664/128008192 (25%)] Data (t): 0.172 Batch (t): 6.406, 2557.10/s, 79.9092/s/gpu LR: 0.000088 Logit Scale: 92.390 Class_loss: 4.9488 (4.8619) Contrastive_loss: 0.52545 (0.62711) Loss: 5.4743 (5.4890)
+ 2025-05-08,19:15:44 | INFO | Train Epoch: 3 [ 33570816/128008192 (26%)] Data (t): 0.173 Batch (t): 6.407, 2551.64/s, 79.7387/s/gpu LR: 0.000084 Logit Scale: 92.586 Class_loss: 4.9083 (4.8646) Contrastive_loss: 0.55629 (0.62294) Loss: 5.4646 (5.4876)
+ 2025-05-08,19:20:09 | WARNING | Handling webdataset error (OSError('image file is truncated (66 bytes not processed)')). Ignoring.
+ 2025-05-08,19:29:24 | INFO | Train Epoch: 3 [ 35667968/128008192 (28%)] Data (t): 0.173 Batch (t): 6.404, 2560.35/s, 80.0109/s/gpu LR: 0.000081 Logit Scale: 92.768 Class_loss: 4.9006 (4.8666) Contrastive_loss: 0.54747 (0.61875) Loss: 5.4480 (5.4854)
+ 2025-05-08,19:43:03 | INFO | Train Epoch: 3 [ 37765120/128008192 (30%)] Data (t): 0.172 Batch (t): 6.403, 2559.06/s, 79.9706/s/gpu LR: 0.000077 Logit Scale: 92.911 Class_loss: 4.7921 (4.8627) Contrastive_loss: 0.68793 (0.62239) Loss: 5.4801 (5.4851)
+ 2025-05-08,19:56:45 | INFO | Train Epoch: 3 [ 39862272/128008192 (31%)] Data (t): 0.173 Batch (t): 6.421, 2560.57/s, 80.0178/s/gpu LR: 0.000074 Logit Scale: 93.054 Class_loss: 4.8285 (4.8610) Contrastive_loss: 0.66487 (0.62451) Loss: 5.4934 (5.4855)
+ 2025-05-08,20:10:26 | INFO | Train Epoch: 3 [ 41959424/128008192 (33%)] Data (t): 0.173 Batch (t): 6.411, 2557.55/s, 79.9235/s/gpu LR: 0.000070 Logit Scale: 93.222 Class_loss: 4.8765 (4.8617) Contrastive_loss: 0.64478 (0.62548) Loss: 5.5213 (5.4872)
+ 2025-05-08,20:11:00 | WARNING | Handling webdataset error (OSError('image file is truncated (54 bytes not processed)')). Ignoring.
+ 2025-05-08,20:24:06 | INFO | Train Epoch: 3 [ 44056576/128008192 (34%)] Data (t): 0.172 Batch (t): 6.402, 2562.62/s, 80.0817/s/gpu LR: 0.000067 Logit Scale: 93.322 Class_loss: 4.8363 (4.8606) Contrastive_loss: 0.70944 (0.62929) Loss: 5.5457 (5.4899)
+ 2025-05-08,20:35:40 | WARNING | Handling webdataset error (OSError('image file is truncated (48 bytes not processed)')). Ignoring.
+ 2025-05-08,20:37:46 | INFO | Train Epoch: 3 [ 46153728/128008192 (36%)] Data (t): 0.173 Batch (t): 6.407, 2561.62/s, 80.0507/s/gpu LR: 0.000064 Logit Scale: 93.472 Class_loss: 4.8965 (4.8621) Contrastive_loss: 0.65980 (0.63062) Loss: 5.5563 (5.4927)
+ 2025-05-08,20:51:25 | INFO | Train Epoch: 3 [ 48250880/128008192 (38%)] Data (t): 0.172 Batch (t): 6.404, 2553.65/s, 79.8017/s/gpu LR: 0.000061 Logit Scale: 93.577 Class_loss: 4.8316 (4.8609) Contrastive_loss: 0.68649 (0.63295) Loss: 5.5181 (5.4938)
+ 2025-05-08,21:01:16 | WARNING | Handling webdataset error (OSError('image file is truncated (152 bytes not processed)')). Ignoring.
+ 2025-05-08,21:05:05 | INFO | Train Epoch: 3 [ 50348032/128008192 (39%)] Data (t): 0.173 Batch (t): 6.405, 2555.58/s, 79.8620/s/gpu LR: 0.000058 Logit Scale: 93.746 Class_loss: 4.9402 (4.8640) Contrastive_loss: 0.68207 (0.63491) Loss: 5.6223 (5.4989)
+ 2025-05-08,21:18:45 | INFO | Train Epoch: 3 [ 52445184/128008192 (41%)] Data (t): 0.172 Batch (t): 6.403, 2556.92/s, 79.9039/s/gpu LR: 0.000055 Logit Scale: 93.842 Class_loss: 4.7998 (4.8616) Contrastive_loss: 0.58303 (0.63292) Loss: 5.3828 (5.4945)
+ 2025-05-08,21:32:24 | INFO | Train Epoch: 3 [ 54542336/128008192 (43%)] Data (t): 0.172 Batch (t): 6.404, 2553.92/s, 79.8101/s/gpu LR: 0.000052 Logit Scale: 93.971 Class_loss: 4.7560 (4.8577) Contrastive_loss: 0.59639 (0.63157) Loss: 5.3524 (5.4892)
+ 2025-05-08,21:46:04 | INFO | Train Epoch: 3 [ 56639488/128008192 (44%)] Data (t): 0.172 Batch (t): 6.404, 2563.12/s, 80.0974/s/gpu LR: 0.000049 Logit Scale: 94.106 Class_loss: 4.7867 (4.8551) Contrastive_loss: 0.92803 (0.64215) Loss: 5.7147 (5.4973)
+ 2025-05-08,21:56:20 | WARNING | Handling webdataset error (OSError('image file is truncated (2 bytes not processed)')). Ignoring.
+ 2025-05-08,21:59:44 | INFO | Train Epoch: 3 [ 58736640/128008192 (46%)] Data (t): 0.174 Batch (t): 6.406, 2561.98/s, 80.0619/s/gpu LR: 0.000046 Logit Scale: 94.269 Class_loss: 4.8507 (4.8550) Contrastive_loss: 0.52894 (0.63825) Loss: 5.3797 (5.4932)
+ 2025-05-08,22:04:33 | WARNING | Handling webdataset error (OSError('image file is truncated (31 bytes not processed)')). Ignoring.
+ 2025-05-08,22:13:25 | INFO | Train Epoch: 3 [ 60833792/128008192 (48%)] Data (t): 0.175 Batch (t): 6.411, 2535.23/s, 79.2260/s/gpu LR: 0.000043 Logit Scale: 94.372 Class_loss: 4.9173 (4.8570) Contrastive_loss: 0.51911 (0.63428) Loss: 5.4364 (5.4913)
+ 2025-05-08,22:27:05 | INFO | Train Epoch: 3 [ 62930944/128008192 (49%)] Data (t): 0.172 Batch (t): 6.406, 2560.11/s, 80.0036/s/gpu LR: 0.000041 Logit Scale: 94.498 Class_loss: 4.7704 (4.8542) Contrastive_loss: 0.90774 (0.64310) Loss: 5.6781 (5.4973)
+ 2025-05-08,22:40:47 | INFO | Train Epoch: 3 [ 65028096/128008192 (51%)] Data (t): 0.171 Batch (t): 6.425, 2559.41/s, 79.9816/s/gpu LR: 0.000038 Logit Scale: 94.595 Class_loss: 4.7880 (4.8522) Contrastive_loss: 0.61651 (0.64227) Loss: 5.4045 (5.4944)
+ 2025-05-08,22:42:51 | WARNING | Handling webdataset error (OSError('image file is truncated (92 bytes not processed)')). Ignoring.
+ 2025-05-08,22:54:27 | INFO | Train Epoch: 3 [ 67125248/128008192 (52%)] Data (t): 0.171 Batch (t): 6.404, 2558.75/s, 79.9609/s/gpu LR: 0.000036 Logit Scale: 94.686 Class_loss: 4.7788 (4.8500) Contrastive_loss: 0.72898 (0.64490) Loss: 5.5077 (5.4948)
+ 2025-05-08,23:08:09 | INFO | Train Epoch: 3 [ 69222400/128008192 (54%)] Data (t): 0.172 Batch (t): 6.424, 2519.33/s, 78.7292/s/gpu LR: 0.000033 Logit Scale: 94.791 Class_loss: 4.8959 (4.8513) Contrastive_loss: 0.62447 (0.64430) Loss: 5.5204 (5.4956)
+ 2025-05-08,23:12:23 | WARNING | Handling webdataset error (OSError('image file is truncated (123 bytes not processed)')). Ignoring.
+ 2025-05-08,23:21:53 | INFO | Train Epoch: 3 [ 71319552/128008192 (56%)] Data (t): 0.172 Batch (t): 6.435, 2555.86/s, 79.8708/s/gpu LR: 0.000031 Logit Scale: 94.898 Class_loss: 4.8935 (4.8525) Contrastive_loss: 0.41942 (0.63787) Loss: 5.3130 (5.4904)
+ 2025-05-08,23:35:33 | INFO | Train Epoch: 3 [ 73416704/128008192 (57%)] Data (t): 0.172 Batch (t): 6.406, 2554.53/s, 79.8292/s/gpu LR: 0.000029 Logit Scale: 94.944 Class_loss: 4.8250 (4.8517) Contrastive_loss: 0.59713 (0.63674) Loss: 5.4221 (5.4885)
+ 2025-05-08,23:49:13 | INFO | Train Epoch: 3 [ 75513856/128008192 (59%)] Data (t): 0.173 Batch (t): 6.405, 2561.45/s, 80.0453/s/gpu LR: 0.000027 Logit Scale: 95.030 Class_loss: 4.8647 (4.8521) Contrastive_loss: 0.67720 (0.63783) Loss: 5.5419 (5.4899)
+ 2025-05-08,23:56:29 | WARNING | Handling webdataset error (OSError('image file is truncated (27 bytes not processed)')). Ignoring.
+ 2025-05-09,00:02:54 | INFO | Train Epoch: 3 [ 77611008/128008192 (61%)] Data (t): 0.171 Batch (t): 6.416, 2546.04/s, 79.5639/s/gpu LR: 0.000025 Logit Scale: 95.103 Class_loss: 4.8493 (4.8520) Contrastive_loss: 0.38189 (0.63110) Loss: 5.2312 (5.4831)
+ 2025-05-09,00:16:35 | INFO | Train Epoch: 3 [ 79708160/128008192 (62%)] Data (t): 0.171 Batch (t): 6.418, 2431.38/s, 75.9806/s/gpu LR: 0.000023 Logit Scale: 95.193 Class_loss: 4.8601 (4.8522) Contrastive_loss: 0.61590 (0.63071) Loss: 5.4761 (5.4829)
+ 2025-05-09,00:30:16 | INFO | Train Epoch: 3 [ 81805312/128008192 (64%)] Data (t): 0.173 Batch (t): 6.410, 2543.84/s, 79.4949/s/gpu LR: 0.000021 Logit Scale: 95.259 Class_loss: 4.7957 (4.8508) Contrastive_loss: 0.71503 (0.63282) Loss: 5.5108 (5.4836)
+ 2025-05-09,00:31:15 | WARNING | Handling webdataset error (OSError('image file is truncated (2 bytes not processed)')). Ignoring.
+ 2025-05-09,00:38:41 | WARNING | Handling webdataset error (OSError('image file is truncated (7 bytes not processed)')). Ignoring.
+ 2025-05-09,00:44:02 | INFO | Train Epoch: 3 [ 83902464/128008192 (66%)] Data (t): 0.173 Batch (t): 6.458, 2537.66/s, 79.3020/s/gpu LR: 0.000019 Logit Scale: 95.342 Class_loss: 4.7997 (4.8496) Contrastive_loss: 0.66730 (0.63366) Loss: 5.4670 (5.4832)
+ 2025-05-09,00:51:29 | WARNING | Handling webdataset error (OSError('image file is truncated (92 bytes not processed)')). Ignoring.
+ 2025-05-09,00:57:49 | INFO | Train Epoch: 3 [ 85999616/128008192 (67%)] Data (t): 0.173 Batch (t): 6.459, 2537.13/s, 79.2855/s/gpu LR: 0.000017 Logit Scale: 95.418 Class_loss: 4.7870 (4.8481) Contrastive_loss: 0.61783 (0.63328) Loss: 5.4049 (5.4814)
+ 2025-05-09,01:11:36 | INFO | Train Epoch: 3 [ 88096768/128008192 (69%)] Data (t): 0.173 Batch (t): 6.461, 2534.27/s, 79.1959/s/gpu LR: 0.000015 Logit Scale: 95.481 Class_loss: 4.8524 (4.8482) Contrastive_loss: 0.53229 (0.63093) Loss: 5.3847 (5.4791)
+ 2025-05-09,01:25:23 | INFO | Train Epoch: 3 [ 90193920/128008192 (70%)] Data (t): 0.172 Batch (t): 6.462, 2536.56/s, 79.2675/s/gpu LR: 0.000014 Logit Scale: 95.544 Class_loss: 4.8644 (4.8486) Contrastive_loss: 0.38095 (0.62525) Loss: 5.2453 (5.4738)
+ 2025-05-09,01:39:10 | INFO | Train Epoch: 3 [ 92291072/128008192 (72%)] Data (t): 0.173 Batch (t): 6.462, 2534.97/s, 79.2179/s/gpu LR: 0.000012 Logit Scale: 95.603 Class_loss: 4.7671 (4.8467) Contrastive_loss: 0.61982 (0.62513) Loss: 5.3870 (5.4719)
+ 2025-05-09,01:52:58 | INFO | Train Epoch: 3 [ 94388224/128008192 (74%)] Data (t): 0.172 Batch (t): 6.462, 2535.10/s, 79.2219/s/gpu LR: 0.000011 Logit Scale: 95.649 Class_loss: 4.7616 (4.8449) Contrastive_loss: 0.61123 (0.62483) Loss: 5.3728 (5.4697)
+ 2025-05-09,02:06:45 | INFO | Train Epoch: 3 [ 96485376/128008192 (75%)] Data (t): 0.173 Batch (t): 6.461, 2529.12/s, 79.0350/s/gpu LR: 0.000010 Logit Scale: 95.693 Class_loss: 4.7906 (4.8437) Contrastive_loss: 0.64479 (0.62525) Loss: 5.4354 (5.4690)
+ 2025-05-09,02:20:32 | INFO | Train Epoch: 3 [ 98582528/128008192 (77%)] Data (t): 0.173 Batch (t): 6.461, 2534.81/s, 79.2127/s/gpu LR: 0.000008 Logit Scale: 95.738 Class_loss: 4.7863 (4.8425) Contrastive_loss: 0.52390 (0.62314) Loss: 5.3102 (5.4657)
+ 2025-05-09,02:34:16 | INFO | Train Epoch: 3 [100679680/128008192 (79%)] Data (t): 0.173 Batch (t): 6.438, 2551.20/s, 79.7249/s/gpu LR: 0.000007 Logit Scale: 95.767 Class_loss: 4.7517 (4.8407) Contrastive_loss: 0.71686 (0.62505) Loss: 5.4686 (5.4657)
+ 2025-05-09,02:47:56 | INFO | Train Epoch: 3 [102776832/128008192 (80%)] Data (t): 0.172 Batch (t): 6.407, 2560.28/s, 80.0089/s/gpu LR: 0.000006 Logit Scale: 95.790 Class_loss: 4.7787 (4.8394) Contrastive_loss: 0.56251 (0.62380) Loss: 5.3412 (5.4632)
+ 2025-05-09,03:01:36 | INFO | Train Epoch: 3 [104873984/128008192 (82%)] Data (t): 0.172 Batch (t): 6.405, 2560.70/s, 80.0219/s/gpu LR: 0.000005 Logit Scale: 95.817 Class_loss: 4.7619 (4.8379) Contrastive_loss: 0.63748 (0.62407) Loss: 5.3994 (5.4620)
+ 2025-05-09,03:15:16 | INFO | Train Epoch: 3 [106971136/128008192 (84%)] Data (t): 0.172 Batch (t): 6.407, 2558.22/s, 79.9443/s/gpu LR: 0.000004 Logit Scale: 95.828 Class_loss: 4.7743 (4.8367) Contrastive_loss: 0.63111 (0.62421) Loss: 5.4054 (5.4609)
+ 2025-05-09,03:28:58 | INFO | Train Epoch: 3 [109068288/128008192 (85%)] Data (t): 0.172 Batch (t): 6.424, 2533.91/s, 79.1848/s/gpu LR: 0.000003 Logit Scale: 95.843 Class_loss: 4.8463 (4.8369) Contrastive_loss: 0.45318 (0.62098) Loss: 5.2995 (5.4579)
+ 2025-05-09,03:42:39 | WARNING | Handling webdataset error (OSError('image file is truncated (58 bytes not processed)')). Ignoring.
+ 2025-05-09,03:42:45 | INFO | Train Epoch: 3 [111165440/128008192 (87%)] Data (t): 0.171 Batch (t): 6.461, 2538.75/s, 79.3359/s/gpu LR: 0.000003 Logit Scale: 95.854 Class_loss: 4.8775 (4.8376) Contrastive_loss: 0.42800 (0.61740) Loss: 5.3055 (5.4550)
+ 2025-05-09,03:56:32 | INFO | Train Epoch: 3 [113262592/128008192 (88%)] Data (t): 0.172 Batch (t): 6.462, 2538.90/s, 79.3406/s/gpu LR: 0.000002 Logit Scale: 95.862 Class_loss: 4.8906 (4.8386) Contrastive_loss: 0.52970 (0.61581) Loss: 5.4203 (5.4544)
+ 2025-05-09,04:10:19 | INFO | Train Epoch: 3 [115359744/128008192 (90%)] Data (t): 0.173 Batch (t): 6.460, 2535.02/s, 79.2193/s/gpu LR: 0.000002 Logit Scale: 95.868 Class_loss: 4.7699 (4.8374) Contrastive_loss: 0.71461 (0.61757) Loss: 5.4845 (5.4549)
+ 2025-05-09,04:12:37 | WARNING | Handling webdataset error (OSError('image file is truncated (6 bytes not processed)')). Ignoring.
+ 2025-05-09,04:24:18 | INFO | Train Epoch: 3 [117456896/128008192 (92%)] Data (t): 0.260 Batch (t): 6.557, 2539.27/s, 79.3522/s/gpu LR: 0.000001 Logit Scale: 95.871 Class_loss: 4.7771 (4.8363) Contrastive_loss: 0.61972 (0.61761) Loss: 5.3968 (5.4539)
+ 2025-05-09,04:38:07 | INFO | Train Epoch: 3 [119554048/128008192 (93%)] Data (t): 0.173 Batch (t): 6.473, 2531.70/s, 79.1157/s/gpu LR: 0.000001 Logit Scale: 95.873 Class_loss: 4.7592 (4.8350) Contrastive_loss: 0.68358 (0.61875) Loss: 5.4427 (5.4537)
+ 2025-05-09,04:51:54 | INFO | Train Epoch: 3 [121651200/128008192 (95%)] Data (t): 0.172 Batch (t): 6.462, 2535.13/s, 79.2228/s/gpu LR: 0.000000 Logit Scale: 95.873 Class_loss: 4.8213 (4.8347) Contrastive_loss: 0.61127 (0.61862) Loss: 5.4325 (5.4534)
+ 2025-05-09,05:05:17 | WARNING | Handling webdataset error (OSError('image file is truncated (31 bytes not processed)')). Ignoring.
+ 2025-05-09,05:05:41 | INFO | Train Epoch: 3 [123748352/128008192 (97%)] Data (t): 0.173 Batch (t): 6.460, 2536.31/s, 79.2596/s/gpu LR: 0.000000 Logit Scale: 95.873 Class_loss: 4.7594 (4.8335) Contrastive_loss: 0.68434 (0.61972) Loss: 5.4437 (5.4532)
+ 2025-05-09,05:19:28 | INFO | Train Epoch: 3 [125845504/128008192 (98%)] Data (t): 0.174 Batch (t): 6.459, 2536.43/s, 79.2635/s/gpu LR: 0.000000 Logit Scale: 95.873 Class_loss: 4.8925 (4.8345) Contrastive_loss: 0.52590 (0.61818) Loss: 5.4184 (5.4526)
+ 2025-05-09,05:33:14 | INFO | Train Epoch: 3 [127942656/128008192 (100%)] Data (t): 0.174 Batch (t): 6.460, 2534.29/s, 79.1965/s/gpu LR: 0.000000 Logit Scale: 95.873 Class_loss: 4.8458 (4.8346) Contrastive_loss: 0.60484 (0.61796) Loss: 5.4507 (5.4526)
+ 2025-05-09,05:33:40 | INFO | Train Epoch: 3 [128008192/128008192 (100%)] Data (t): 0.178 Batch (t): 6.450, 2540.58/s, 79.3931/s/gpu LR: 0.000000 Logit Scale: 95.873 Class_loss: 4.8815 (4.8354) Contrastive_loss: 0.62018 (0.61800) Loss: 5.5017 (5.4534)
+ 2025-05-09,05:33:53 | INFO | Starting zero-shot imagenet.
+ 2025-05-09,05:33:53 | INFO | Building zero-shot classifier
+ 2025-05-09,05:34:22 | INFO | Using classifier
clipcls_vit_l16_s512m_bs16k_mix0_0/params.txt ADDED
@@ -0,0 +1,109 @@
+ NDR_patch_size: 16
+ accum_freq: 1
+ aug_cfg: {}
+ batch_size: 512
+ beta1: 0.9
+ beta2: 0.98
+ checkpoint_path: ./logs-lr1e-3-datacomp/clipcls_vit_l16_s512m_bs16k_mix0_0/checkpoints
+ coca_caption_loss_weight: 2.0
+ coca_contrastive_loss_weight: 1.0
+ copy_codebase: False
+ csv_caption_key: title
+ csv_img_key: filepath
+ csv_separator:
+ dataset_resampled: False
+ dataset_type: webdataset
+ ddp_static_graph: True
+ debug: False
+ delete_prev_step_ckpt: True
+ delete_previous_checkpoint: False
+ device: cuda:0
+ dist_backend: nccl
+ dist_url: env://
+ distill: False
+ distill_model: None
+ distill_pretrained: None
+ distributed: True
+ epochs: 4
+ epochs_cooldown: None
+ eps: 1e-06
+ force_custom_text: False
+ force_image_size: 224
+ force_patch_dropout: None
+ force_quick_gelu: False
+ gather_with_grad: True
+ global_batch_size: 16384
+ grad_checkpointing: True
+ grad_clip_norm: None
+ horovod: False
+ image_interpolation: None
+ image_mean: None
+ image_resize_mode: None
+ image_std: None
+ imagenet_v2: None
+ imagenet_val: /mnt/bn/zilongdata-hl/dataset/imagenet/val
+ is_cls_token: True
+ local_loss: True
+ local_rank: 0
+ lock_image: False
+ lock_image_freeze_bn_stats: False
+ lock_image_unlocked_groups: 0
+ lock_text: False
+ lock_text_freeze_layer_norm: False
+ lock_text_unlocked_layers: 0
+ log_every_n_steps: 128
+ log_level: 20
+ log_local: False
+ log_path: ./logs-lr1e-3-datacomp/clipcls_vit_l16_s512m_bs16k_mix0_0/out.log
+ logs: ./logs-lr1e-3-datacomp
+ lr: 0.001
+ lr_cooldown_end: 0.0
+ lr_cooldown_power: 1.0
+ lr_scheduler: cosine
+ max_seq_len: 15000
+ model: CLIPCLS-ViT-L-16
+ name: clipcls_vit_l16_s512m_bs16k_mix0_0
+ native_dynamic_resolution: False
+ no_set_device_rank: False
+ only_packing: False
+ precision: amp
+ pretrained:
+ pretrained_image:
+ pretrained_text:
+ rank: 0
+ remote_sync: None
+ remote_sync_frequency: 300
+ remote_sync_protocol: s3
+ report_to: wandb
+ resume: None
+ rope_attn_num_heads: 12
+ rope_model_width: 768
+ save_every_n_steps: 6104
+ save_frequency: 1
+ save_most_recent: False
+ seed: 0
+ siglip: False
+ skip_scheduler: False
+ tensorboard: False
+ tensorboard_path:
+ torchcompile: False
+ torchscript: False
+ trace: False
+ train_data: /mnt/bn/zilongdata-hl/dataset/Recap-DataComp-1B-Dataset/{000000..140146}.tar
+ train_data_upsampling_factors: None
+ train_num_samples: 128000000
+ use_bn_sync: False
+ use_bnb_linear: None
+ val_data: None
+ val_frequency: 1
+ val_num_samples: None
+ val_steps: 0
+ wandb: True
+ wandb_notes:
+ wandb_project_name: cls-clip-NDR
+ warmup: 500
+ wd: 0.2
+ workers: 1
+ world_size: 32
+ zeroshot_frequency: 4
+ zeroshot_steps: 0