| nohup: ignoring input |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:70: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead. |
| self.scaler = GradScaler() |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:116: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https: |
| self.embeddings = torch.load(combined_path, map_location=self.device) |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:180: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https: |
| self.compressor.load_state_dict(torch.load('final_compressor_model.pth', map_location=self.device)) |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:181: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https: |
| self.decompressor.load_state_dict(torch.load('final_decompressor_model.pth', map_location=self.device)) |
| /data2/edwardsun/flow_home/cfg_dataset.py:253: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https: |
| self.embeddings = torch.load(combined_path, map_location='cpu') |
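[Editor's note] All four `torch.load` warnings above have the same remedy: pass `weights_only=True`, which restricts unpickling to tensors and primitive containers instead of arbitrary pickle payloads. A minimal sketch (the same keyword applies to the `load_state_dict` calls, provided the checkpoints contain only tensors):

```python
import torch

# Round-trip a tensor with the safer load the warnings ask for.
t = torch.arange(6, dtype=torch.float32).reshape(2, 3)
torch.save(t, "example_weights.pt")
loaded = torch.load("example_weights.pt", map_location="cpu", weights_only=True)
print(torch.equal(t, loaded))  # True
```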
| Starting optimized training with batch_size=512, epochs=1500 |
| Using GPU 0 for optimized H100 training |
| Mixed precision: True |
| Batch size: 512 |
| Target epochs: 1500 |
| Learning rate: 0.0016 -> 0.0008 |
| ✓ Mixed precision training enabled (BF16) |
| Loading ALL AMP embeddings from /data2/edwardsun/flow_project/peptide_embeddings/... |
| Loading combined embeddings from /data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt... |
| ✓ Loaded ALL embeddings: torch.Size([17968, 50, 1280]) |
| Computing preprocessing statistics... |
| ✓ Statistics computed and saved: |
| Total embeddings: 17,968 |
| Mean: -0.0005 ± 0.0897 |
| Std: 0.0869 ± 0.1168 |
| Range: [-9.1738, 3.2894] |
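[Editor's note] A hypothetical reconstruction of the statistics step; the script's exact reduction axes are not shown in the log. Here the "Mean: a ± b" line is read as the mean of per-feature-dimension means plus their spread:

```python
import torch

# Stand-in for the real (17968, 50, 1280) embedding tensor.
emb = torch.randn(100, 50, 1280)
per_dim_mean = emb.mean(dim=(0, 1))   # one mean per feature dimension
per_dim_std = emb.std(dim=(0, 1))
print(f"Mean: {per_dim_mean.mean():.4f} ± {per_dim_mean.std():.4f}")
print(f"Std:  {per_dim_std.mean():.4f} ± {per_dim_std.std():.4f}")
print(f"Range: [{emb.min():.4f}, {emb.max():.4f}]")
```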
| Initializing models... |
| ✓ Model compiled with torch.compile for speedup |
| ✓ Models initialized: |
| Compressor parameters: 78,817,360 |
| Decompressor parameters: 39,458,720 |
| Flow model parameters: 50,779,584 |
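[Editor's note] A toy sketch of the compile-and-count step; the real compressor/decompressor/flow architectures are not shown in the log. `backend="eager"` keeps the example portable, while the training run presumably uses the default inductor backend for the actual speedup:

```python
import torch
import torch.nn as nn

# Tiny stand-in model for the compiled flow network.
model = nn.Sequential(nn.Linear(8, 16), nn.GELU(), nn.Linear(16, 8))
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params:,}")   # same formatting as the log's counts

compiled = torch.compile(model, backend="eager")
out = compiled(torch.randn(4, 8))
print(out.shape)
```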
| Initializing datasets with FULL data... |
| Loading AMP embeddings from /data2/edwardsun/flow_project/peptide_embeddings/... |
| Loading combined embeddings from /data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt (FULL DATA)... |
| ✓ Loaded ALL embeddings: torch.Size([17968, 50, 1280]) |
| Loading CFG data from FASTA: /home/edwardsun/flow/combined_final.fasta... |
| Parsing FASTA file: /home/edwardsun/flow/combined_final.fasta |
| Label assignment: >AP = AMP (0), >sp = Non-AMP (1) |
| ✓ Parsed 6983 valid sequences from FASTA |
| AMP sequences: 3306 |
| Non-AMP sequences: 3677 |
| Masked for CFG: 698 |
| Loaded 6983 CFG sequences |
| Label distribution: [3306 3677] |
| Masked 698 labels for CFG training |
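[Editor's note] A hedged sketch of the parsing/labeling scheme the log describes: headers starting with `>AP` map to AMP (label 0), `>sp` to Non-AMP (label 1), and roughly 10% of labels (698 of 6983) are masked for classifier-free guidance. The mask value (-1 here), the masking rule, and the helper name are assumptions, not taken from the script:

```python
import random

def parse_fasta(lines, mask_frac=0.1, seed=0):
    """Hypothetical FASTA parser with CFG label masking."""
    seqs, labels = [], []
    header, buf = None, []
    for line in lines + [">end"]:          # sentinel flushes the last record
        line = line.strip()
        if line.startswith(">"):
            if header is not None and buf:
                seqs.append("".join(buf))
                labels.append(0 if header.startswith(">AP") else 1)
            header, buf = line, []
        elif line:
            buf.append(line)
    rng = random.Random(seed)
    for i in rng.sample(range(len(labels)), int(mask_frac * len(labels))):
        labels[i] = -1                     # unconditional token for CFG
    return seqs, labels

seqs, labels = parse_fasta([">AP001", "GLFDIVKK", ">sp|P1|X", "MKTAYIAK"])
print(seqs, labels)  # ['GLFDIVKK', 'MKTAYIAK'] [0, 1]
```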
| Aligning AMP embeddings with CFG data... |
| Aligned 6983 samples |
| CFG Flow Dataset initialized: |
| AMP embeddings: torch.Size([17968, 50, 1280]) |
| CFG labels: 6983 |
| Aligned samples: 6983 |
| ✓ Dataset initialized with FULL data: |
| Total samples: 6,983 |
| Batch size: 512 |
| Batches per epoch: 14 |
| Total training steps: 21,000 |
| Validation every: 5,000 steps |
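[Editor's note] The dataset summary above is internally consistent; the step counts follow directly from the sample and batch counts:

```python
import math

total_samples = 6983
batch_size = 512
epochs = 1500

batches_per_epoch = math.ceil(total_samples / batch_size)  # 6983/512 -> 14
total_steps = batches_per_epoch * epochs                   # 14 * 1500 -> 21000
print(batches_per_epoch, total_steps)  # 14 21000
```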
| Initializing optimizer and scheduler... |
| ✓ Optimizer initialized: |
| Base LR: 0.0016 |
| Min LR: 0.0008 |
| Warmup steps: 3000 |
| Weight decay: 0.01 |
| Gradient clip norm: 1.0 |
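[Editor's note] The exact schedule is not printed, but the logged LRs (1.60e-04 at step 1, rising by roughly 4.8e-7 per step toward the 1.6e-3 base) are consistent with a linear warmup starting at 0.1× the base LR over the 3000 warmup steps. The warmup floor and the cosine decay to the min LR below are inferred, not stated in the log:

```python
import math

base_lr, min_lr = 1.6e-3, 8e-4
warmup_steps, total_steps = 3000, 21000

def lr_at(step):
    # Linear warmup from 0.1*base_lr to base_lr, then cosine decay to min_lr.
    if step < warmup_steps:
        frac = step / warmup_steps
        return 0.1 * base_lr + frac * (base_lr - 0.1 * base_lr)
    frac = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * frac))

print(f"{lr_at(1):.2e}")  # 1.60e-04, matching the first logged step
```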
| ✓ Optimized Single GPU training setup complete with FULL DATA! |
| 🚀 Starting Optimized Single GPU Flow Matching Training with FULL DATA |
| GPU: 0 |
| Total iterations: 1500 |
| Batch size: 512 |
| Total samples: 6,983 |
| Mixed precision: True |
| Estimated time: ~8-10 hours (overnight training with ALL data) |
| ============================================================ |
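[Editor's note] A hedged sketch of one flow-matching training step of the kind this log tracks. The real model, CFG conditioning, and data pipeline are not shown; this uses a toy MLP with linear-interpolation (rectified-flow-style) velocity targets, plus the AdamW settings and clip norm from the config above:

```python
import torch
import torch.nn as nn

class ToyFlow(nn.Module):
    """Minimal velocity-prediction network (stand-in for the flow model)."""
    def __init__(self, dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(), nn.Linear(64, dim))
    def forward(self, x_t, t):
        return self.net(torch.cat([x_t, t], dim=-1))

model = ToyFlow()
opt = torch.optim.AdamW(model.parameters(), lr=1.6e-3, weight_decay=0.01)

x1 = torch.randn(8, 32)          # data sample (stand-in for an embedding)
x0 = torch.randn_like(x1)        # noise sample
t = torch.rand(8, 1)
x_t = (1 - t) * x0 + t * x1      # linear probability path
target_v = x1 - x0               # constant velocity along the path

loss = nn.functional.mse_loss(model(x_t, t), target_v)
opt.zero_grad()
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # clip norm from the log
opt.step()
print(round(loss.item(), 3))
```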
|
Training Flow Model: 0%| | 0/1500 [00:00<?, ?it/s]/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:392: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead. |
| with autocast(dtype=torch.bfloat16): |
|
Training Flow Model: 0%| | 1/1500 [00:47<19:58:38, 47.98s/it]Epoch 0 | Step 1/ 21000 | Loss: 2.348453 | LR: 1.60e-04 | Speed: 0.0 steps/s | ETA: 198.3h |
| Epoch 0 | Avg Loss: 1.154665 | LR: 1.67e-04 | Time: 48.0s | Samples: 6,983 |
|
Training Flow Model: 0%| | 2/1500 [00:53<9:39:12, 23.20s/it] Epoch 1 | Step 15/ 21000 | Loss: 1.012731 | LR: 1.67e-04 | Speed: 0.3 steps/s | ETA: 19.5h |
| Epoch 1 | Avg Loss: 1.011483 | LR: 1.73e-04 | Time: 5.9s | Samples: 6,983 |
|
Training Flow Model: 0%| | 3/1500 [00:57<5:51:46, 14.10s/it]Epoch 2 | Step 29/ 21000 | Loss: 1.011287 | LR: 1.74e-04 | Speed: 0.5 steps/s | ETA: 11.2h |
| Epoch 2 | Avg Loss: 1.008224 | LR: 1.80e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 0%| | 4/1500 [01:00<4:05:10, 9.83s/it]Epoch 3 | Step 43/ 21000 | Loss: 1.003925 | LR: 1.81e-04 | Speed: 0.7 steps/s | ETA: 8.0h |
| Epoch 3 | Avg Loss: 1.101865 | LR: 1.87e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 0%| | 5/1500 [01:03<3:06:38, 7.49s/it]Epoch 4 | Step 57/ 21000 | Loss: 1.003482 | LR: 1.87e-04 | Speed: 0.9 steps/s | ETA: 6.4h |
| Epoch 4 | Avg Loss: 1.002411 | LR: 1.94e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 0%| | 6/1500 [01:07<2:31:52, 6.10s/it]Epoch 5 | Step 71/ 21000 | Loss: 0.996259 | LR: 1.94e-04 | Speed: 1.1 steps/s | ETA: 5.4h |
| Epoch 5 | Avg Loss: 1.002186 | LR: 2.00e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 0%| | 7/1500 [01:10<2:10:05, 5.23s/it]Epoch 6 | Step 85/ 21000 | Loss: 0.976243 | LR: 2.01e-04 | Speed: 1.2 steps/s | ETA: 4.7h |
| Epoch 6 | Avg Loss: 0.928040 | LR: 2.07e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 1%| | 8/1500 [01:14<1:56:10, 4.67s/it]Epoch 7 | Step 99/ 21000 | Loss: 0.834150 | LR: 2.08e-04 | Speed: 1.4 steps/s | ETA: 4.3h |
| Epoch 7 | Avg Loss: 0.764061 | LR: 2.14e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 1%| | 9/1500 [01:17<1:46:09, 4.27s/it]Epoch 8 | Step 113/ 21000 | Loss: 0.666364 | LR: 2.14e-04 | Speed: 1.5 steps/s | ETA: 3.9h |
| Epoch 8 | Avg Loss: 0.599929 | LR: 2.20e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 1%| | 10/1500 [01:20<1:38:41, 3.97s/it]Epoch 9 | Step 127/ 21000 | Loss: 0.514230 | LR: 2.21e-04 | Speed: 1.6 steps/s | ETA: 3.6h |
| Epoch 9 | Avg Loss: 0.464759 | LR: 2.27e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 1%| | 11/1500 [01:24<1:33:37, 3.77s/it]Epoch 10 | Step 141/ 21000 | Loss: 0.395490 | LR: 2.28e-04 | Speed: 1.7 steps/s | ETA: 3.4h |
| Epoch 10 | Avg Loss: 0.357536 | LR: 2.34e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 1%| | 12/1500 [01:27<1:31:07, 3.67s/it]Epoch 11 | Step 155/ 21000 | Loss: 0.300763 | LR: 2.34e-04 | Speed: 1.8 steps/s | ETA: 3.2h |
| Epoch 11 | Avg Loss: 0.273293 | LR: 2.41e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 1%| | 13/1500 [01:30<1:28:33, 3.57s/it]Epoch 12 | Step 169/ 21000 | Loss: 0.225660 | LR: 2.41e-04 | Speed: 1.9 steps/s | ETA: 3.1h |
| Epoch 12 | Avg Loss: 0.213443 | LR: 2.47e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 1%| | 14/1500 [01:34<1:27:28, 3.53s/it]Epoch 13 | Step 183/ 21000 | Loss: 0.195754 | LR: 2.48e-04 | Speed: 2.0 steps/s | ETA: 2.9h |
| Epoch 13 | Avg Loss: 0.170437 | LR: 2.54e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 1%| | 15/1500 [01:37<1:25:31, 3.46s/it]Epoch 14 | Step 197/ 21000 | Loss: 0.187381 | LR: 2.55e-04 | Speed: 2.0 steps/s | ETA: 2.8h |
| Epoch 14 | Avg Loss: 0.144519 | LR: 2.61e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 1%| | 16/1500 [01:40<1:24:28, 3.42s/it]Epoch 15 | Step 211/ 21000 | Loss: 0.138201 | LR: 2.61e-04 | Speed: 2.1 steps/s | ETA: 2.7h |
| Epoch 15 | Avg Loss: 0.123737 | LR: 2.68e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 1%| | 17/1500 [01:44<1:24:32, 3.42s/it]Epoch 16 | Step 225/ 21000 | Loss: 0.111913 | LR: 2.68e-04 | Speed: 2.2 steps/s | ETA: 2.6h |
| Epoch 16 | Avg Loss: 0.116430 | LR: 2.74e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 1%| | 18/1500 [01:47<1:24:34, 3.42s/it]Epoch 17 | Step 239/ 21000 | Loss: 0.113527 | LR: 2.75e-04 | Speed: 2.2 steps/s | ETA: 2.6h |
| Epoch 17 | Avg Loss: 0.110281 | LR: 2.81e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 1%|β | 19/1500 [01:51<1:23:57, 3.40s/it]Epoch 18 | Step 253/ 21000 | Loss: 0.089264 | LR: 2.81e-04 | Speed: 2.3 steps/s | ETA: 2.5h |
| Epoch 18 | Avg Loss: 0.097118 | LR: 2.88e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 1%|β | 20/1500 [01:54<1:22:43, 3.35s/it]Epoch 19 | Step 267/ 21000 | Loss: 0.101127 | LR: 2.88e-04 | Speed: 2.4 steps/s | ETA: 2.4h |
| Epoch 19 | Avg Loss: 0.098588 | LR: 2.94e-04 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 1%|β | 21/1500 [01:57<1:21:58, 3.33s/it]Epoch 20 | Step 281/ 21000 | Loss: 0.093898 | LR: 2.95e-04 | Speed: 2.4 steps/s | ETA: 2.4h |
| Epoch 20 | Avg Loss: 0.097397 | LR: 3.01e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 1%|β | 22/1500 [02:00<1:21:45, 3.32s/it]Epoch 21 | Step 295/ 21000 | Loss: 0.099576 | LR: 3.02e-04 | Speed: 2.5 steps/s | ETA: 2.3h |
| Epoch 21 | Avg Loss: 0.094745 | LR: 3.08e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 23/1500 [02:04<1:23:02, 3.37s/it]Epoch 22 | Step 309/ 21000 | Loss: 0.093006 | LR: 3.08e-04 | Speed: 2.5 steps/s | ETA: 2.3h |
| Epoch 22 | Avg Loss: 0.089745 | LR: 3.15e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 24/1500 [02:07<1:22:00, 3.33s/it]Epoch 23 | Step 323/ 21000 | Loss: 0.081982 | LR: 3.15e-04 | Speed: 2.6 steps/s | ETA: 2.2h |
| Epoch 23 | Avg Loss: 0.082143 | LR: 3.21e-04 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 25/1500 [02:11<1:22:49, 3.37s/it]Epoch 24 | Step 337/ 21000 | Loss: 0.080047 | LR: 3.22e-04 | Speed: 2.6 steps/s | ETA: 2.2h |
| Epoch 24 | Avg Loss: 0.087128 | LR: 3.28e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 26/1500 [02:14<1:22:35, 3.36s/it]Epoch 25 | Step 351/ 21000 | Loss: 0.088592 | LR: 3.28e-04 | Speed: 2.6 steps/s | ETA: 2.2h |
| Epoch 25 | Avg Loss: 0.078387 | LR: 3.35e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 27/1500 [02:17<1:22:15, 3.35s/it]Epoch 26 | Step 365/ 21000 | Loss: 0.085473 | LR: 3.35e-04 | Speed: 2.7 steps/s | ETA: 2.1h |
| Epoch 26 | Avg Loss: 0.073784 | LR: 3.41e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 28/1500 [02:21<1:22:12, 3.35s/it]Epoch 27 | Step 379/ 21000 | Loss: 0.070519 | LR: 3.42e-04 | Speed: 2.7 steps/s | ETA: 2.1h |
| Epoch 27 | Avg Loss: 0.077967 | LR: 3.48e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 29/1500 [02:24<1:22:28, 3.36s/it]Epoch 28 | Step 393/ 21000 | Loss: 0.063307 | LR: 3.49e-04 | Speed: 2.7 steps/s | ETA: 2.1h |
| Epoch 28 | Avg Loss: 0.074035 | LR: 3.55e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 30/1500 [02:27<1:21:47, 3.34s/it]Epoch 29 | Step 407/ 21000 | Loss: 0.073874 | LR: 3.55e-04 | Speed: 2.8 steps/s | ETA: 2.1h |
| Epoch 29 | Avg Loss: 0.073863 | LR: 3.62e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 31/1500 [02:31<1:22:13, 3.36s/it]Epoch 30 | Step 421/ 21000 | Loss: 0.075808 | LR: 3.62e-04 | Speed: 2.8 steps/s | ETA: 2.0h |
| Epoch 30 | Avg Loss: 0.068544 | LR: 3.68e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 32/1500 [02:34<1:22:36, 3.38s/it]Epoch 31 | Step 435/ 21000 | Loss: 0.072362 | LR: 3.69e-04 | Speed: 2.8 steps/s | ETA: 2.0h |
| Epoch 31 | Avg Loss: 0.072260 | LR: 3.75e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 33/1500 [02:37<1:21:59, 3.35s/it]Epoch 32 | Step 449/ 21000 | Loss: 0.071200 | LR: 3.76e-04 | Speed: 2.9 steps/s | ETA: 2.0h |
| Epoch 32 | Avg Loss: 0.068997 | LR: 3.82e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 34/1500 [02:41<1:22:07, 3.36s/it]Epoch 33 | Step 463/ 21000 | Loss: 0.070137 | LR: 3.82e-04 | Speed: 2.9 steps/s | ETA: 2.0h |
| Epoch 33 | Avg Loss: 0.068473 | LR: 3.88e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 35/1500 [02:44<1:21:46, 3.35s/it]Epoch 34 | Step 477/ 21000 | Loss: 0.061482 | LR: 3.89e-04 | Speed: 2.9 steps/s | ETA: 2.0h |
| Epoch 34 | Avg Loss: 0.066994 | LR: 3.95e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 36/1500 [02:48<1:22:12, 3.37s/it]Epoch 35 | Step 491/ 21000 | Loss: 0.074752 | LR: 3.96e-04 | Speed: 2.9 steps/s | ETA: 1.9h |
| Epoch 35 | Avg Loss: 0.070879 | LR: 4.02e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 2%|β | 37/1500 [02:51<1:21:20, 3.34s/it]Epoch 36 | Step 505/ 21000 | Loss: 0.057749 | LR: 4.02e-04 | Speed: 3.0 steps/s | ETA: 1.9h |
| Epoch 36 | Avg Loss: 0.057021 | LR: 4.09e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 38/1500 [02:54<1:21:50, 3.36s/it]Epoch 37 | Step 519/ 21000 | Loss: 0.061772 | LR: 4.09e-04 | Speed: 3.0 steps/s | ETA: 1.9h |
| Epoch 37 | Avg Loss: 0.057016 | LR: 4.15e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 39/1500 [02:57<1:20:50, 3.32s/it]Epoch 38 | Step 533/ 21000 | Loss: 0.055348 | LR: 4.16e-04 | Speed: 3.0 steps/s | ETA: 1.9h |
| Epoch 38 | Avg Loss: 0.062512 | LR: 4.22e-04 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 40/1500 [03:01<1:21:38, 3.36s/it]Epoch 39 | Step 547/ 21000 | Loss: 0.062446 | LR: 4.23e-04 | Speed: 3.0 steps/s | ETA: 1.9h |
| Epoch 39 | Avg Loss: 0.055946 | LR: 4.29e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 41/1500 [03:04<1:21:52, 3.37s/it]Epoch 40 | Step 561/ 21000 | Loss: 0.055766 | LR: 4.29e-04 | Speed: 3.1 steps/s | ETA: 1.9h |
| Epoch 40 | Avg Loss: 0.052785 | LR: 4.36e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 42/1500 [03:08<1:21:38, 3.36s/it]Epoch 41 | Step 575/ 21000 | Loss: 0.066500 | LR: 4.36e-04 | Speed: 3.1 steps/s | ETA: 1.8h |
| Epoch 41 | Avg Loss: 0.058182 | LR: 4.42e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 43/1500 [03:11<1:21:14, 3.35s/it]Epoch 42 | Step 589/ 21000 | Loss: 0.058983 | LR: 4.43e-04 | Speed: 3.1 steps/s | ETA: 1.8h |
| Epoch 42 | Avg Loss: 0.054515 | LR: 4.49e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 44/1500 [03:14<1:21:09, 3.34s/it]Epoch 43 | Step 603/ 21000 | Loss: 0.050246 | LR: 4.49e-04 | Speed: 3.1 steps/s | ETA: 1.8h |
| Epoch 43 | Avg Loss: 0.055557 | LR: 4.56e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 45/1500 [03:18<1:21:28, 3.36s/it]Epoch 44 | Step 617/ 21000 | Loss: 0.058090 | LR: 4.56e-04 | Speed: 3.1 steps/s | ETA: 1.8h |
| Epoch 44 | Avg Loss: 0.053506 | LR: 4.62e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 46/1500 [03:21<1:21:37, 3.37s/it]Epoch 45 | Step 631/ 21000 | Loss: 0.050240 | LR: 4.63e-04 | Speed: 3.2 steps/s | ETA: 1.8h |
| Epoch 45 | Avg Loss: 0.053557 | LR: 4.69e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 47/1500 [03:24<1:21:55, 3.38s/it]Epoch 46 | Step 645/ 21000 | Loss: 0.056234 | LR: 4.70e-04 | Speed: 3.2 steps/s | ETA: 1.8h |
| Epoch 46 | Avg Loss: 0.050458 | LR: 4.76e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 48/1500 [03:28<1:21:08, 3.35s/it]Epoch 47 | Step 659/ 21000 | Loss: 0.050834 | LR: 4.76e-04 | Speed: 3.2 steps/s | ETA: 1.8h |
| Epoch 47 | Avg Loss: 0.053890 | LR: 4.83e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 49/1500 [03:31<1:21:41, 3.38s/it]Epoch 48 | Step 673/ 21000 | Loss: 0.043349 | LR: 4.83e-04 | Speed: 3.2 steps/s | ETA: 1.8h |
| Epoch 48 | Avg Loss: 0.049642 | LR: 4.89e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 50/1500 [03:34<1:20:26, 3.33s/it]Epoch 49 | Step 687/ 21000 | Loss: 0.055819 | LR: 4.90e-04 | Speed: 3.2 steps/s | ETA: 1.8h |
| Epoch 49 | Avg Loss: 0.051639 | LR: 4.96e-04 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 51/1500 [03:38<1:19:57, 3.31s/it]Epoch 50 | Step 701/ 21000 | Loss: 0.048318 | LR: 4.96e-04 | Speed: 3.2 steps/s | ETA: 1.7h |
| Epoch 50 | Avg Loss: 0.051892 | LR: 5.03e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 3%|β | 52/1500 [03:41<1:19:38, 3.30s/it]Epoch 51 | Step 715/ 21000 | Loss: 0.037495 | LR: 5.03e-04 | Speed: 3.2 steps/s | ETA: 1.7h |
| Epoch 51 | Avg Loss: 0.043957 | LR: 5.09e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 53/1500 [03:44<1:19:25, 3.29s/it]Epoch 52 | Step 729/ 21000 | Loss: 0.041565 | LR: 5.10e-04 | Speed: 3.3 steps/s | ETA: 1.7h |
| Epoch 52 | Avg Loss: 0.049808 | LR: 5.16e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 54/1500 [03:47<1:19:11, 3.29s/it]Epoch 53 | Step 743/ 21000 | Loss: 0.051807 | LR: 5.17e-04 | Speed: 3.3 steps/s | ETA: 1.7h |
| Epoch 53 | Avg Loss: 0.059701 | LR: 5.23e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 55/1500 [03:51<1:20:01, 3.32s/it]Epoch 54 | Step 757/ 21000 | Loss: 0.050800 | LR: 5.23e-04 | Speed: 3.3 steps/s | ETA: 1.7h |
| Epoch 54 | Avg Loss: 0.050593 | LR: 5.30e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 56/1500 [03:54<1:21:36, 3.39s/it]Epoch 55 | Step 771/ 21000 | Loss: 0.057050 | LR: 5.30e-04 | Speed: 3.3 steps/s | ETA: 1.7h |
| Epoch 55 | Avg Loss: 0.051511 | LR: 5.36e-04 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 57/1500 [03:58<1:21:33, 3.39s/it]Epoch 56 | Step 785/ 21000 | Loss: 0.056579 | LR: 5.37e-04 | Speed: 3.3 steps/s | ETA: 1.7h |
| Epoch 56 | Avg Loss: 0.047250 | LR: 5.43e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 58/1500 [04:01<1:20:45, 3.36s/it]Epoch 57 | Step 799/ 21000 | Loss: 0.050855 | LR: 5.44e-04 | Speed: 3.3 steps/s | ETA: 1.7h |
| Epoch 57 | Avg Loss: 0.053542 | LR: 5.50e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 59/1500 [04:05<1:20:58, 3.37s/it]Epoch 58 | Step 813/ 21000 | Loss: 0.049122 | LR: 5.50e-04 | Speed: 3.3 steps/s | ETA: 1.7h |
| Epoch 58 | Avg Loss: 0.048603 | LR: 5.56e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 60/1500 [04:08<1:20:57, 3.37s/it]Epoch 59 | Step 827/ 21000 | Loss: 0.051872 | LR: 5.57e-04 | Speed: 3.3 steps/s | ETA: 1.7h |
| Epoch 59 | Avg Loss: 0.044602 | LR: 5.63e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 61/1500 [04:11<1:21:36, 3.40s/it]Epoch 60 | Step 841/ 21000 | Loss: 0.044589 | LR: 5.64e-04 | Speed: 3.4 steps/s | ETA: 1.7h |
| Epoch 60 | Avg Loss: 0.046526 | LR: 5.70e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 62/1500 [04:15<1:21:03, 3.38s/it]Epoch 61 | Step 855/ 21000 | Loss: 0.041265 | LR: 5.70e-04 | Speed: 3.4 steps/s | ETA: 1.7h |
| Epoch 61 | Avg Loss: 0.051947 | LR: 5.77e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 63/1500 [04:18<1:19:54, 3.34s/it]Epoch 62 | Step 869/ 21000 | Loss: 0.046667 | LR: 5.77e-04 | Speed: 3.4 steps/s | ETA: 1.7h |
| Epoch 62 | Avg Loss: 0.060930 | LR: 5.83e-04 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 64/1500 [04:21<1:20:41, 3.37s/it]Epoch 63 | Step 883/ 21000 | Loss: 0.052620 | LR: 5.84e-04 | Speed: 3.4 steps/s | ETA: 1.6h |
| Epoch 63 | Avg Loss: 0.050808 | LR: 5.90e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 65/1500 [04:25<1:20:27, 3.36s/it]Epoch 64 | Step 897/ 21000 | Loss: 0.057084 | LR: 5.91e-04 | Speed: 3.4 steps/s | ETA: 1.6h |
| Epoch 64 | Avg Loss: 0.056893 | LR: 5.97e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 66/1500 [04:28<1:20:06, 3.35s/it]Epoch 65 | Step 911/ 21000 | Loss: 0.043265 | LR: 5.97e-04 | Speed: 3.4 steps/s | ETA: 1.6h |
| Epoch 65 | Avg Loss: 0.046618 | LR: 6.04e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 4%|β | 67/1500 [04:31<1:20:12, 3.36s/it]Epoch 66 | Step 925/ 21000 | Loss: 0.051255 | LR: 6.04e-04 | Speed: 3.4 steps/s | ETA: 1.6h |
| Epoch 66 | Avg Loss: 0.047629 | LR: 6.10e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 68/1500 [04:35<1:20:10, 3.36s/it]Epoch 67 | Step 939/ 21000 | Loss: 0.059373 | LR: 6.11e-04 | Speed: 3.4 steps/s | ETA: 1.6h |
| Epoch 67 | Avg Loss: 0.049539 | LR: 6.17e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 69/1500 [04:38<1:19:39, 3.34s/it]Epoch 68 | Step 953/ 21000 | Loss: 0.052057 | LR: 6.17e-04 | Speed: 3.4 steps/s | ETA: 1.6h |
| Epoch 68 | Avg Loss: 0.045869 | LR: 6.24e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 70/1500 [04:41<1:19:07, 3.32s/it]Epoch 69 | Step 967/ 21000 | Loss: 0.042184 | LR: 6.24e-04 | Speed: 3.4 steps/s | ETA: 1.6h |
| Epoch 69 | Avg Loss: 0.046375 | LR: 6.30e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 71/1500 [04:45<1:18:48, 3.31s/it]Epoch 70 | Step 981/ 21000 | Loss: 0.037335 | LR: 6.31e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 70 | Avg Loss: 0.044112 | LR: 6.37e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 72/1500 [04:48<1:19:08, 3.33s/it]Epoch 71 | Step 995/ 21000 | Loss: 0.040643 | LR: 6.38e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 71 | Avg Loss: 0.040547 | LR: 6.44e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 73/1500 [04:51<1:19:13, 3.33s/it]Epoch 72 | Step 1009/ 21000 | Loss: 0.040631 | LR: 6.44e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 72 | Avg Loss: 0.043072 | LR: 6.51e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 74/1500 [04:55<1:18:22, 3.30s/it]Epoch 73 | Step 1023/ 21000 | Loss: 0.052481 | LR: 6.51e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 73 | Avg Loss: 0.054466 | LR: 6.57e-04 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 75/1500 [04:58<1:18:17, 3.30s/it]Epoch 74 | Step 1037/ 21000 | Loss: 0.059234 | LR: 6.58e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 74 | Avg Loss: 0.056152 | LR: 6.64e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 76/1500 [05:01<1:18:00, 3.29s/it]Epoch 75 | Step 1051/ 21000 | Loss: 0.049168 | LR: 6.64e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 75 | Avg Loss: 0.043867 | LR: 6.71e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 77/1500 [05:04<1:18:14, 3.30s/it]Epoch 76 | Step 1065/ 21000 | Loss: 0.044122 | LR: 6.71e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 76 | Avg Loss: 0.046304 | LR: 6.77e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 78/1500 [05:08<1:18:42, 3.32s/it]Epoch 77 | Step 1079/ 21000 | Loss: 0.043278 | LR: 6.78e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 77 | Avg Loss: 0.044190 | LR: 6.84e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 79/1500 [05:11<1:18:26, 3.31s/it]Epoch 78 | Step 1093/ 21000 | Loss: 0.045077 | LR: 6.85e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 78 | Avg Loss: 0.039687 | LR: 6.91e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 80/1500 [05:14<1:18:05, 3.30s/it]Epoch 79 | Step 1107/ 21000 | Loss: 0.043229 | LR: 6.91e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 79 | Avg Loss: 0.045789 | LR: 6.98e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 81/1500 [05:18<1:17:51, 3.29s/it]Epoch 80 | Step 1121/ 21000 | Loss: 0.045537 | LR: 6.98e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 80 | Avg Loss: 0.048807 | LR: 7.04e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 5%|β | 82/1500 [05:21<1:18:22, 3.32s/it]Epoch 81 | Step 1135/ 21000 | Loss: 0.047845 | LR: 7.05e-04 | Speed: 3.5 steps/s | ETA: 1.6h |
| Epoch 81 | Avg Loss: 0.046174 | LR: 7.11e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 83/1500 [05:24<1:18:18, 3.32s/it]Epoch 82 | Step 1149/ 21000 | Loss: 0.042759 | LR: 7.12e-04 | Speed: 3.6 steps/s | ETA: 1.6h |
| Epoch 82 | Avg Loss: 0.040676 | LR: 7.18e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 84/1500 [05:28<1:18:42, 3.34s/it]Epoch 83 | Step 1163/ 21000 | Loss: 0.031483 | LR: 7.18e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 83 | Avg Loss: 0.039434 | LR: 7.24e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 85/1500 [05:31<1:19:31, 3.37s/it]Epoch 84 | Step 1177/ 21000 | Loss: 0.043087 | LR: 7.25e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 84 | Avg Loss: 0.053384 | LR: 7.31e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 86/1500 [05:35<1:19:46, 3.38s/it]Epoch 85 | Step 1191/ 21000 | Loss: 0.045207 | LR: 7.32e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 85 | Avg Loss: 0.047736 | LR: 7.38e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 87/1500 [05:38<1:20:30, 3.42s/it]Epoch 86 | Step 1205/ 21000 | Loss: 0.038893 | LR: 7.38e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 86 | Avg Loss: 0.044388 | LR: 7.45e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 88/1500 [05:41<1:19:23, 3.37s/it]Epoch 87 | Step 1219/ 21000 | Loss: 0.036651 | LR: 7.45e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 87 | Avg Loss: 0.043202 | LR: 7.51e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 89/1500 [05:45<1:18:59, 3.36s/it]Epoch 88 | Step 1233/ 21000 | Loss: 0.049969 | LR: 7.52e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 88 | Avg Loss: 0.039903 | LR: 7.58e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 90/1500 [05:48<1:19:05, 3.37s/it]Epoch 89 | Step 1247/ 21000 | Loss: 0.059831 | LR: 7.59e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 89 | Avg Loss: 0.049894 | LR: 7.65e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 91/1500 [05:52<1:19:42, 3.39s/it]Epoch 90 | Step 1261/ 21000 | Loss: 0.057150 | LR: 7.65e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 90 | Avg Loss: 0.054641 | LR: 7.72e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 92/1500 [05:55<1:19:05, 3.37s/it]Epoch 91 | Step 1275/ 21000 | Loss: 0.056272 | LR: 7.72e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 91 | Avg Loss: 0.060308 | LR: 7.78e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 93/1500 [05:58<1:19:07, 3.37s/it]Epoch 92 | Step 1289/ 21000 | Loss: 0.054234 | LR: 7.79e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 92 | Avg Loss: 0.051279 | LR: 7.85e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 94/1500 [06:02<1:18:50, 3.36s/it]Epoch 93 | Step 1303/ 21000 | Loss: 0.038766 | LR: 7.85e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 93 | Avg Loss: 0.040706 | LR: 7.92e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 95/1500 [06:05<1:18:41, 3.36s/it]Epoch 94 | Step 1317/ 21000 | Loss: 0.034445 | LR: 7.92e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 94 | Avg Loss: 0.037526 | LR: 7.98e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 96/1500 [06:08<1:19:09, 3.38s/it]Epoch 95 | Step 1331/ 21000 | Loss: 0.042873 | LR: 7.99e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 95 | Avg Loss: 0.040381 | LR: 8.05e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 6%|β | 97/1500 [06:12<1:18:55, 3.38s/it]Epoch 96 | Step 1345/ 21000 | Loss: 0.033452 | LR: 8.06e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 96 | Avg Loss: 0.044442 | LR: 8.12e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 98/1500 [06:15<1:18:20, 3.35s/it]Epoch 97 | Step 1359/ 21000 | Loss: 0.049492 | LR: 8.12e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 97 | Avg Loss: 0.049140 | LR: 8.19e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 99/1500 [06:18<1:18:08, 3.35s/it]Epoch 98 | Step 1373/ 21000 | Loss: 0.036935 | LR: 8.19e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 98 | Avg Loss: 0.039066 | LR: 8.25e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 100/1500 [06:22<1:17:55, 3.34s/it]Epoch 99 | Step 1387/ 21000 | Loss: 0.034987 | LR: 8.26e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 99 | Avg Loss: 0.035230 | LR: 8.32e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 101/1500 [06:25<1:17:34, 3.33s/it]Epoch 100 | Step 1401/ 21000 | Loss: 0.037965 | LR: 8.32e-04 | Speed: 3.6 steps/s | ETA: 1.5h |
| Epoch 100 | Avg Loss: 0.039429 | LR: 8.39e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 102/1500 [06:28<1:17:25, 3.32s/it]Epoch 101 | Step 1415/ 21000 | Loss: 0.044940 | LR: 8.39e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 101 | Avg Loss: 0.043091 | LR: 8.45e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 103/1500 [06:32<1:17:17, 3.32s/it]Epoch 102 | Step 1429/ 21000 | Loss: 0.046929 | LR: 8.46e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 102 | Avg Loss: 0.045039 | LR: 8.52e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 104/1500 [06:35<1:18:00, 3.35s/it]Epoch 103 | Step 1443/ 21000 | Loss: 0.040008 | LR: 8.53e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 103 | Avg Loss: 0.035983 | LR: 8.59e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 105/1500 [06:39<1:19:01, 3.40s/it]Epoch 104 | Step 1457/ 21000 | Loss: 0.037500 | LR: 8.59e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 104 | Avg Loss: 0.035712 | LR: 8.66e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 106/1500 [06:42<1:18:31, 3.38s/it]Epoch 105 | Step 1471/ 21000 | Loss: 0.038647 | LR: 8.66e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 105 | Avg Loss: 0.037083 | LR: 8.72e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 107/1500 [06:45<1:18:31, 3.38s/it]Epoch 106 | Step 1485/ 21000 | Loss: 0.032988 | LR: 8.73e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 106 | Avg Loss: 0.034224 | LR: 8.79e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 108/1500 [06:49<1:18:06, 3.37s/it]Epoch 107 | Step 1499/ 21000 | Loss: 0.041273 | LR: 8.80e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 107 | Avg Loss: 0.054691 | LR: 8.86e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 109/1500 [06:52<1:18:04, 3.37s/it]Epoch 108 | Step 1513/ 21000 | Loss: 0.066792 | LR: 8.86e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 108 | Avg Loss: 0.056016 | LR: 8.92e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 110/1500 [06:55<1:17:38, 3.35s/it]Epoch 109 | Step 1527/ 21000 | Loss: 0.056582 | LR: 8.93e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 109 | Avg Loss: 0.047941 | LR: 8.99e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 111/1500 [06:59<1:16:48, 3.32s/it]Epoch 110 | Step 1541/ 21000 | Loss: 0.038037 | LR: 9.00e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 110 | Avg Loss: 0.039082 | LR: 9.06e-04 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 7%|β | 112/1500 [07:02<1:17:21, 3.34s/it]Epoch 111 | Step 1555/ 21000 | Loss: 0.036922 | LR: 9.06e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 111 | Avg Loss: 0.037414 | LR: 9.13e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 113/1500 [07:05<1:16:59, 3.33s/it]Epoch 112 | Step 1569/ 21000 | Loss: 0.050282 | LR: 9.13e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 112 | Avg Loss: 0.052520 | LR: 9.19e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 114/1500 [07:09<1:18:00, 3.38s/it]Epoch 113 | Step 1583/ 21000 | Loss: 0.064816 | LR: 9.20e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 113 | Avg Loss: 0.060289 | LR: 9.26e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 115/1500 [07:12<1:18:39, 3.41s/it]Epoch 114 | Step 1597/ 21000 | Loss: 0.041934 | LR: 9.27e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 114 | Avg Loss: 0.044027 | LR: 9.33e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 116/1500 [07:16<1:18:17, 3.39s/it]Epoch 115 | Step 1611/ 21000 | Loss: 0.034175 | LR: 9.33e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 115 | Avg Loss: 0.039415 | LR: 9.40e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 117/1500 [07:19<1:17:52, 3.38s/it]Epoch 116 | Step 1625/ 21000 | Loss: 0.042169 | LR: 9.40e-04 | Speed: 3.7 steps/s | ETA: 1.5h |
| Epoch 116 | Avg Loss: 0.037331 | LR: 9.46e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 118/1500 [07:22<1:18:13, 3.40s/it]Epoch 117 | Step 1639/ 21000 | Loss: 0.035896 | LR: 9.47e-04 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 117 | Avg Loss: 0.038203 | LR: 9.53e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 119/1500 [07:26<1:17:14, 3.36s/it]Epoch 118 | Step 1653/ 21000 | Loss: 0.044916 | LR: 9.53e-04 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 118 | Avg Loss: 0.039857 | LR: 9.60e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 120/1500 [07:29<1:16:45, 3.34s/it]Epoch 119 | Step 1667/ 21000 | Loss: 0.036022 | LR: 9.60e-04 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 119 | Avg Loss: 0.038705 | LR: 9.66e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 121/1500 [07:32<1:17:09, 3.36s/it]Epoch 120 | Step 1681/ 21000 | Loss: 0.053714 | LR: 9.67e-04 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 120 | Avg Loss: 0.058985 | LR: 9.73e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 122/1500 [07:36<1:17:01, 3.35s/it]Epoch 121 | Step 1695/ 21000 | Loss: 0.053556 | LR: 9.74e-04 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 121 | Avg Loss: 0.041034 | LR: 9.80e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 123/1500 [07:39<1:16:34, 3.34s/it]Epoch 122 | Step 1709/ 21000 | Loss: 0.031987 | LR: 9.80e-04 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 122 | Avg Loss: 0.034538 | LR: 9.87e-04 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 124/1500 [07:42<1:16:48, 3.35s/it]Epoch 123 | Step 1723/ 21000 | Loss: 0.028273 | LR: 9.87e-04 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 123 | Avg Loss: 0.040622 | LR: 9.93e-04 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 125/1500 [07:46<1:16:48, 3.35s/it]Epoch 124 | Step 1737/ 21000 | Loss: 0.065265 | LR: 9.94e-04 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 124 | Avg Loss: 0.081324 | LR: 1.00e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 126/1500 [07:49<1:17:06, 3.37s/it]Epoch 125 | Step 1751/ 21000 | Loss: 0.099073 | LR: 1.00e-03 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 125 | Avg Loss: 0.103266 | LR: 1.01e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 8%|β | 127/1500 [07:52<1:17:15, 3.38s/it]Epoch 126 | Step 1765/ 21000 | Loss: 0.116206 | LR: 1.01e-03 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 126 | Avg Loss: 0.114660 | LR: 1.01e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 128/1500 [07:56<1:16:20, 3.34s/it]Epoch 127 | Step 1779/ 21000 | Loss: 0.109866 | LR: 1.01e-03 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 127 | Avg Loss: 0.094334 | LR: 1.02e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 129/1500 [07:59<1:16:04, 3.33s/it]Epoch 128 | Step 1793/ 21000 | Loss: 0.082153 | LR: 1.02e-03 | Speed: 3.7 steps/s | ETA: 1.4h |
| Epoch 128 | Avg Loss: 0.070192 | LR: 1.03e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 130/1500 [08:02<1:15:52, 3.32s/it]Epoch 129 | Step 1807/ 21000 | Loss: 0.045357 | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 129 | Avg Loss: 0.047373 | LR: 1.03e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 131/1500 [08:06<1:15:16, 3.30s/it]Epoch 130 | Step 1821/ 21000 | Loss: 0.046692 | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 130 | Avg Loss: 0.043258 | LR: 1.04e-03 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 132/1500 [08:09<1:15:25, 3.31s/it]Epoch 131 | Step 1835/ 21000 | Loss: 0.032795 | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 131 | Avg Loss: 0.051812 | LR: 1.05e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 133/1500 [08:12<1:15:32, 3.32s/it]Epoch 132 | Step 1849/ 21000 | Loss: 0.056749 | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 132 | Avg Loss: 0.048600 | LR: 1.05e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 134/1500 [08:16<1:15:27, 3.31s/it]Epoch 133 | Step 1863/ 21000 | Loss: 0.042399 | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 133 | Avg Loss: 0.042586 | LR: 1.06e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 135/1500 [08:19<1:15:14, 3.31s/it]Epoch 134 | Step 1877/ 21000 | Loss: 0.044964 | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 134 | Avg Loss: 0.048532 | LR: 1.07e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 136/1500 [08:22<1:15:20, 3.31s/it]Epoch 135 | Step 1891/ 21000 | Loss: 0.057734 | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 135 | Avg Loss: 0.048324 | LR: 1.07e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 137/1500 [08:25<1:15:08, 3.31s/it]Epoch 136 | Step 1905/ 21000 | Loss: 0.043287 | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 136 | Avg Loss: 0.038560 | LR: 1.08e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 138/1500 [08:29<1:15:21, 3.32s/it]Epoch 137 | Step 1919/ 21000 | Loss: 0.037605 | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 137 | Avg Loss: 0.043462 | LR: 1.09e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 139/1500 [08:32<1:15:11, 3.31s/it]Epoch 138 | Step 1933/ 21000 | Loss: 0.043478 | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 138 | Avg Loss: 0.042978 | LR: 1.09e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 140/1500 [08:36<1:15:58, 3.35s/it]Epoch 139 | Step 1947/ 21000 | Loss: 0.031243 | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 139 | Avg Loss: 0.041040 | LR: 1.10e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 141/1500 [08:39<1:15:50, 3.35s/it]Epoch 140 | Step 1961/ 21000 | Loss: 0.044837 | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 140 | Avg Loss: 0.039195 | LR: 1.11e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 9%|β | 142/1500 [08:42<1:16:55, 3.40s/it]Epoch 141 | Step 1975/ 21000 | Loss: 0.039025 | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 141 | Avg Loss: 0.038292 | LR: 1.11e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 143/1500 [08:46<1:15:56, 3.36s/it]Epoch 142 | Step 1989/ 21000 | Loss: 0.032414 | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 142 | Avg Loss: 0.033116 | LR: 1.12e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 144/1500 [08:49<1:15:17, 3.33s/it]Epoch 143 | Step 2003/ 21000 | Loss: 0.034082 | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 143 | Avg Loss: 0.041510 | LR: 1.13e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 145/1500 [08:52<1:15:48, 3.36s/it]Epoch 144 | Step 2017/ 21000 | Loss: 0.052641 | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 144 | Avg Loss: 0.055978 | LR: 1.13e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 146/1500 [08:56<1:15:24, 3.34s/it]Epoch 145 | Step 2031/ 21000 | Loss: 0.064799 | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 145 | Avg Loss: 0.064812 | LR: 1.14e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 147/1500 [08:59<1:16:12, 3.38s/it]Epoch 146 | Step 2045/ 21000 | Loss: 0.046493 | LR: 1.14e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 146 | Avg Loss: 0.043557 | LR: 1.15e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 148/1500 [09:03<1:16:25, 3.39s/it]Epoch 147 | Step 2059/ 21000 | Loss: 0.040831 | LR: 1.15e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 147 | Avg Loss: 0.047060 | LR: 1.15e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 149/1500 [09:06<1:17:16, 3.43s/it]Epoch 148 | Step 2073/ 21000 | Loss: 0.059485 | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 148 | Avg Loss: 0.055089 | LR: 1.16e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 150/1500 [09:09<1:16:53, 3.42s/it]Epoch 149 | Step 2087/ 21000 | Loss: 0.051692 | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 149 | Avg Loss: 0.043558 | LR: 1.17e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 151/1500 [09:13<1:16:00, 3.38s/it]Epoch 150 | Step 2101/ 21000 | Loss: 0.036517 | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 150 | Avg Loss: 0.031243 | LR: 1.17e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 152/1500 [09:16<1:15:44, 3.37s/it]Epoch 151 | Step 2115/ 21000 | Loss: 0.034116 | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 151 | Avg Loss: 0.036845 | LR: 1.18e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 153/1500 [09:20<1:16:03, 3.39s/it]Epoch 152 | Step 2129/ 21000 | Loss: 0.053794 | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 152 | Avg Loss: 0.048498 | LR: 1.19e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 154/1500 [09:23<1:15:33, 3.37s/it]Epoch 153 | Step 2143/ 21000 | Loss: 0.051553 | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 153 | Avg Loss: 0.072978 | LR: 1.19e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 155/1500 [09:26<1:15:20, 3.36s/it]Epoch 154 | Step 2157/ 21000 | Loss: 0.104989 | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 154 | Avg Loss: 0.123019 | LR: 1.20e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 156/1500 [09:30<1:15:30, 3.37s/it]Epoch 155 | Step 2171/ 21000 | Loss: 0.128032 | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 155 | Avg Loss: 0.139193 | LR: 1.21e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 10%|β | 157/1500 [09:33<1:16:11, 3.40s/it]Epoch 156 | Step 2185/ 21000 | Loss: 0.136658 | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 156 | Avg Loss: 0.101100 | LR: 1.22e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 158/1500 [09:36<1:15:12, 3.36s/it]Epoch 157 | Step 2199/ 21000 | Loss: 0.076101 | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 157 | Avg Loss: 0.064460 | LR: 1.22e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 159/1500 [09:40<1:15:15, 3.37s/it]Epoch 158 | Step 2213/ 21000 | Loss: 0.048741 | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 158 | Avg Loss: 0.048600 | LR: 1.23e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 160/1500 [09:43<1:14:39, 3.34s/it]Epoch 159 | Step 2227/ 21000 | Loss: 0.042711 | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 159 | Avg Loss: 0.044990 | LR: 1.24e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 161/1500 [09:46<1:14:06, 3.32s/it]Epoch 160 | Step 2241/ 21000 | Loss: 0.047004 | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 160 | Avg Loss: 0.041976 | LR: 1.24e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 162/1500 [09:50<1:15:12, 3.37s/it]Epoch 161 | Step 2255/ 21000 | Loss: 0.046487 | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 161 | Avg Loss: 0.054036 | LR: 1.25e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 163/1500 [09:53<1:16:14, 3.42s/it]Epoch 162 | Step 2269/ 21000 | Loss: 0.058590 | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 162 | Avg Loss: 0.059503 | LR: 1.26e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 164/1500 [09:57<1:14:52, 3.36s/it]Epoch 163 | Step 2283/ 21000 | Loss: 0.054144 | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 163 | Avg Loss: 0.046268 | LR: 1.26e-03 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 165/1500 [10:00<1:14:05, 3.33s/it]Epoch 164 | Step 2297/ 21000 | Loss: 0.044274 | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 164 | Avg Loss: 0.044183 | LR: 1.27e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 166/1500 [10:03<1:13:54, 3.32s/it]Epoch 165 | Step 2311/ 21000 | Loss: 0.057007 | LR: 1.27e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 165 | Avg Loss: 0.052928 | LR: 1.28e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 167/1500 [10:07<1:14:52, 3.37s/it]Epoch 166 | Step 2325/ 21000 | Loss: 0.046256 | LR: 1.28e-03 | Speed: 3.8 steps/s | ETA: 1.4h |
| Epoch 166 | Avg Loss: 0.045207 | LR: 1.28e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 11%|β | 168/1500 [10:10<1:14:07, 3.34s/it]Epoch 167 | Step 2339/ 21000 | Loss: 0.042863 | LR: 1.28e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 167 | Avg Loss: 0.042001 | LR: 1.29e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 11%|ββ | 169/1500 [10:13<1:14:26, 3.36s/it]Epoch 168 | Step 2353/ 21000 | Loss: 0.041435 | LR: 1.29e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 168 | Avg Loss: 0.043884 | LR: 1.30e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 11%|ββ | 170/1500 [10:17<1:15:11, 3.39s/it]Epoch 169 | Step 2367/ 21000 | Loss: 0.040525 | LR: 1.30e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 169 | Avg Loss: 0.037584 | LR: 1.30e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 11%|ββ | 171/1500 [10:20<1:14:37, 3.37s/it]Epoch 170 | Step 2381/ 21000 | Loss: 0.045585 | LR: 1.30e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 170 | Avg Loss: 0.038397 | LR: 1.31e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 11%|ββ | 172/1500 [10:23<1:14:34, 3.37s/it]Epoch 171 | Step 2395/ 21000 | Loss: 0.044551 | LR: 1.31e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 171 | Avg Loss: 0.039978 | LR: 1.32e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 173/1500 [10:27<1:19:16, 3.58s/it]Epoch 172 | Step 2409/ 21000 | Loss: 0.056261 | LR: 1.32e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 172 | Avg Loss: 0.054898 | LR: 1.32e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 174/1500 [10:32<1:22:53, 3.75s/it]Epoch 173 | Step 2423/ 21000 | Loss: 0.055114 | LR: 1.32e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 173 | Avg Loss: 0.053797 | LR: 1.33e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 175/1500 [10:36<1:25:43, 3.88s/it]Epoch 174 | Step 2437/ 21000 | Loss: 0.052196 | LR: 1.33e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 174 | Avg Loss: 0.051997 | LR: 1.34e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 176/1500 [10:40<1:27:04, 3.95s/it]Epoch 175 | Step 2451/ 21000 | Loss: 0.043481 | LR: 1.34e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 175 | Avg Loss: 0.041886 | LR: 1.34e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 177/1500 [10:44<1:28:34, 4.02s/it]Epoch 176 | Step 2465/ 21000 | Loss: 0.042922 | LR: 1.34e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 176 | Avg Loss: 0.046460 | LR: 1.35e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 178/1500 [10:48<1:28:13, 4.00s/it]Epoch 177 | Step 2479/ 21000 | Loss: 0.058198 | LR: 1.35e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 177 | Avg Loss: 0.047011 | LR: 1.36e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 179/1500 [10:52<1:28:48, 4.03s/it]Epoch 178 | Step 2493/ 21000 | Loss: 0.071548 | LR: 1.36e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 178 | Avg Loss: 0.051371 | LR: 1.36e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 180/1500 [10:56<1:28:53, 4.04s/it]Epoch 179 | Step 2507/ 21000 | Loss: 0.050783 | LR: 1.36e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 179 | Avg Loss: 0.051891 | LR: 1.37e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 181/1500 [11:01<1:30:35, 4.12s/it]Epoch 180 | Step 2521/ 21000 | Loss: 0.048897 | LR: 1.37e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 180 | Avg Loss: 0.045531 | LR: 1.38e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 182/1500 [11:05<1:29:45, 4.09s/it]Epoch 181 | Step 2535/ 21000 | Loss: 0.034484 | LR: 1.38e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 181 | Avg Loss: 0.072573 | LR: 1.38e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 183/1500 [11:09<1:30:32, 4.13s/it]Epoch 182 | Step 2549/ 21000 | Loss: 0.119103 | LR: 1.38e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 182 | Avg Loss: 0.162198 | LR: 1.39e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 184/1500 [11:13<1:29:48, 4.09s/it]Epoch 183 | Step 2563/ 21000 | Loss: 0.277480 | LR: 1.39e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 183 | Avg Loss: 0.859323 | LR: 1.40e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 185/1500 [11:17<1:29:41, 4.09s/it]Epoch 184 | Step 2577/ 21000 | Loss: 1.219188 | LR: 1.40e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 184 | Avg Loss: 1.177793 | LR: 1.40e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 186/1500 [11:21<1:29:13, 4.07s/it]Epoch 185 | Step 2591/ 21000 | Loss: 1.151801 | LR: 1.40e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 185 | Avg Loss: 1.152081 | LR: 1.41e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 12%|ββ | 187/1500 [11:25<1:27:55, 4.02s/it]Epoch 186 | Step 2605/ 21000 | Loss: 1.148472 | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 186 | Avg Loss: 1.151613 | LR: 1.42e-03 | Time: 3.9s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 188/1500 [11:29<1:30:20, 4.13s/it]Epoch 187 | Step 2619/ 21000 | Loss: 1.150249 | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 187 | Avg Loss: 1.151310 | LR: 1.42e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 189/1500 [11:33<1:29:18, 4.09s/it]Epoch 188 | Step 2633/ 21000 | Loss: 1.151358 | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 188 | Avg Loss: 1.151897 | LR: 1.43e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 190/1500 [11:37<1:28:50, 4.07s/it]Epoch 189 | Step 2647/ 21000 | Loss: 1.152040 | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 189 | Avg Loss: 1.152013 | LR: 1.44e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 191/1500 [11:41<1:27:32, 4.01s/it]Epoch 190 | Step 2661/ 21000 | Loss: 1.151124 | LR: 1.44e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 190 | Avg Loss: 1.152279 | LR: 1.44e-03 | Time: 3.9s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 192/1500 [11:45<1:27:50, 4.03s/it]Epoch 191 | Step 2675/ 21000 | Loss: 1.152866 | LR: 1.44e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 191 | Avg Loss: 1.151694 | LR: 1.45e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 193/1500 [11:49<1:27:23, 4.01s/it]Epoch 192 | Step 2689/ 21000 | Loss: 1.154455 | LR: 1.45e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 192 | Avg Loss: 1.152716 | LR: 1.46e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 194/1500 [11:53<1:27:32, 4.02s/it]Epoch 193 | Step 2703/ 21000 | Loss: 1.152693 | LR: 1.46e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 193 | Avg Loss: 1.152033 | LR: 1.46e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 195/1500 [11:57<1:28:06, 4.05s/it]Epoch 194 | Step 2717/ 21000 | Loss: 1.152691 | LR: 1.46e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 194 | Avg Loss: 1.152434 | LR: 1.47e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 196/1500 [12:01<1:27:22, 4.02s/it]Epoch 195 | Step 2731/ 21000 | Loss: nan | LR: 1.47e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 195 | Avg Loss: nan | LR: 1.48e-03 | Time: 3.9s | Samples: 6,983 |
|
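The loss first turns `nan` at epoch 195 above, after climbing from ~0.04 to a ~1.15 plateau while the learning rate warms toward 1.6e-3. One common mitigation is to skip any batch whose loss is non-finite rather than letting it poison the weights. A framework-agnostic sketch of that guard (the `safe_step` helper and its arguments are hypothetical, not part of this training script):

```python
import math

def safe_step(losses):
    """Count updates that would be applied vs. skipped when a loss is
    non-finite. Hypothetical helper illustrating the guard pattern only."""
    applied, skipped = 0, 0
    for loss in losses:
        if not math.isfinite(loss):
            skipped += 1   # drop this batch instead of updating on nan/inf
            continue
        applied += 1       # a real loop would clip gradients and step here
    return applied, skipped

# Mirrors the log: finite losses, then a divergent run of nan epochs.
print(safe_step([0.043, 0.162, 1.15, float("nan"), float("nan")]))  # (3, 2)
```

In an AMP setup like this one, the same effect is often achieved by checking `torch.isfinite(loss)` before `scaler.scale(loss).backward()`, optionally combined with gradient clipping after `scaler.unscale_(optimizer)`.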
Training Flow Model: 13%|ββ | 197/1500 [12:05<1:28:55, 4.10s/it]Epoch 196 | Step 2745/ 21000 | Loss: nan | LR: 1.48e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 196 | Avg Loss: nan | LR: 1.48e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 198/1500 [12:10<1:28:32, 4.08s/it]Epoch 197 | Step 2759/ 21000 | Loss: nan | LR: 1.48e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 197 | Avg Loss: nan | LR: 1.49e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 199/1500 [12:14<1:28:06, 4.06s/it]Epoch 198 | Step 2773/ 21000 | Loss: nan | LR: 1.49e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 198 | Avg Loss: nan | LR: 1.50e-03 | Time: 4.0s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 200/1500 [12:18<1:28:09, 4.07s/it]Epoch 199 | Step 2787/ 21000 | Loss: nan | LR: 1.50e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 199 | Avg Loss: nan | LR: 1.50e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 201/1500 [12:21<1:23:52, 3.87s/it]Epoch 200 | Step 2801/ 21000 | Loss: nan | LR: 1.50e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 200 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 13%|ββ | 202/1500 [12:25<1:21:15, 3.76s/it]Epoch 201 | Step 2815/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 201 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 203/1500 [12:28<1:19:18, 3.67s/it]Epoch 202 | Step 2829/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 202 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 204/1500 [12:31<1:17:54, 3.61s/it]Epoch 203 | Step 2843/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 203 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 205/1500 [12:35<1:15:48, 3.51s/it]Epoch 204 | Step 2857/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 204 | Avg Loss: nan | LR: 1.54e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 206/1500 [12:39<1:20:30, 3.73s/it]Epoch 205 | Step 2871/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 205 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 207/1500 [12:43<1:24:25, 3.92s/it]Epoch 206 | Step 2885/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 206 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 208/1500 [12:48<1:26:55, 4.04s/it]Epoch 207 | Step 2899/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 207 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 209/1500 [12:52<1:29:08, 4.14s/it]Epoch 208 | Step 2913/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 208 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 210/1500 [12:56<1:29:38, 4.17s/it]Epoch 209 | Step 2927/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 209 | Avg Loss: nan | LR: 1.57e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 211/1500 [13:01<1:30:55, 4.23s/it]Epoch 210 | Step 2941/ 21000 | Loss: nan | LR: 1.57e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 210 | Avg Loss: nan | LR: 1.58e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 212/1500 [13:05<1:32:27, 4.31s/it]Epoch 211 | Step 2955/ 21000 | Loss: nan | LR: 1.58e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 211 | Avg Loss: nan | LR: 1.58e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 213/1500 [13:10<1:32:42, 4.32s/it]Epoch 212 | Step 2969/ 21000 | Loss: nan | LR: 1.59e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 212 | Avg Loss: nan | LR: 1.59e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 214/1500 [13:14<1:33:04, 4.34s/it]Epoch 213 | Step 2983/ 21000 | Loss: nan | LR: 1.59e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 213 | Avg Loss: nan | LR: 1.60e-03 | Time: 4.4s | Samples: 6,983 |
| /home/edwardsun/miniconda3/envs/flow/lib/python3.9/site-packages/torch/optim/lr_scheduler.py:240: UserWarning: The epoch parameter in `scheduler.step()` was not necessary and is being deprecated where possible. Please use `scheduler.step()` to step the scheduler. During the deprecation, if epoch is different from None, the closed form is used instead of the new chainable form, where available. Please open an issue if you are unable to replicate your use case: https: |
| warnings.warn(EPOCH_DEPRECATION_WARNING, UserWarning) |
|
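The deprecation warning above is emitted when `scheduler.step()` is called with an explicit `epoch` argument; the recommended form takes no argument and is stepped once per epoch after the optimizer update. A minimal sketch of that pattern on a toy model, using an assumed `LinearLR` warmup (none of these names come from the actual script):

```python
import torch

# Toy setup -- assumed, not the script's actual model/optimizer/scheduler.
model = torch.nn.Linear(4, 4)
opt = torch.optim.SGD(model.parameters(), lr=1.6e-3)
# Linear warmup from 10% to 100% of the base LR over 5 scheduler steps.
sched = torch.optim.lr_scheduler.LinearLR(opt, start_factor=0.1, total_iters=5)

for epoch in range(3):
    opt.step()    # training step elided
    sched.step()  # recommended: no `epoch` argument (the old form is deprecated)

# warmup factor after 3 steps = 0.1 + 0.9 * 3/5 = 0.64
print(opt.param_groups[0]["lr"])  # 1.6e-3 * 0.64 ~= 1.024e-3
```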
Training Flow Model: 14%|ββ | 215/1500 [13:18<1:32:08, 4.30s/it]Epoch 214 | Step 2997/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 214 | Avg Loss: nan | LR: 1.60e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 216/1500 [13:22<1:31:31, 4.28s/it]Epoch 215 | Step 3011/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 215 | Avg Loss: nan | LR: 1.60e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 14%|ββ | 217/1500 [13:27<1:31:12, 4.27s/it]Epoch 216 | Step 3025/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 216 | Avg Loss: nan | LR: 1.60e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 218/1500 [13:31<1:30:35, 4.24s/it]Epoch 217 | Step 3039/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 217 | Avg Loss: nan | LR: 1.60e-03 | Time: 4.2s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 219/1500 [13:35<1:30:45, 4.25s/it]Epoch 218 | Step 3053/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 218 | Avg Loss: nan | LR: 1.60e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 220/1500 [13:39<1:31:51, 4.31s/it]Epoch 219 | Step 3067/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 219 | Avg Loss: nan | LR: 1.60e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 221/1500 [13:44<1:31:51, 4.31s/it]Epoch 220 | Step 3081/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.7 steps/s | ETA: 1.3h |
| Epoch 220 | Avg Loss: nan | LR: 1.60e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 222/1500 [13:47<1:25:09, 4.00s/it]Epoch 221 | Step 3095/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.7 steps/s | ETA: 1.3h |
| Epoch 221 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 223/1500 [13:50<1:19:59, 3.76s/it]Epoch 222 | Step 3109/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.7 steps/s | ETA: 1.3h |
| Epoch 222 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 224/1500 [13:53<1:16:06, 3.58s/it]Epoch 223 | Step 3123/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 223 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 225/1500 [13:57<1:15:24, 3.55s/it]Epoch 224 | Step 3137/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 224 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 226/1500 [14:00<1:14:16, 3.50s/it]Epoch 225 | Step 3151/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 225 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 227/1500 [14:04<1:13:14, 3.45s/it]Epoch 226 | Step 3165/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 226 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 228/1500 [14:07<1:11:49, 3.39s/it]Epoch 227 | Step 3179/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 227 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 229/1500 [14:10<1:11:02, 3.35s/it]Epoch 228 | Step 3193/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 228 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 230/1500 [14:13<1:10:05, 3.31s/it]Epoch 229 | Step 3207/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 229 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.2s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 231/1500 [14:17<1:09:49, 3.30s/it]Epoch 230 | Step 3221/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 230 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 15%|ββ | 232/1500 [14:20<1:09:41, 3.30s/it]Epoch 231 | Step 3235/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 231 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 16%|ββ | 233/1500 [14:23<1:10:26, 3.34s/it]Epoch 232 | Step 3249/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 232 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 16%|ββ | 234/1500 [14:27<1:11:28, 3.39s/it]Epoch 233 | Step 3263/ 21000 | Loss: nan | LR: 1.60e-03 | Speed: 3.8 steps/s | ETA: 1.3h |
| Epoch 233 | Avg Loss: nan | LR: 1.60e-03 | Time: 3.5s | Samples: 6,983 |
|
Epochs 234–299 | Steps 3277–4187/21000 | Loss: nan (every step) | LR: 1.60e-03 → 1.59e-03 | Speed: ~3.8 steps/s (~3.3s/epoch) | Samples: 6,983/epoch | ETA: 1.2–1.3h
✓ Checkpoint saved: /data2/edwardsun/flow_checkpoints/amp_flow_model_final_optimized.pth (loss: nan, step: 4200)
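Note that the checkpoint above is written even though the tracked loss is nan, so it can silently overwrite a usable model. A minimal sketch of a guard (the helper name `should_save_checkpoint` is hypothetical, not from the training script) that refuses to checkpoint on a non-finite loss:

```python
import math

def should_save_checkpoint(loss: float) -> bool:
    """Only allow a checkpoint save when the tracked loss is finite.

    Prevents overwriting a good checkpoint with weights from a
    diverged run (nan or inf loss), as seen in the log above.
    """
    return math.isfinite(loss)

assert should_save_checkpoint(0.42)
assert not should_save_checkpoint(float("nan"))
assert not should_save_checkpoint(float("inf"))
```

The same check on the per-step loss (e.g. skipping the optimizer step and re-loading the last good checkpoint) is a common way to stop a single nan from poisoning the rest of training.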
|
Epochs 300–356 | Steps 4201–4985/21000 | Loss: nan (every step) | LR: 1.59e-03 → 1.58e-03 | Speed: ~3.9 steps/s (~3.3s/epoch) | Samples: 6,983/epoch | ETA: 1.2–1.1h
| /data2/edwardsun/flow_home/cfg_dataset.py:360: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). |
| 'index': torch.tensor(idx, dtype=torch.long) |
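The `cfg_dataset.py:360` warning fires because `idx` is already a tensor and `torch.tensor(idx, dtype=torch.long)` copy-constructs from it. A minimal sketch of the replacement the warning itself suggests (assuming `idx` arrives as a tensor; the variable names here are illustrative):

```python
import torch

idx = torch.tensor(7)  # in the dataset, idx may already be a tensor

# Warned pattern: torch.tensor(idx, dtype=torch.long) copies from a tensor.
# Preferred, per the warning: clone().detach(), then cast.
index = idx.clone().detach().to(dtype=torch.long)

assert index.item() == 7
assert index.dtype == torch.long
```

If `idx` can be either a plain int or a tensor, `torch.as_tensor(idx, dtype=torch.long)` handles both without triggering the warning for ints.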
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:392: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead. |
| with autocast(dtype=torch.bfloat16): |
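The deprecation warning at `amp_flow_training_single_gpu_full_data.py:392` asks for the device-generic API. A minimal sketch of the migration (falling back to CPU here only so the snippet runs anywhere; the training script targets CUDA):

```python
import torch

# Deprecated: torch.cuda.amp.autocast(dtype=torch.bfloat16)
# Replacement suggested by the warning: torch.amp.autocast(device_type, ...)
device_type = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(4, 4)
with torch.amp.autocast(device_type, dtype=torch.bfloat16):
    y = x @ x  # matmul runs under bf16 autocast where supported

assert y.shape == (4, 4)
```

The same one-line change (`torch.amp.GradScaler('cuda')` in place of `torch.cuda.amp.GradScaler()`) silences the scaler deprecation warning emitted at startup.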
|
Epoch 357 | Step 4999/21000 | Loss: nan | LR: 1.58e-03 | Time: 11.1s | Samples: 6,983
Validation at step 5000: Loss = nan
|
Epochs 358–381 | Steps 5013–5335/21000 | Loss: nan (every step) | LR: 1.58e-03 → 1.57e-03 | Speed: 3.9 steps/s (~3.5s/epoch) | Samples: 6,983/epoch | ETA: 1.1h
| Epoch 381 | Avg Loss: nan | LR: 1.57e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 383/1500 [22:52<1:05:15, 3.51s/it]Epoch 382 | Step 5349/ 21000 | Loss: nan | LR: 1.57e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 382 | Avg Loss: nan | LR: 1.57e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 384/1500 [22:56<1:05:31, 3.52s/it]Epoch 383 | Step 5363/ 21000 | Loss: nan | LR: 1.57e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 383 | Avg Loss: nan | LR: 1.57e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 385/1500 [22:59<1:05:54, 3.55s/it]Epoch 384 | Step 5377/ 21000 | Loss: nan | LR: 1.57e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 384 | Avg Loss: nan | LR: 1.57e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 386/1500 [23:03<1:05:27, 3.53s/it]Epoch 385 | Step 5391/ 21000 | Loss: nan | LR: 1.57e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 385 | Avg Loss: nan | LR: 1.57e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 387/1500 [23:07<1:05:58, 3.56s/it]Epoch 386 | Step 5405/ 21000 | Loss: nan | LR: 1.57e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 386 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 388/1500 [23:10<1:05:05, 3.51s/it]Epoch 387 | Step 5419/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 387 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 389/1500 [23:14<1:05:30, 3.54s/it]Epoch 388 | Step 5433/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 388 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 390/1500 [23:17<1:05:29, 3.54s/it]Epoch 389 | Step 5447/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 389 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 391/1500 [23:21<1:04:59, 3.52s/it]Epoch 390 | Step 5461/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 390 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 392/1500 [23:24<1:05:49, 3.56s/it]Epoch 391 | Step 5475/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 391 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 393/1500 [23:28<1:04:30, 3.50s/it]Epoch 392 | Step 5489/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 392 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 394/1500 [23:31<1:04:20, 3.49s/it]Epoch 393 | Step 5503/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 393 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 395/1500 [23:35<1:05:23, 3.55s/it]Epoch 394 | Step 5517/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 394 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 396/1500 [23:38<1:04:55, 3.53s/it]Epoch 395 | Step 5531/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 395 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 26%|βββ | 397/1500 [23:42<1:03:54, 3.48s/it]Epoch 396 | Step 5545/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 396 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 398/1500 [23:45<1:03:39, 3.47s/it]Epoch 397 | Step 5559/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 397 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 399/1500 [23:49<1:04:17, 3.50s/it]Epoch 398 | Step 5573/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 398 | Avg Loss: nan | LR: 1.56e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 400/1500 [23:53<1:09:11, 3.77s/it]Epoch 399 | Step 5587/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 399 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 401/1500 [23:57<1:11:54, 3.93s/it]Epoch 400 | Step 5601/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 400 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 402/1500 [24:02<1:14:30, 4.07s/it]Epoch 401 | Step 5615/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 401 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 403/1500 [24:06<1:17:32, 4.24s/it]Epoch 402 | Step 5629/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 402 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 404/1500 [24:11<1:19:07, 4.33s/it]Epoch 403 | Step 5643/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 403 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 405/1500 [24:15<1:19:33, 4.36s/it]Epoch 404 | Step 5657/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 404 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 406/1500 [24:20<1:19:02, 4.34s/it]Epoch 405 | Step 5671/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 405 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 407/1500 [24:24<1:19:31, 4.37s/it]Epoch 406 | Step 5685/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 406 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 408/1500 [24:29<1:21:16, 4.47s/it]Epoch 407 | Step 5699/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 407 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 409/1500 [24:33<1:21:00, 4.46s/it]Epoch 408 | Step 5713/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 408 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 410/1500 [24:38<1:20:49, 4.45s/it]Epoch 409 | Step 5727/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 409 | Avg Loss: nan | LR: 1.56e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 411/1500 [24:42<1:20:13, 4.42s/it]Epoch 410 | Step 5741/ 21000 | Loss: nan | LR: 1.56e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 410 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 27%|βββ | 412/1500 [24:46<1:20:40, 4.45s/it]Epoch 411 | Step 5755/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 411 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 413/1500 [24:51<1:22:04, 4.53s/it]Epoch 412 | Step 5769/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 412 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 414/1500 [24:56<1:21:20, 4.49s/it]Epoch 413 | Step 5783/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 413 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 415/1500 [25:00<1:21:11, 4.49s/it]Epoch 414 | Step 5797/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 414 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 416/1500 [25:05<1:20:52, 4.48s/it]Epoch 415 | Step 5811/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 415 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 417/1500 [25:09<1:20:19, 4.45s/it]Epoch 416 | Step 5825/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 416 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 418/1500 [25:14<1:21:22, 4.51s/it]Epoch 417 | Step 5839/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 417 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 419/1500 [25:18<1:20:32, 4.47s/it]Epoch 418 | Step 5853/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 418 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 420/1500 [25:22<1:20:12, 4.46s/it]Epoch 419 | Step 5867/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 419 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 421/1500 [25:27<1:19:15, 4.41s/it]Epoch 420 | Step 5881/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 420 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 422/1500 [25:31<1:18:44, 4.38s/it]Epoch 421 | Step 5895/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 421 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 423/1500 [25:36<1:21:13, 4.52s/it]Epoch 422 | Step 5909/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 422 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.9s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 424/1500 [25:40<1:20:07, 4.47s/it]Epoch 423 | Step 5923/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.9 steps/s | ETA: 1.1h |
| Epoch 423 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 425/1500 [25:45<1:19:32, 4.44s/it]Epoch 424 | Step 5937/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 424 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 426/1500 [25:49<1:19:20, 4.43s/it]Epoch 425 | Step 5951/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 425 | Avg Loss: nan | LR: 1.55e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 28%|βββ | 427/1500 [25:53<1:15:10, 4.20s/it]Epoch 426 | Step 5965/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 426 | Avg Loss: nan | LR: 1.55e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 428/1500 [25:56<1:11:44, 4.02s/it]Epoch 427 | Step 5979/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 427 | Avg Loss: nan | LR: 1.55e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 429/1500 [26:00<1:09:48, 3.91s/it]Epoch 428 | Step 5993/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 428 | Avg Loss: nan | LR: 1.55e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 430/1500 [26:04<1:08:33, 3.84s/it]Epoch 429 | Step 6007/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 429 | Avg Loss: nan | LR: 1.55e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 431/1500 [26:07<1:08:36, 3.85s/it]Epoch 430 | Step 6021/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 430 | Avg Loss: nan | LR: 1.55e-03 | Time: 3.9s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 432/1500 [26:12<1:13:09, 4.11s/it]Epoch 431 | Step 6035/ 21000 | Loss: nan | LR: 1.55e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 431 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 433/1500 [26:17<1:16:17, 4.29s/it]Epoch 432 | Step 6049/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 432 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 434/1500 [26:22<1:18:12, 4.40s/it]Epoch 433 | Step 6063/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 433 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 435/1500 [26:26<1:18:36, 4.43s/it]Epoch 434 | Step 6077/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 434 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 436/1500 [26:30<1:18:39, 4.44s/it]Epoch 435 | Step 6091/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 435 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 437/1500 [26:35<1:18:55, 4.45s/it]Epoch 436 | Step 6105/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 436 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 438/1500 [26:39<1:19:09, 4.47s/it]Epoch 437 | Step 6119/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 437 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 439/1500 [26:44<1:19:40, 4.51s/it]Epoch 438 | Step 6133/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 438 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 440/1500 [26:49<1:20:46, 4.57s/it]Epoch 439 | Step 6147/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 439 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 441/1500 [26:54<1:22:05, 4.65s/it]Epoch 440 | Step 6161/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 440 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 29%|βββ | 442/1500 [26:59<1:24:09, 4.77s/it]Epoch 441 | Step 6175/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 441 | Avg Loss: nan | LR: 1.54e-03 | Time: 5.1s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 443/1500 [27:03<1:23:08, 4.72s/it]Epoch 442 | Step 6189/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 442 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 444/1500 [27:08<1:22:16, 4.67s/it]Epoch 443 | Step 6203/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 443 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 445/1500 [27:12<1:20:44, 4.59s/it]Epoch 444 | Step 6217/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 444 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 446/1500 [27:17<1:19:25, 4.52s/it]Epoch 445 | Step 6231/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 445 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 447/1500 [27:21<1:19:01, 4.50s/it]Epoch 446 | Step 6245/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 446 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 448/1500 [27:26<1:19:19, 4.52s/it]Epoch 447 | Step 6259/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 447 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 449/1500 [27:30<1:20:49, 4.61s/it]Epoch 448 | Step 6273/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 448 | Avg Loss: nan | LR: 1.54e-03 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 450/1500 [27:34<1:14:41, 4.27s/it]Epoch 449 | Step 6287/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 449 | Avg Loss: nan | LR: 1.54e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 451/1500 [27:37<1:10:43, 4.05s/it]Epoch 450 | Step 6301/ 21000 | Loss: nan | LR: 1.54e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 450 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 452/1500 [27:41<1:08:11, 3.90s/it]Epoch 451 | Step 6315/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 451 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 453/1500 [27:45<1:06:23, 3.80s/it]Epoch 452 | Step 6329/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 452 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 454/1500 [27:48<1:04:45, 3.71s/it]Epoch 453 | Step 6343/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 453 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 455/1500 [27:52<1:03:32, 3.65s/it]Epoch 454 | Step 6357/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 454 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 456/1500 [27:55<1:02:47, 3.61s/it]Epoch 455 | Step 6371/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 455 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 30%|βββ | 457/1500 [27:59<1:02:30, 3.60s/it]Epoch 456 | Step 6385/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 456 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 458/1500 [28:02<1:02:44, 3.61s/it]Epoch 457 | Step 6399/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 457 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 459/1500 [28:06<1:02:36, 3.61s/it]Epoch 458 | Step 6413/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 458 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 460/1500 [28:09<1:01:45, 3.56s/it]Epoch 459 | Step 6427/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 459 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 461/1500 [28:13<1:01:30, 3.55s/it]Epoch 460 | Step 6441/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 460 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 462/1500 [28:16<1:01:26, 3.55s/it]Epoch 461 | Step 6455/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 461 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 463/1500 [28:20<1:01:05, 3.53s/it]Epoch 462 | Step 6469/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 462 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 464/1500 [28:23<1:00:24, 3.50s/it]Epoch 463 | Step 6483/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 463 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 465/1500 [28:27<1:00:00, 3.48s/it]Epoch 464 | Step 6497/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 464 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 466/1500 [28:30<59:39, 3.46s/it] Epoch 465 | Step 6511/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 465 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 467/1500 [28:34<1:00:00, 3.49s/it]Epoch 466 | Step 6525/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 466 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 31%|βββ | 468/1500 [28:37<1:00:00, 3.49s/it]Epoch 467 | Step 6539/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 467 | Avg Loss: nan | LR: 1.53e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 31%|ββββ | 469/1500 [28:41<1:00:21, 3.51s/it]Epoch 468 | Step 6553/ 21000 | Loss: nan | LR: 1.53e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 468 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 31%|ββββ | 470/1500 [28:44<1:00:19, 3.51s/it]Epoch 469 | Step 6567/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 469 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 31%|ββββ | 471/1500 [28:48<1:00:05, 3.50s/it]Epoch 470 | Step 6581/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.1h |
| Epoch 470 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 31%|ββββ | 472/1500 [28:51<1:00:27, 3.53s/it]Epoch 471 | Step 6595/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 471 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 473/1500 [28:55<1:00:55, 3.56s/it]Epoch 472 | Step 6609/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 472 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 474/1500 [28:59<1:00:30, 3.54s/it]Epoch 473 | Step 6623/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 473 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 475/1500 [29:02<59:48, 3.50s/it] Epoch 474 | Step 6637/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 474 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 476/1500 [29:05<59:26, 3.48s/it]Epoch 475 | Step 6651/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 475 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 477/1500 [29:09<59:15, 3.48s/it]Epoch 476 | Step 6665/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 476 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 478/1500 [29:12<59:45, 3.51s/it]Epoch 477 | Step 6679/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 477 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 479/1500 [29:16<1:00:07, 3.53s/it]Epoch 478 | Step 6693/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 478 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 480/1500 [29:20<1:00:19, 3.55s/it]Epoch 479 | Step 6707/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 479 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 481/1500 [29:23<1:00:05, 3.54s/it]Epoch 480 | Step 6721/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 480 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 482/1500 [29:27<59:34, 3.51s/it] Epoch 481 | Step 6735/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 481 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 483/1500 [29:30<59:02, 3.48s/it]Epoch 482 | Step 6749/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 482 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 484/1500 [29:33<58:19, 3.44s/it]Epoch 483 | Step 6763/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 483 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 485/1500 [29:37<58:13, 3.44s/it]Epoch 484 | Step 6777/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 484 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 486/1500 [29:40<58:59, 3.49s/it]Epoch 485 | Step 6791/ 21000 | Loss: nan | LR: 1.52e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 485 | Avg Loss: nan | LR: 1.52e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 32%|ββββ | 487/1500 [29:44<59:23, 3.52s/it]Epoch 486 | Step 6805/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 486 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 488/1500 [29:47<59:31, 3.53s/it]Epoch 487 | Step 6819/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 487 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 489/1500 [29:51<59:47, 3.55s/it]Epoch 488 | Step 6833/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 488 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 490/1500 [29:55<1:00:01, 3.57s/it]Epoch 489 | Step 6847/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 489 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 491/1500 [29:58<59:00, 3.51s/it] Epoch 490 | Step 6861/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 490 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 492/1500 [30:02<59:17, 3.53s/it]Epoch 491 | Step 6875/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 491 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 493/1500 [30:05<58:39, 3.50s/it]Epoch 492 | Step 6889/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 492 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 494/1500 [30:09<59:00, 3.52s/it]Epoch 493 | Step 6903/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 493 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 495/1500 [30:12<59:06, 3.53s/it]Epoch 494 | Step 6917/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 494 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 496/1500 [30:16<59:30, 3.56s/it]Epoch 495 | Step 6931/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 495 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 497/1500 [30:19<58:51, 3.52s/it]Epoch 496 | Step 6945/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 496 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 498/1500 [30:23<58:52, 3.53s/it]Epoch 497 | Step 6959/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 497 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 499/1500 [30:26<58:19, 3.50s/it]Epoch 498 | Step 6973/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 498 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 500/1500 [30:30<58:08, 3.49s/it]Epoch 499 | Step 6987/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 499 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 501/1500 [30:33<59:39, 3.58s/it]Epoch 500 | Step 7001/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 500 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.8s | Samples: 6,983 |
|
Training Flow Model: 33%|ββββ | 502/1500 [30:37<58:41, 3.53s/it]Epoch 501 | Step 7015/ 21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 501 | Avg Loss: nan | LR: 1.51e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model:  34%|███▍      | 503/1500 [30:41<59:13, 3.56s/it]Epoch 502 | Step 7029/21000 | Loss: nan | LR: 1.51e-03 | Speed: 3.8 steps/s | ETA: 1.0h |
| Epoch 502 | Avg Loss: nan | LR: 1.50e-03 | Time: 3.6s | Samples: 6,983 |
|
[epochs 503-597 elided: Loss and Avg Loss are nan every epoch; LR decays 1.50e-03 to 1.44e-03; ~3.5 s/it, 3.8 steps/s, 6,983 samples per epoch throughout]
|
Training Flow Model:  40%|███▉      | 599/1500 [36:18<51:57, 3.46s/it]Epoch 598 | Step 8373/21000 | Loss: nan | LR: 1.44e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 598 | Avg Loss: nan | LR: 1.44e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model:  40%|████      | 600/1500 [36:25<1:07:13, 4.48s/it]Epoch 599 | Step 8387/21000 | Loss: nan | LR: 1.44e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 599 | Avg Loss: nan | LR: 1.44e-03 | Time: 3.6s | Samples: 6,983 |
| ✓ Checkpoint saved: /data2/edwardsun/flow_checkpoints/amp_flow_model_final_optimized.pth (loss: nan, step: 8400) |
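[Editor's note] The loss has been nan since at least epoch 502, yet the run keeps going and the checkpoint above is written with `loss: nan`, overwriting whatever healthy weights were saved before the divergence. A minimal, framework-agnostic sketch of a guard that aborts after a streak of non-finite losses and withholds the checkpoint; `train_with_nan_guard`, `step_fn`, and `patience` are illustrative names and are not part of the training script that produced this log.

```python
import math

def train_with_nan_guard(step_fn, epochs, patience=10):
    """Run step_fn once per epoch; abort once the loss has been
    non-finite (nan/inf) for `patience` consecutive epochs, instead
    of burning hours and checkpointing a ruined model."""
    bad_streak = 0
    for epoch in range(epochs):
        loss = step_fn(epoch)
        if not math.isfinite(loss):
            bad_streak += 1
            if bad_streak >= patience:
                return epoch, False   # aborted; do NOT save a checkpoint
        else:
            bad_streak = 0
    return epochs - 1, True           # finished; safe to checkpoint

# Simulated run: loss diverges at epoch 4 and never recovers
losses = [0.9, 0.5, 0.3, 2.0, float("inf")] + [float("nan")] * 100
last_epoch, healthy = train_with_nan_guard(lambda e: losses[e], len(losses), patience=10)
```

With `patience=10`, the guard fires at epoch 13 (streak of 10 non-finite losses starting at epoch 4) and returns `healthy=False`, so the caller can skip the checkpoint and dump diagnostics instead. In an AMP loop the same check belongs right after the loss is computed, before `scaler.step(optimizer)`.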
|
Training Flow Model: 40%|ββββ | 601/1500 [36:28<1:02:32, 4.17s/it]Epoch 600 | Step 8401/ 21000 | Loss: nan | LR: 1.44e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 600 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 40%|ββββ | 602/1500 [36:32<59:24, 3.97s/it] Epoch 601 | Step 8415/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 601 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 40%|ββββ | 603/1500 [36:35<57:11, 3.83s/it]Epoch 602 | Step 8429/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 602 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 40%|ββββ | 604/1500 [36:39<55:32, 3.72s/it]Epoch 603 | Step 8443/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 603 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 40%|ββββ | 605/1500 [36:42<54:27, 3.65s/it]Epoch 604 | Step 8457/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 604 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 40%|ββββ | 606/1500 [36:46<54:09, 3.63s/it]Epoch 605 | Step 8471/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 605 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 40%|ββββ | 607/1500 [36:50<54:46, 3.68s/it]Epoch 606 | Step 8485/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 606 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.8s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 608/1500 [36:53<54:07, 3.64s/it]Epoch 607 | Step 8499/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 607 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 609/1500 [36:57<53:20, 3.59s/it]Epoch 608 | Step 8513/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 608 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 610/1500 [37:00<53:27, 3.60s/it]Epoch 609 | Step 8527/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 609 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 611/1500 [37:04<53:05, 3.58s/it]Epoch 610 | Step 8541/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 610 | Avg Loss: nan | LR: 1.43e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 612/1500 [37:08<56:38, 3.83s/it]Epoch 611 | Step 8555/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 611 | Avg Loss: nan | LR: 1.43e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 613/1500 [37:13<59:28, 4.02s/it]Epoch 612 | Step 8569/ 21000 | Loss: nan | LR: 1.43e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 612 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 614/1500 [37:17<1:01:01, 4.13s/it]Epoch 613 | Step 8583/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 613 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 615/1500 [37:22<1:02:24, 4.23s/it]Epoch 614 | Step 8597/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 614 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 616/1500 [37:26<1:02:57, 4.27s/it]Epoch 615 | Step 8611/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 615 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 617/1500 [37:30<1:03:08, 4.29s/it]Epoch 616 | Step 8625/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 616 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 41%|ββββ | 618/1500 [37:35<1:03:58, 4.35s/it]Epoch 617 | Step 8639/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 617 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 41%|βββββ | 619/1500 [37:39<1:04:24, 4.39s/it]Epoch 618 | Step 8653/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 618 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 41%|βββββ | 620/1500 [37:44<1:05:15, 4.45s/it]Epoch 619 | Step 8667/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 619 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 41%|βββββ | 621/1500 [37:48<1:05:11, 4.45s/it]Epoch 620 | Step 8681/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 620 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 41%|βββββ | 622/1500 [37:53<1:05:25, 4.47s/it]Epoch 621 | Step 8695/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 621 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 623/1500 [37:57<1:04:56, 4.44s/it]Epoch 622 | Step 8709/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 622 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 624/1500 [38:02<1:04:40, 4.43s/it]Epoch 623 | Step 8723/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 623 | Avg Loss: nan | LR: 1.42e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 625/1500 [38:06<1:04:19, 4.41s/it]Epoch 624 | Step 8737/ 21000 | Loss: nan | LR: 1.42e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 624 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 626/1500 [38:10<1:04:37, 4.44s/it]Epoch 625 | Step 8751/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 625 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 627/1500 [38:15<1:05:12, 4.48s/it]Epoch 626 | Step 8765/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 626 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 628/1500 [38:19<1:04:21, 4.43s/it]Epoch 627 | Step 8779/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 627 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 629/1500 [38:24<1:04:31, 4.45s/it]Epoch 628 | Step 8793/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 628 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 630/1500 [38:28<1:04:52, 4.47s/it]Epoch 629 | Step 8807/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 629 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 631/1500 [38:33<1:04:59, 4.49s/it]Epoch 630 | Step 8821/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 630 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 632/1500 [38:37<1:05:19, 4.52s/it]Epoch 631 | Step 8835/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 631 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 633/1500 [38:42<1:04:23, 4.46s/it]Epoch 632 | Step 8849/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 632 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 634/1500 [38:46<1:04:06, 4.44s/it]Epoch 633 | Step 8863/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 633 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 635/1500 [38:50<1:03:14, 4.39s/it]Epoch 634 | Step 8877/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 634 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 636/1500 [38:55<1:03:29, 4.41s/it]Epoch 635 | Step 8891/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 635 | Avg Loss: nan | LR: 1.41e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 42%|βββββ | 637/1500 [39:00<1:04:20, 4.47s/it]Epoch 636 | Step 8905/ 21000 | Loss: nan | LR: 1.41e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 636 | Avg Loss: nan | LR: 1.40e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 43%|βββββ | 638/1500 [39:04<1:03:32, 4.42s/it]Epoch 637 | Step 8919/ 21000 | Loss: nan | LR: 1.40e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 637 | Avg Loss: nan | LR: 1.40e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 43%|βββββ | 639/1500 [39:07<59:44, 4.16s/it] Epoch 638 | Step 8933/ 21000 | Loss: nan | LR: 1.40e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 638 | Avg Loss: nan | LR: 1.40e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 43%|βββββ | 640/1500 [39:11<56:40, 3.95s/it]Epoch 639 | Step 8947/ 21000 | Loss: nan | LR: 1.40e-03 | Speed: 3.8 steps/s | ETA: 0.9h |
| Epoch 639 | Avg Loss: nan | LR: 1.40e-03 | Time: 3.5s | Samples: 6,983 |
|
[... epochs 640–713 elided: every epoch reports Loss: nan and Avg Loss: nan; LR decays 1.40e-03 → 1.34e-03; steady ~3.8 steps/s, 3.3–5.7 s/epoch, 6,983 samples/epoch ...]
| /data2/edwardsun/flow_home/cfg_dataset.py:360: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). |
| 'index': torch.tensor(idx, dtype=torch.long) |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:392: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead. |
| with autocast(dtype=torch.bfloat16): |
|
Training Flow Model: 48%|█████ | 715/1500 [43:59<50:45, 3.88s/it]Epoch 714 | Step 9997/ 21000 | Loss: nan | LR: 1.34e-03 | Speed: 3.8 steps/s | ETA: 0.8h |
| Validation at step 10000: Loss = nan |
| Epoch 714 | Avg Loss: nan | LR: 1.34e-03 | Time: 4.6s | Samples: 6,983 |
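The training loss has been nan since before this window opened and the step-10000 validation loss is nan as well, so the remaining ~0.8 h of compute cannot recover the run. A defensive pattern worth adding (a sketch with a hypothetical helper, not code from the training script) is to refuse to step on a non-finite loss:

```python
import torch

def guarded_step(loss: torch.Tensor, optimizer: torch.optim.Optimizer) -> bool:
    """Backpropagate and step only when the loss is finite.

    Dropping a bad batch keeps a single non-finite step from poisoning
    the weights, after which every subsequent loss is nan (as in this
    log). Returns True if an optimizer step was taken."""
    if not torch.isfinite(loss):
        optimizer.zero_grad(set_to_none=True)
        return False
    loss.backward()
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)
    return True
```

In an AMP loop the same check belongs before `scaler.step(...)`; counting consecutive skipped batches and aborting past a threshold turns this silent nan grind into an early, loud failure.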
|
[... epochs 715–790 elided: every epoch reports Loss: nan and Avg Loss: nan; LR decays 1.34e-03 → 1.26e-03; steady ~3.8 steps/s, ~3.5 s/epoch, 6,983 samples/epoch ...]
|
Training Flow Model: 53%|ββββββ | 792/1500 [48:28<40:56, 3.47s/it]Epoch 791 | Step 11075/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 791 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 793/1500 [48:31<41:16, 3.50s/it]Epoch 792 | Step 11089/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 792 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 794/1500 [48:35<41:24, 3.52s/it]Epoch 793 | Step 11103/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 793 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 795/1500 [48:38<40:54, 3.48s/it]Epoch 794 | Step 11117/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 794 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 796/1500 [48:42<40:41, 3.47s/it]Epoch 795 | Step 11131/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 795 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 797/1500 [48:45<40:34, 3.46s/it]Epoch 796 | Step 11145/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 796 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 798/1500 [48:49<40:31, 3.46s/it]Epoch 797 | Step 11159/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 797 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 799/1500 [48:52<40:18, 3.45s/it]Epoch 798 | Step 11173/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 798 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 800/1500 [48:56<40:15, 3.45s/it]Epoch 799 | Step 11187/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 799 | Avg Loss: nan | LR: 1.26e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 801/1500 [48:59<40:18, 3.46s/it]Epoch 800 | Step 11201/ 21000 | Loss: nan | LR: 1.26e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 800 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 53%|ββββββ | 802/1500 [49:03<40:39, 3.49s/it]Epoch 801 | Step 11215/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 801 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 803/1500 [49:06<40:54, 3.52s/it]Epoch 802 | Step 11229/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 802 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 804/1500 [49:10<40:41, 3.51s/it]Epoch 803 | Step 11243/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 803 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 805/1500 [49:13<40:09, 3.47s/it]Epoch 804 | Step 11257/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 804 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 806/1500 [49:17<39:55, 3.45s/it]Epoch 805 | Step 11271/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 805 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 807/1500 [49:20<40:08, 3.48s/it]Epoch 806 | Step 11285/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 806 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 808/1500 [49:24<40:31, 3.51s/it]Epoch 807 | Step 11299/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 807 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 809/1500 [49:27<40:27, 3.51s/it]Epoch 808 | Step 11313/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 808 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 810/1500 [49:31<40:17, 3.50s/it]Epoch 809 | Step 11327/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 809 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 811/1500 [49:34<40:21, 3.51s/it]Epoch 810 | Step 11341/ 21000 | Loss: nan | LR: 1.25e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 810 | Avg Loss: nan | LR: 1.25e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 812/1500 [49:38<40:05, 3.50s/it]Epoch 811 | Step 11355/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 811 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 813/1500 [49:41<40:01, 3.50s/it]Epoch 812 | Step 11369/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 812 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 814/1500 [49:45<40:14, 3.52s/it]Epoch 813 | Step 11383/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 813 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 815/1500 [49:48<39:54, 3.50s/it]Epoch 814 | Step 11397/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 814 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 816/1500 [49:52<39:49, 3.49s/it]Epoch 815 | Step 11411/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 815 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 54%|ββββββ | 817/1500 [49:55<39:48, 3.50s/it]Epoch 816 | Step 11425/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 816 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 818/1500 [49:59<39:31, 3.48s/it]Epoch 817 | Step 11439/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 817 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 819/1500 [50:02<39:26, 3.47s/it]Epoch 818 | Step 11453/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 818 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 820/1500 [50:06<39:09, 3.46s/it]Epoch 819 | Step 11467/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 819 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 821/1500 [50:09<39:37, 3.50s/it]Epoch 820 | Step 11481/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 820 | Avg Loss: nan | LR: 1.24e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 822/1500 [50:13<40:34, 3.59s/it]Epoch 821 | Step 11495/ 21000 | Loss: nan | LR: 1.24e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 821 | Avg Loss: nan | LR: 1.23e-03 | Time: 3.8s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 823/1500 [50:17<40:48, 3.62s/it]Epoch 822 | Step 11509/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 822 | Avg Loss: nan | LR: 1.23e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 824/1500 [50:20<40:39, 3.61s/it]Epoch 823 | Step 11523/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 823 | Avg Loss: nan | LR: 1.23e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 825/1500 [50:24<40:49, 3.63s/it]Epoch 824 | Step 11537/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 824 | Avg Loss: nan | LR: 1.23e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 826/1500 [50:27<40:44, 3.63s/it]Epoch 825 | Step 11551/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 825 | Avg Loss: nan | LR: 1.23e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 827/1500 [50:32<43:15, 3.86s/it]Epoch 826 | Step 11565/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 826 | Avg Loss: nan | LR: 1.23e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 828/1500 [50:36<44:32, 3.98s/it]Epoch 827 | Step 11579/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 827 | Avg Loss: nan | LR: 1.23e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 829/1500 [50:41<46:05, 4.12s/it]Epoch 828 | Step 11593/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 828 | Avg Loss: nan | LR: 1.23e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 830/1500 [50:45<47:57, 4.30s/it]Epoch 829 | Step 11607/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 829 | Avg Loss: nan | LR: 1.23e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 831/1500 [50:50<48:32, 4.35s/it]Epoch 830 | Step 11621/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 830 | Avg Loss: nan | LR: 1.23e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 55%|ββββββ | 832/1500 [50:54<48:58, 4.40s/it]Epoch 831 | Step 11635/ 21000 | Loss: nan | LR: 1.23e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 831 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 833/1500 [50:59<48:43, 4.38s/it]Epoch 832 | Step 11649/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 832 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 834/1500 [51:03<48:56, 4.41s/it]Epoch 833 | Step 11663/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 833 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 835/1500 [51:08<49:00, 4.42s/it]Epoch 834 | Step 11677/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 834 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 836/1500 [51:12<48:59, 4.43s/it]Epoch 835 | Step 11691/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 835 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 837/1500 [51:16<48:52, 4.42s/it]Epoch 836 | Step 11705/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 836 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 838/1500 [51:21<48:42, 4.41s/it]Epoch 837 | Step 11719/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 837 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 839/1500 [51:25<49:10, 4.46s/it]Epoch 838 | Step 11733/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 838 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 840/1500 [51:30<49:09, 4.47s/it]Epoch 839 | Step 11747/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 839 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 841/1500 [51:34<48:56, 4.46s/it]Epoch 840 | Step 11761/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 840 | Avg Loss: nan | LR: 1.22e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 842/1500 [51:39<48:37, 4.43s/it]Epoch 841 | Step 11775/ 21000 | Loss: nan | LR: 1.22e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 841 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 843/1500 [51:43<48:34, 4.44s/it]Epoch 842 | Step 11789/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 842 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 844/1500 [51:48<48:43, 4.46s/it]Epoch 843 | Step 11803/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 843 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 845/1500 [51:52<48:12, 4.42s/it]Epoch 844 | Step 11817/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 844 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 846/1500 [51:57<49:04, 4.50s/it]Epoch 845 | Step 11831/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 845 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 56%|ββββββ | 847/1500 [52:01<49:17, 4.53s/it]Epoch 846 | Step 11845/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 846 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 848/1500 [52:06<48:34, 4.47s/it]Epoch 847 | Step 11859/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 847 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 849/1500 [52:10<48:06, 4.43s/it]Epoch 848 | Step 11873/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 848 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 850/1500 [52:14<48:16, 4.46s/it]Epoch 849 | Step 11887/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 849 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 851/1500 [52:19<48:57, 4.53s/it]Epoch 850 | Step 11901/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 850 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 852/1500 [52:24<48:45, 4.51s/it]Epoch 851 | Step 11915/ 21000 | Loss: nan | LR: 1.21e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 851 | Avg Loss: nan | LR: 1.21e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 853/1500 [52:28<48:51, 4.53s/it]Epoch 852 | Step 11929/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 852 | Avg Loss: nan | LR: 1.20e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 854/1500 [52:33<48:36, 4.51s/it]Epoch 853 | Step 11943/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 853 | Avg Loss: nan | LR: 1.20e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 855/1500 [52:36<45:21, 4.22s/it]Epoch 854 | Step 11957/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 854 | Avg Loss: nan | LR: 1.20e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 856/1500 [52:40<43:19, 4.04s/it]Epoch 855 | Step 11971/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 855 | Avg Loss: nan | LR: 1.20e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 857/1500 [52:43<41:32, 3.88s/it]Epoch 856 | Step 11985/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 856 | Avg Loss: nan | LR: 1.20e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 858/1500 [52:47<40:13, 3.76s/it]Epoch 857 | Step 11999/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 857 | Avg Loss: nan | LR: 1.20e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 859/1500 [52:50<39:33, 3.70s/it]Epoch 858 | Step 12013/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 858 | Avg Loss: nan | LR: 1.20e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 860/1500 [52:55<43:40, 4.09s/it]Epoch 859 | Step 12027/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 859 | Avg Loss: nan | LR: 1.20e-03 | Time: 5.0s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 861/1500 [53:00<44:44, 4.20s/it]Epoch 860 | Step 12041/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 860 | Avg Loss: nan | LR: 1.20e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 57%|ββββββ | 862/1500 [53:04<45:38, 4.29s/it]Epoch 861 | Step 12055/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 861 | Avg Loss: nan | LR: 1.20e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 863/1500 [53:09<47:23, 4.46s/it]Epoch 862 | Step 12069/ 21000 | Loss: nan | LR: 1.20e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 862 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.9s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 864/1500 [53:14<48:15, 4.55s/it]Epoch 863 | Step 12083/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 863 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 865/1500 [53:18<47:59, 4.53s/it]Epoch 864 | Step 12097/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 864 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 866/1500 [53:23<48:28, 4.59s/it]Epoch 865 | Step 12111/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 865 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 867/1500 [53:28<49:16, 4.67s/it]Epoch 866 | Step 12125/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 866 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.9s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 868/1500 [53:33<49:21, 4.69s/it]Epoch 867 | Step 12139/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 867 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 869/1500 [53:37<48:20, 4.60s/it]Epoch 868 | Step 12153/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.7h |
| Epoch 868 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 870/1500 [53:42<48:57, 4.66s/it]Epoch 869 | Step 12167/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 869 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 871/1500 [53:47<49:21, 4.71s/it]Epoch 870 | Step 12181/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 870 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 872/1500 [53:51<48:35, 4.64s/it]Epoch 871 | Step 12195/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 871 | Avg Loss: nan | LR: 1.19e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 873/1500 [53:56<48:09, 4.61s/it]Epoch 872 | Step 12209/ 21000 | Loss: nan | LR: 1.19e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 872 | Avg Loss: nan | LR: 1.18e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 874/1500 [54:01<49:19, 4.73s/it]Epoch 873 | Step 12223/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 873 | Avg Loss: nan | LR: 1.18e-03 | Time: 5.0s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 875/1500 [54:05<49:09, 4.72s/it]Epoch 874 | Step 12237/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 874 | Avg Loss: nan | LR: 1.18e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 876/1500 [54:10<47:47, 4.59s/it]Epoch 875 | Step 12251/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 875 | Avg Loss: nan | LR: 1.18e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 58%|ββββββ | 877/1500 [54:15<48:14, 4.65s/it]Epoch 876 | Step 12265/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 876 | Avg Loss: nan | LR: 1.18e-03 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 878/1500 [54:19<49:09, 4.74s/it]Epoch 877 | Step 12279/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 877 | Avg Loss: nan | LR: 1.18e-03 | Time: 5.0s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 879/1500 [54:24<48:59, 4.73s/it]Epoch 878 | Step 12293/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 878 | Avg Loss: nan | LR: 1.18e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 880/1500 [54:29<48:19, 4.68s/it]Epoch 879 | Step 12307/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 879 | Avg Loss: nan | LR: 1.18e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 881/1500 [54:34<48:39, 4.72s/it]Epoch 880 | Step 12321/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 880 | Avg Loss: nan | LR: 1.18e-03 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 882/1500 [54:37<44:44, 4.34s/it]Epoch 881 | Step 12335/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 881 | Avg Loss: nan | LR: 1.18e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 883/1500 [54:41<43:03, 4.19s/it]Epoch 882 | Step 12349/ 21000 | Loss: nan | LR: 1.18e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 882 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.8s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 884/1500 [54:44<41:00, 3.99s/it]Epoch 883 | Step 12363/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 883 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 885/1500 [54:48<39:36, 3.86s/it]Epoch 884 | Step 12377/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 884 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 886/1500 [54:52<38:45, 3.79s/it]Epoch 885 | Step 12391/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 885 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 887/1500 [54:55<38:04, 3.73s/it]Epoch 886 | Step 12405/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 886 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 888/1500 [54:59<37:47, 3.70s/it]Epoch 887 | Step 12419/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 887 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 889/1500 [55:02<36:46, 3.61s/it]Epoch 888 | Step 12433/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 888 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 890/1500 [55:06<36:04, 3.55s/it]Epoch 889 | Step 12447/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 889 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 891/1500 [55:09<35:36, 3.51s/it]Epoch 890 | Step 12461/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 890 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 59%|ββββββ | 892/1500 [55:13<36:10, 3.57s/it]Epoch 891 | Step 12475/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 891 | Avg Loss: nan | LR: 1.17e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 60%|ββββββ | 893/1500 [55:16<36:24, 3.60s/it]Epoch 892 | Step 12489/ 21000 | Loss: nan | LR: 1.17e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 892 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 60%|ββββββ | 894/1500 [55:20<36:03, 3.57s/it]Epoch 893 | Step 12503/ 21000 | Loss: nan | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 893 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 60%|ββββββ | 895/1500 [55:23<35:29, 3.52s/it]Epoch 894 | Step 12517/ 21000 | Loss: nan | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 894 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 60%|ββββββ | 896/1500 [55:27<35:09, 3.49s/it]Epoch 895 | Step 12531/ 21000 | Loss: nan | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 895 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 60%|ββββββ | 897/1500 [55:30<34:58, 3.48s/it]Epoch 896 | Step 12545/ 21000 | Loss: nan | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 896 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 60%|ββββββ | 898/1500 [55:34<34:34, 3.45s/it]Epoch 897 | Step 12559/ 21000 | Loss: nan | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 897 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 60%|ββββββ | 899/1500 [55:37<34:28, 3.44s/it]Epoch 898 | Step 12573/ 21000 | Loss: nan | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 898 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model:  60%|██████    | 900/1500 [55:44<44:57, 4.50s/it] Epoch 899 | Step 12587/21000 | Loss: nan | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 0.6h
Epoch 899 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.5s | Samples: 6,983
✓ Checkpoint saved: /data2/edwardsun/flow_checkpoints/amp_flow_model_final_optimized.pth (loss: nan, step: 12600)

Training Flow Model:  60%|██████    | 901/1500 [55:47<41:42, 4.18s/it] Epoch 900 | Step 12601/21000 | Loss: nan | LR: 1.16e-03 | Speed: 3.8 steps/s | ETA: 0.6h
Epoch 900 | Avg Loss: nan | LR: 1.16e-03 | Time: 3.4s | Samples: 6,983
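Note that the loss has been nan for well over a hundred consecutive epochs here, and the step: 12600 checkpoint above was saved with a nan loss anyway. Under AMP, `GradScaler.step` silently skips the optimizer update when gradients overflow, so the loop keeps spinning without making progress. A minimal divergence guard (a pure-Python sketch with hypothetical names, not the training script's actual code) would abort the run once the loss stays non-finite instead of burning the remaining epochs:

```python
import math

def run_epochs(losses, patience=3):
    """Stop after `patience` consecutive non-finite epoch losses.

    `losses` is an iterable of per-epoch loss values; returns the
    number of epochs actually executed before aborting (or finishing).
    """
    bad = 0        # consecutive non-finite epochs seen so far
    executed = 0
    for loss in losses:
        executed += 1
        if not math.isfinite(loss):
            bad += 1
            if bad >= patience:
                break  # diverged: abort instead of looping on nan
        else:
            bad = 0    # any finite loss resets the counter
    return executed

# A run that goes nan at epoch 5 stops at epoch 7, not epoch 1500.
print(run_epochs([0.9, 0.5, 0.3, 0.2, float("nan")] + [float("nan")] * 1495))
```

The same finiteness check gates checkpointing naturally: save only when the epoch loss is finite, which would have prevented overwriting `amp_flow_model_final_optimized.pth` with a nan-loss snapshot.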
[... epochs 901–926 elided: Loss and Avg Loss remain nan; LR decays from 1.16e-03 to 1.13e-03; ~3.5 s and 6,983 samples per epoch ...]

Training Flow Model:  62%|███████   | 928/1500 [57:22<33:28, 3.51s/it] Epoch 927 | Step 12979/21000 | Loss: nan | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 0.6h
| Epoch 927 | Avg Loss: nan | LR: 1.13e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 929/1500 [57:26<33:12, 3.49s/it]Epoch 928 | Step 12993/ 21000 | Loss: nan | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 928 | Avg Loss: nan | LR: 1.13e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 930/1500 [57:29<32:54, 3.46s/it]Epoch 929 | Step 13007/ 21000 | Loss: nan | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 929 | Avg Loss: nan | LR: 1.13e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 931/1500 [57:33<33:19, 3.51s/it]Epoch 930 | Step 13021/ 21000 | Loss: nan | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 930 | Avg Loss: nan | LR: 1.13e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 932/1500 [57:36<33:30, 3.54s/it]Epoch 931 | Step 13035/ 21000 | Loss: nan | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 931 | Avg Loss: nan | LR: 1.13e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 933/1500 [57:40<33:19, 3.53s/it]Epoch 932 | Step 13049/ 21000 | Loss: nan | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 932 | Avg Loss: nan | LR: 1.13e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 934/1500 [57:43<33:27, 3.55s/it]Epoch 933 | Step 13063/ 21000 | Loss: nan | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 933 | Avg Loss: nan | LR: 1.13e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 935/1500 [57:47<33:27, 3.55s/it]Epoch 934 | Step 13077/ 21000 | Loss: nan | LR: 1.13e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 934 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 936/1500 [57:51<33:19, 3.54s/it]Epoch 935 | Step 13091/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 935 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 62%|βββββββ | 937/1500 [57:54<33:20, 3.55s/it]Epoch 936 | Step 13105/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 936 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 938/1500 [57:58<33:03, 3.53s/it]Epoch 937 | Step 13119/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 937 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 939/1500 [58:01<33:00, 3.53s/it]Epoch 938 | Step 13133/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 938 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 940/1500 [58:05<32:54, 3.53s/it]Epoch 939 | Step 13147/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 939 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 941/1500 [58:08<33:20, 3.58s/it]Epoch 940 | Step 13161/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 940 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 942/1500 [58:12<33:03, 3.56s/it]Epoch 941 | Step 13175/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 941 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 943/1500 [58:15<32:54, 3.55s/it]Epoch 942 | Step 13189/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 942 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 944/1500 [58:19<32:45, 3.54s/it]Epoch 943 | Step 13203/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 943 | Avg Loss: nan | LR: 1.12e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 945/1500 [58:23<32:59, 3.57s/it]Epoch 944 | Step 13217/ 21000 | Loss: nan | LR: 1.12e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 944 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 946/1500 [58:26<33:02, 3.58s/it]Epoch 945 | Step 13231/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 945 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 947/1500 [58:30<32:48, 3.56s/it]Epoch 946 | Step 13245/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 946 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 948/1500 [58:33<33:01, 3.59s/it]Epoch 947 | Step 13259/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 947 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 949/1500 [58:37<33:01, 3.60s/it]Epoch 948 | Step 13273/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 948 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 950/1500 [58:40<32:38, 3.56s/it]Epoch 949 | Step 13287/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 949 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 951/1500 [58:44<32:43, 3.58s/it]Epoch 950 | Step 13301/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 950 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 63%|βββββββ | 952/1500 [58:48<32:28, 3.56s/it]Epoch 951 | Step 13315/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 951 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 953/1500 [58:51<32:11, 3.53s/it]Epoch 952 | Step 13329/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 952 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 954/1500 [58:55<32:04, 3.52s/it]Epoch 953 | Step 13343/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 953 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 955/1500 [58:58<32:10, 3.54s/it]Epoch 954 | Step 13357/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 954 | Avg Loss: nan | LR: 1.11e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 956/1500 [59:02<31:41, 3.49s/it]Epoch 955 | Step 13371/ 21000 | Loss: nan | LR: 1.11e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 955 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 957/1500 [59:05<31:48, 3.51s/it]Epoch 956 | Step 13385/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 956 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 958/1500 [59:08<31:30, 3.49s/it]Epoch 957 | Step 13399/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 957 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 959/1500 [59:12<31:47, 3.53s/it]Epoch 958 | Step 13413/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 958 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 960/1500 [59:16<31:56, 3.55s/it]Epoch 959 | Step 13427/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 959 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 961/1500 [59:19<31:36, 3.52s/it]Epoch 960 | Step 13441/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 960 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 962/1500 [59:23<31:28, 3.51s/it]Epoch 961 | Step 13455/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 961 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 963/1500 [59:26<31:16, 3.49s/it]Epoch 962 | Step 13469/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 962 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 964/1500 [59:29<30:56, 3.46s/it]Epoch 963 | Step 13483/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 963 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 965/1500 [59:33<30:56, 3.47s/it]Epoch 964 | Step 13497/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 964 | Avg Loss: nan | LR: 1.10e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 966/1500 [59:36<30:58, 3.48s/it]Epoch 965 | Step 13511/ 21000 | Loss: nan | LR: 1.10e-03 | Speed: 3.8 steps/s | ETA: 0.6h |
| Epoch 965 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 64%|βββββββ | 967/1500 [59:40<31:06, 3.50s/it]Epoch 966 | Step 13525/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 966 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 968/1500 [59:44<31:12, 3.52s/it]Epoch 967 | Step 13539/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 967 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 969/1500 [59:47<31:27, 3.55s/it]Epoch 968 | Step 13553/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 968 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 970/1500 [59:51<31:15, 3.54s/it]Epoch 969 | Step 13567/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 969 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 971/1500 [59:54<30:59, 3.51s/it]Epoch 970 | Step 13581/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 970 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 972/1500 [59:58<30:55, 3.51s/it]Epoch 971 | Step 13595/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 971 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 973/1500 [1:00:01<30:49, 3.51s/it]Epoch 972 | Step 13609/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 972 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 974/1500 [1:00:05<30:56, 3.53s/it]Epoch 973 | Step 13623/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 973 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 975/1500 [1:00:08<30:43, 3.51s/it]Epoch 974 | Step 13637/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 974 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 976/1500 [1:00:12<30:21, 3.48s/it]Epoch 975 | Step 13651/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 975 | Avg Loss: nan | LR: 1.09e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 977/1500 [1:00:15<30:23, 3.49s/it]Epoch 976 | Step 13665/ 21000 | Loss: nan | LR: 1.09e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 976 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 978/1500 [1:00:19<30:20, 3.49s/it]Epoch 977 | Step 13679/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 977 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 979/1500 [1:00:22<30:23, 3.50s/it]Epoch 978 | Step 13693/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 978 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 980/1500 [1:00:26<29:57, 3.46s/it]Epoch 979 | Step 13707/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 979 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 981/1500 [1:00:29<29:57, 3.46s/it]Epoch 980 | Step 13721/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 980 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 65%|βββββββ | 982/1500 [1:00:32<29:48, 3.45s/it]Epoch 981 | Step 13735/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 981 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 983/1500 [1:00:36<29:50, 3.46s/it]Epoch 982 | Step 13749/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 982 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 984/1500 [1:00:39<29:48, 3.47s/it]Epoch 983 | Step 13763/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 983 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 985/1500 [1:00:43<29:34, 3.45s/it]Epoch 984 | Step 13777/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 984 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 986/1500 [1:00:46<29:56, 3.50s/it]Epoch 985 | Step 13791/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 985 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 987/1500 [1:00:50<29:36, 3.46s/it]Epoch 986 | Step 13805/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 986 | Avg Loss: nan | LR: 1.08e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 988/1500 [1:00:53<29:30, 3.46s/it]Epoch 987 | Step 13819/ 21000 | Loss: nan | LR: 1.08e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 987 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 989/1500 [1:00:57<29:29, 3.46s/it]Epoch 988 | Step 13833/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 988 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 990/1500 [1:01:00<29:38, 3.49s/it]Epoch 989 | Step 13847/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 989 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 991/1500 [1:01:04<29:31, 3.48s/it]Epoch 990 | Step 13861/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 990 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 992/1500 [1:01:07<29:19, 3.46s/it]Epoch 991 | Step 13875/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 991 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 993/1500 [1:01:11<29:23, 3.48s/it]Epoch 992 | Step 13889/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 992 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 994/1500 [1:01:14<29:46, 3.53s/it]Epoch 993 | Step 13903/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 993 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 995/1500 [1:01:18<29:28, 3.50s/it]Epoch 994 | Step 13917/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 994 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 996/1500 [1:01:21<29:23, 3.50s/it]Epoch 995 | Step 13931/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 995 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 66%|βββββββ | 997/1500 [1:01:25<28:51, 3.44s/it]Epoch 996 | Step 13945/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 996 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.3s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 998/1500 [1:01:28<29:03, 3.47s/it]Epoch 997 | Step 13959/ 21000 | Loss: nan | LR: 1.07e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 997 | Avg Loss: nan | LR: 1.07e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 999/1500 [1:01:32<29:43, 3.56s/it]Epoch 998 | Step 13973/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 998 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.8s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1000/1500 [1:01:35<29:46, 3.57s/it]Epoch 999 | Step 13987/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 999 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1001/1500 [1:01:39<29:39, 3.57s/it]Epoch 1000 | Step 14001/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1000 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1002/1500 [1:01:43<29:32, 3.56s/it]Epoch 1001 | Step 14015/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1001 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1003/1500 [1:01:46<29:09, 3.52s/it]Epoch 1002 | Step 14029/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1002 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1004/1500 [1:01:50<29:23, 3.56s/it]Epoch 1003 | Step 14043/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1003 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1005/1500 [1:01:53<29:00, 3.52s/it]Epoch 1004 | Step 14057/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1004 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1006/1500 [1:01:57<28:57, 3.52s/it]Epoch 1005 | Step 14071/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1005 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1007/1500 [1:02:00<28:56, 3.52s/it]Epoch 1006 | Step 14085/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1006 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1008/1500 [1:02:04<29:06, 3.55s/it]Epoch 1007 | Step 14099/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1007 | Avg Loss: nan | LR: 1.06e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1009/1500 [1:02:07<29:05, 3.55s/it]Epoch 1008 | Step 14113/ 21000 | Loss: nan | LR: 1.06e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1008 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1010/1500 [1:02:11<28:49, 3.53s/it]Epoch 1009 | Step 14127/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1009 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1011/1500 [1:02:14<28:29, 3.50s/it]Epoch 1010 | Step 14141/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1010 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 67%|βββββββ | 1012/1500 [1:02:18<28:27, 3.50s/it]Epoch 1011 | Step 14155/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1011 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1013/1500 [1:02:21<28:32, 3.52s/it]Epoch 1012 | Step 14169/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1012 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1014/1500 [1:02:25<28:24, 3.51s/it]Epoch 1013 | Step 14183/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1013 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1015/1500 [1:02:28<28:30, 3.53s/it]Epoch 1014 | Step 14197/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1014 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1016/1500 [1:02:32<28:04, 3.48s/it]Epoch 1015 | Step 14211/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1015 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1017/1500 [1:02:35<27:57, 3.47s/it]Epoch 1016 | Step 14225/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1016 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1018/1500 [1:02:39<28:09, 3.51s/it]Epoch 1017 | Step 14239/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1017 | Avg Loss: nan | LR: 1.05e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1019/1500 [1:02:43<29:36, 3.69s/it]Epoch 1018 | Step 14253/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1018 | Avg Loss: nan | LR: 1.05e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1020/1500 [1:02:46<29:02, 3.63s/it]Epoch 1019 | Step 14267/ 21000 | Loss: nan | LR: 1.05e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1019 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1021/1500 [1:02:50<28:23, 3.56s/it]Epoch 1020 | Step 14281/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1020 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1022/1500 [1:02:54<30:30, 3.83s/it]Epoch 1021 | Step 14295/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1021 | Avg Loss: nan | LR: 1.04e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1023/1500 [1:02:58<29:19, 3.69s/it]Epoch 1022 | Step 14309/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1022 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1024/1500 [1:03:01<29:03, 3.66s/it]Epoch 1023 | Step 14323/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1023 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1025/1500 [1:03:05<28:23, 3.59s/it]Epoch 1024 | Step 14337/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1024 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1026/1500 [1:03:08<28:05, 3.56s/it]Epoch 1025 | Step 14351/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1025 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 68%|βββββββ | 1027/1500 [1:03:11<27:42, 3.51s/it]Epoch 1026 | Step 14365/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1026 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.4s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1028/1500 [1:03:15<27:36, 3.51s/it]Epoch 1027 | Step 14379/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1027 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1029/1500 [1:03:19<27:51, 3.55s/it]Epoch 1028 | Step 14393/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1028 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1030/1500 [1:03:22<27:42, 3.54s/it]Epoch 1029 | Step 14407/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1029 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1031/1500 [1:03:26<27:39, 3.54s/it]Epoch 1030 | Step 14421/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1030 | Avg Loss: nan | LR: 1.04e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1032/1500 [1:03:29<27:33, 3.53s/it]Epoch 1031 | Step 14435/ 21000 | Loss: nan | LR: 1.04e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1031 | Avg Loss: nan | LR: 1.03e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1033/1500 [1:03:33<27:45, 3.57s/it]Epoch 1032 | Step 14449/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1032 | Avg Loss: nan | LR: 1.03e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1034/1500 [1:03:36<27:48, 3.58s/it]Epoch 1033 | Step 14463/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1033 | Avg Loss: nan | LR: 1.03e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1035/1500 [1:03:40<27:57, 3.61s/it]Epoch 1034 | Step 14477/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1034 | Avg Loss: nan | LR: 1.03e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1036/1500 [1:03:44<27:58, 3.62s/it]Epoch 1035 | Step 14491/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1035 | Avg Loss: nan | LR: 1.03e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1037/1500 [1:03:48<29:51, 3.87s/it]Epoch 1036 | Step 14505/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1036 | Avg Loss: nan | LR: 1.03e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1038/1500 [1:03:53<31:06, 4.04s/it]Epoch 1037 | Step 14519/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1037 | Avg Loss: nan | LR: 1.03e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1039/1500 [1:03:57<31:53, 4.15s/it]Epoch 1038 | Step 14533/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1038 | Avg Loss: nan | LR: 1.03e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1040/1500 [1:04:01<32:11, 4.20s/it]Epoch 1039 | Step 14547/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1039 | Avg Loss: nan | LR: 1.03e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1041/1500 [1:04:06<32:52, 4.30s/it]Epoch 1040 | Step 14561/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1040 | Avg Loss: nan | LR: 1.03e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 69%|βββββββ | 1042/1500 [1:04:11<33:42, 4.42s/it]Epoch 1041 | Step 14575/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1041 | Avg Loss: nan | LR: 1.03e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 70%|βββββββ | 1043/1500 [1:04:15<33:30, 4.40s/it]Epoch 1042 | Step 14589/ 21000 | Loss: nan | LR: 1.03e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1042 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 70%|βββββββ | 1044/1500 [1:04:20<34:02, 4.48s/it]Epoch 1043 | Step 14603/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1043 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 70%|βββββββ | 1045/1500 [1:04:24<33:56, 4.48s/it]Epoch 1044 | Step 14617/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1044 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1046/1500 [1:04:29<34:00, 4.50s/it]Epoch 1045 | Step 14631/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1045 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1047/1500 [1:04:33<34:09, 4.52s/it]Epoch 1046 | Step 14645/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1046 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1048/1500 [1:04:38<33:50, 4.49s/it]Epoch 1047 | Step 14659/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1047 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1049/1500 [1:04:42<33:28, 4.45s/it]Epoch 1048 | Step 14673/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1048 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1050/1500 [1:04:46<33:15, 4.43s/it]Epoch 1049 | Step 14687/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1049 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1051/1500 [1:04:51<33:33, 4.49s/it]Epoch 1050 | Step 14701/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1050 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1052/1500 [1:04:55<33:30, 4.49s/it]Epoch 1051 | Step 14715/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1051 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1053/1500 [1:05:00<33:42, 4.52s/it]Epoch 1052 | Step 14729/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1052 | Avg Loss: nan | LR: 1.02e-03 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1054/1500 [1:05:05<33:29, 4.51s/it]Epoch 1053 | Step 14743/ 21000 | Loss: nan | LR: 1.02e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1053 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1055/1500 [1:05:09<33:17, 4.49s/it]Epoch 1054 | Step 14757/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1054 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1056/1500 [1:05:13<32:49, 4.43s/it]Epoch 1055 | Step 14771/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1055 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 70%|███████ | 1057/1500 [1:05:18<32:45, 4.44s/it]Epoch 1056 | Step 14785/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1056 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1058/1500 [1:05:22<33:12, 4.51s/it]Epoch 1057 | Step 14799/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1057 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.7s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1059/1500 [1:05:27<32:47, 4.46s/it]Epoch 1058 | Step 14813/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1058 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1060/1500 [1:05:32<33:21, 4.55s/it]Epoch 1059 | Step 14827/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1059 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1061/1500 [1:05:36<33:01, 4.51s/it]Epoch 1060 | Step 14841/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1060 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1062/1500 [1:05:40<32:36, 4.47s/it]Epoch 1061 | Step 14855/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1061 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.4s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1063/1500 [1:05:45<32:11, 4.42s/it]Epoch 1062 | Step 14869/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1062 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1064/1500 [1:05:49<31:55, 4.39s/it]Epoch 1063 | Step 14883/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.5h |
| Epoch 1063 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.3s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1065/1500 [1:05:53<31:09, 4.30s/it]Epoch 1064 | Step 14897/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1064 | Avg Loss: nan | LR: 1.01e-03 | Time: 4.1s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1066/1500 [1:05:56<29:20, 4.06s/it]Epoch 1065 | Step 14911/ 21000 | Loss: nan | LR: 1.01e-03 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1065 | Avg Loss: nan | LR: 1.00e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1067/1500 [1:06:00<28:09, 3.90s/it]Epoch 1066 | Step 14925/ 21000 | Loss: nan | LR: 1.00e-03 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1066 | Avg Loss: nan | LR: 1.00e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 71%|███████ | 1068/1500 [1:06:04<27:18, 3.79s/it]Epoch 1067 | Step 14939/ 21000 | Loss: nan | LR: 1.00e-03 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1067 | Avg Loss: nan | LR: 1.00e-03 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 71%|████████ | 1069/1500 [1:06:07<27:03, 3.77s/it]Epoch 1068 | Step 14953/ 21000 | Loss: nan | LR: 1.00e-03 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1068 | Avg Loss: nan | LR: 1.00e-03 | Time: 3.7s | Samples: 6,983 |
|
Training Flow Model: 71%|████████ | 1070/1500 [1:06:11<26:33, 3.71s/it]Epoch 1069 | Step 14967/ 21000 | Loss: nan | LR: 1.00e-03 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1069 | Avg Loss: nan | LR: 1.00e-03 | Time: 3.6s | Samples: 6,983 |
|
Training Flow Model: 71%|████████ | 1071/1500 [1:06:16<28:55, 4.05s/it]Epoch 1070 | Step 14981/ 21000 | Loss: nan | LR: 1.00e-03 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1070 | Avg Loss: nan | LR: 1.00e-03 | Time: 4.8s | Samples: 6,983 |
| /data2/edwardsun/flow_home/cfg_dataset.py:360: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor). |
| 'index': torch.tensor(idx, dtype=torch.long) |
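The UserWarning above states its own remedy: when `idx` is already a tensor, wrapping it in `torch.tensor(...)` copy-constructs and triggers the warning. A minimal sketch of the suggested `clone().detach()` form, assuming `idx` is an existing tensor as in the `cfg_dataset.py` snippet (the `.to(dtype=...)` cast stands in for the original `dtype=torch.long` argument):

```python
import torch

idx = torch.tensor(7)  # an existing tensor, as passed at cfg_dataset.py:360

# Warning-triggering form:
#   index = torch.tensor(idx, dtype=torch.long)

# Recommended form from the warning message:
index = idx.clone().detach().to(dtype=torch.long)
```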
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:392: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead. |
| with autocast(dtype=torch.bfloat16): |
|
Training Flow Model: 71%|████████ | 1072/1500 [1:06:22<32:52, 4.61s/it]Epoch 1071 | Step 14995/ 21000 | Loss: nan | LR: 1.00e-03 | Speed: 3.8 steps/s | ETA: 0.4h |
| Validation at step 15000: Loss = nan |
| Epoch 1071 | Avg Loss: nan | LR: 1.00e-03 | Time: 5.9s | Samples: 6,983 |
|
Training Flow Model: 72%|████████ | 1073/1500 [1:06:26<32:32, 4.57s/it]Epoch 1072 | Step 15009/ 21000 | Loss: nan | LR: 9.99e-04 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1072 | Avg Loss: nan | LR: 9.99e-04 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 72%|████████ | 1074/1500 [1:06:31<33:15, 4.68s/it]Epoch 1073 | Step 15023/ 21000 | Loss: nan | LR: 9.99e-04 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1073 | Avg Loss: nan | LR: 9.98e-04 | Time: 4.9s | Samples: 6,983 |
|
Training Flow Model: 72%|████████ | 1075/1500 [1:06:36<33:22, 4.71s/it]Epoch 1074 | Step 15037/ 21000 | Loss: nan | LR: 9.98e-04 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1074 | Avg Loss: nan | LR: 9.97e-04 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 72%|████████ | 1076/1500 [1:06:40<32:49, 4.65s/it]Epoch 1075 | Step 15051/ 21000 | Loss: nan | LR: 9.97e-04 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1075 | Avg Loss: nan | LR: 9.96e-04 | Time: 4.5s | Samples: 6,983 |
|
Training Flow Model: 72%|████████ | 1077/1500 [1:06:45<32:34, 4.62s/it]Epoch 1076 | Step 15065/ 21000 | Loss: nan | LR: 9.96e-04 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1076 | Avg Loss: nan | LR: 9.95e-04 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 72%|████████ | 1078/1500 [1:06:50<32:58, 4.69s/it]Epoch 1077 | Step 15079/ 21000 | Loss: nan | LR: 9.95e-04 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1077 | Avg Loss: nan | LR: 9.94e-04 | Time: 4.8s | Samples: 6,983 |
|
Training Flow Model: 72%|████████ | 1079/1500 [1:06:54<32:45, 4.67s/it]Epoch 1078 | Step 15093/ 21000 | Loss: nan | LR: 9.94e-04 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1078 | Avg Loss: nan | LR: 9.94e-04 | Time: 4.6s | Samples: 6,983 |
|
Training Flow Model: 72%|████████ | 1080/1500 [1:06:59<32:25, 4.63s/it]Epoch 1079 | Step 15107/ 21000 | Loss: nan | LR: 9.94e-04 | Speed: 3.8 steps/s | ETA: 0.4h |
| Epoch 1079 | Avg Loss: nan | LR: 9.93e-04 | Time: 4.6s | Samples: 6,983 |
|
|