| nohup: ignoring input |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:70: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead. |
| self.scaler = GradScaler() |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:116: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https: |
| self.embeddings = torch.load(combined_path, map_location=self.device) |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:180: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https: |
| self.compressor.load_state_dict(torch.load('final_compressor_model.pth', map_location=self.device)) |
| /data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:181: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https: |
| self.decompressor.load_state_dict(torch.load('final_decompressor_model.pth', map_location=self.device)) |
| /data2/edwardsun/flow_home/cfg_dataset.py:253: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https: |
| self.embeddings = torch.load(combined_path, map_location='cpu') |
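The `torch.load` warnings above can be silenced for tensor and state_dict checkpoints by opting into the safe loader. A minimal sketch using a hypothetical stand-in file (the real checkpoint paths appear in the log):

```python
import torch

# weights_only=True uses a restricted unpickler that accepts tensors and
# state_dicts but refuses arbitrary Python objects, silencing the warning
# and closing the arbitrary-code-execution hole it describes.
model = torch.nn.Linear(4, 2)
torch.save(model.state_dict(), "compressor_demo.pth")  # hypothetical stand-in path
state = torch.load("compressor_demo.pth", map_location="cpu", weights_only=True)
model.load_state_dict(state)
```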
| Starting optimized training with batch_size=384, epochs=2000 |
| Using GPU 0 for optimized H100 training |
| Mixed precision: True |
| Batch size: 384 |
| Target epochs: 2000 |
| Learning rate: 0.0012 -> 0.0006 |
| ✓ Mixed precision training enabled (BF16) |
| Loading ALL AMP embeddings from /data2/edwardsun/flow_project/peptide_embeddings/... |
| Loading combined embeddings from /data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt... |
| ✓ Loaded ALL embeddings: torch.Size([17968, 50, 1280]) |
| Computing preprocessing statistics... |
| ✓ Statistics computed and saved: |
| Total embeddings: 17,968 |
| Mean: -0.0005 ± 0.0897 |
| Std: 0.0869 ± 0.1168 |
| Range: [-9.1738, 3.2894] |
| Initializing models... |
| ✓ Model compiled with torch.compile for speedup |
| ✓ Models initialized: |
| Compressor parameters: 78,817,360 |
| Decompressor parameters: 39,458,720 |
| Flow model parameters: 50,779,584 |
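The compile step above is a one-line wrapper around the module. A minimal sketch; the `FlowBlock` class below is a hypothetical stand-in, since the real model definitions are not in the log:

```python
import torch

class FlowBlock(torch.nn.Module):
    """Hypothetical stand-in for the flow model's trunk (1280-dim embeddings)."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1280, 512),
            torch.nn.GELU(),
            torch.nn.Linear(512, 1280),
        )

    def forward(self, x):
        return self.net(x)

model = FlowBlock()
# torch.compile wraps the module; actual compilation is deferred until the
# first forward call, so constructing the wrapper is cheap.
compiled = torch.compile(model)
```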
| Initializing datasets with FULL data... |
| Loading AMP embeddings from /data2/edwardsun/flow_project/peptide_embeddings/... |
| Loading combined embeddings from /data2/edwardsun/flow_project/peptide_embeddings/all_peptide_embeddings.pt (FULL DATA)... |
| ✓ Loaded ALL embeddings: torch.Size([17968, 50, 1280]) |
| Loading CFG data from FASTA: /home/edwardsun/flow/combined_final.fasta... |
| Parsing FASTA file: /home/edwardsun/flow/combined_final.fasta |
| Label assignment: >AP = AMP (0), >sp = Non-AMP (1) |
| ✓ Parsed 6983 valid sequences from FASTA |
| AMP sequences: 3306 |
| Non-AMP sequences: 3677 |
| Masked for CFG: 698 |
| Loaded 6983 CFG sequences |
| Label distribution: [3306 3677] |
| Masked 698 labels for CFG training |
| Aligning AMP embeddings with CFG data... |
| Aligned 6983 samples |
| CFG Flow Dataset initialized: |
| AMP embeddings: torch.Size([17968, 50, 1280]) |
| CFG labels: 6983 |
| Aligned samples: 6983 |
| ✓ Dataset initialized with FULL data: |
| Total samples: 6,983 |
| Batch size: 384 |
| Batches per epoch: 19 |
| Total training steps: 38,000 |
| Validation every: 10,000 steps |
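The batch and step counts above follow from ceiling division over the sample count. A quick check:

```python
import math

# Step accounting from the log: 6,983 samples, batch size 384, 2,000 epochs.
samples, batch_size, epochs = 6_983, 384, 2_000
batches_per_epoch = math.ceil(samples / batch_size)  # final partial batch counts
total_steps = batches_per_epoch * epochs
print(batches_per_epoch, total_steps)  # 19 38000
```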
| Initializing optimizer and scheduler... |
| ✓ Optimizer initialized: |
| Base LR: 0.0012 |
| Min LR: 0.0006 |
| Warmup steps: 5000 |
| Weight decay: 0.01 |
| Gradient clip norm: 1.0 |
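A schedule matching the reported hyperparameters (base LR 1.2e-3, min LR 6e-4, 5,000 warmup steps, 38,000 total steps) could look like the sketch below. The warmup-then-cosine shape is an assumption; the actual scheduler code is not shown in the log:

```python
import math

BASE_LR, MIN_LR = 1.2e-3, 6e-4  # from the log
WARMUP, TOTAL = 5_000, 38_000   # from the log

def lr_at(step: int) -> float:
    """Assumed shape: linear warmup to BASE_LR, then cosine decay to MIN_LR."""
    if step < WARMUP:
        return BASE_LR * (step + 1) / WARMUP
    progress = (step - WARMUP) / (TOTAL - WARMUP)
    return MIN_LR + 0.5 * (BASE_LR - MIN_LR) * (1 + math.cos(math.pi * progress))
```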
| ✓ Optimized Single GPU training setup complete with FULL DATA! |
| 🚀 Starting Optimized Single GPU Flow Matching Training with FULL DATA |
| GPU: 0 |
| Total epochs: 2000 |
| Batch size: 384 |
| Total samples: 6,983 |
| Mixed precision: True |
| Estimated time: ~8-10 hours (overnight training with ALL data) |
| ============================================================ |
|
Training Flow Model: 0%| | 0/2000 [00:00<?, ?it/s]/data2/edwardsun/flow_home/amp_flow_training_single_gpu_full_data.py:392: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead. |
| with autocast(dtype=torch.bfloat16): |
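As with the GradScaler warning earlier in the log, this autocast deprecation names its own replacement. A minimal sketch of the modern form:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
layer = torch.nn.Linear(8, 8).to(device)
x = torch.randn(4, 8, device=device)

# Modern form suggested by the warning: torch.amp.autocast('cuda', ...)
# replaces the deprecated torch.cuda.amp.autocast(...).
with torch.amp.autocast(device, dtype=torch.bfloat16):
    y = layer(x)  # matmuls run in bfloat16 inside this context
```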
|
Training Flow Model: 0%| | 1/2000 [00:49<27:39:59, 49.82s/it]Epoch 0 | Step 1/ 38000 | Loss: 2.290177 | LR: 1.20e-04 | Speed: 0.0 steps/s | ETA: 376.4h |
| Epoch 0 | Avg Loss: 1.109821 | LR: 1.24e-04 | Time: 49.8s | Samples: 6,983 |
|
Training Flow Model: 0%| | 2/2000 [00:55<13:22:45, 24.11s/it]Epoch 1 | Step 20/ 38000 | Loss: 1.010002 | LR: 1.24e-04 | Speed: 0.4 steps/s | ETA: 27.0h |
| Epoch 1 | Avg Loss: 1.002409 | LR: 1.28e-04 | Time: 6.1s | Samples: 6,983 |
|
Training Flow Model: 0%| | 3/2000 [00:59<8:09:52, 14.72s/it] Epoch 2 | Step 39/ 38000 | Loss: 0.998573 | LR: 1.28e-04 | Speed: 0.7 steps/s | ETA: 15.4h |
| Epoch 2 | Avg Loss: 0.910289 | LR: 1.32e-04 | Time: 3.5s | Samples: 6,983 |
|
Training Flow Model: 0%| | 4/2000 [01:02<5:42:30, 10.30s/it]Epoch 3 | Step 58/ 38000 | Loss: 0.787784 | LR: 1.33e-04 | Speed: 1.0 steps/s | ETA: 11.1h |
| Epoch 3 | Avg Loss: 0.644033 | LR: 1.36e-04 | Time: 3.5s | Samples: 6,983 |
|
|