| 11/29/2025 14:21:48 - INFO - __main__ - Distributed environment: DistributedType.NO |
| Num processes: 1 |
| Process index: 0 |
| Local process index: 0 |
| Device: cuda |
|
|
| Mixed precision type: fp16 |
|
|
| 11/29/2025 14:21:48 - INFO - __main__ - Starting script: train_controlnet.py |
| 11/29/2025 14:21:50 - INFO - __main__ - Initializing controlnet weights from unet |
| 11/29/2025 14:21:52 - INFO - __main__ - Training Arguments: |
| pretrained_model_name_or_path: stable-diffusion-v1-5/stable-diffusion-v1-5 |
| controlnet_model_name_or_path: None |
| revision: None |
| variant: None |
| trust_remote_code: False |
| dataset_name_or_path: /home/23132798r/workspace/tmp-smoke/data/controlnet |
| dataset_config_name: None |
| image_column: image |
| conditioning_image_column: conditioning_image |
| caption_column: text |
| resolution: 512 |
| center_crop: False |
| random_flip: False |
| validation_ids: [1500, 5500, 8500] |
| validation_steps: 10000 |
| output_dir: ./output-controlnet |
| cache_dir: None |
| logging_dir: logs |
| tracker_project_name: controlnet-training |
| checkpointing_steps: None |
| checkpoints_total_limit: None |
| resume_from_checkpoint: None |
| report_to: tensorboard |
| seed: 42 |
| train_batch_size: 16 |
| num_train_epochs: 3 |
| max_train_steps: None |
| gradient_accumulation_steps: 1 |
| gradient_checkpointing: False |
| dataloader_num_workers: 8 |
| noise_offset: 0.1 |
| prediction_type: None |
| adam_beta1: 0.9 |
| adam_beta2: 0.999 |
| adam_weight_decay: 0.01 |
| adam_epsilon: 1e-08 |
| max_grad_norm: 1.0 |
| learning_rate: 1e-05 |
| scale_lr: False |
| lr_scheduler: constant |
| lr_warmup_steps: 0 |
| mixed_precision: fp16 |
| use_8bit_adam: False |
| allow_tf32: False |
| enable_xformers_memory_efficient_attention: False |
| local_rank: -1 |
|
|
| 11/29/2025 14:21:52 - INFO - __main__ - ControlNet Model Config: |
| FrozenDict({'in_channels': 4, 'conditioning_channels': 3, 'flip_sin_to_cos': True, 'freq_shift': 0, 'down_block_types': ['CrossAttnDownBlock2D', 'CrossAttnDownBlock2D', 'CrossAttnDownBlock2D', 'DownBlock2D'], 'mid_block_type': 'UNetMidBlock2DCrossAttn', 'only_cross_attention': False, 'block_out_channels': [320, 640, 1280, 1280], 'layers_per_block': 2, 'downsample_padding': 1, 'mid_block_scale_factor': 1, 'act_fn': 'silu', 'norm_num_groups': 32, 'norm_eps': 1e-05, 'cross_attention_dim': 768, 'transformer_layers_per_block': 1, 'encoder_hid_dim': None, 'encoder_hid_dim_type': None, 'attention_head_dim': 8, 'num_attention_heads': None, 'use_linear_projection': False, 'class_embed_type': None, 'addition_embed_type': None, 'addition_time_embed_dim': None, 'num_class_embeds': None, 'upcast_attention': False, 'resnet_time_scale_shift': 'default', 'projection_class_embeddings_input_dim': None, 'controlnet_conditioning_channel_order': 'rgb', 'conditioning_embedding_out_channels': (16, 32, 96, 256), 'global_pool_conditions': False, 'addition_embed_type_num_heads': 64, '_use_default_values': ['global_pool_conditions', 'addition_embed_type_num_heads']}) |
| 11/29/2025 14:21:54 - INFO - __main__ - ============ Training Begins ============ |
| 11/29/2025 14:21:54 - INFO - __main__ - Num Epochs = 3 |
| 11/29/2025 14:21:54 - INFO - __main__ - Instantaneous batch size per device = 16 |
| 11/29/2025 14:21:54 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 16 |
| 11/29/2025 14:21:54 - INFO - __main__ - Gradient Accumulation steps = 1 |
| 11/29/2025 14:21:54 - INFO - __main__ - Total optimization steps = 45000 |
| 11/29/2025 16:56:53 - INFO - __main__ - Running validation... |
| 11/29/2025 18:15:04 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-15000 |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-15000/optimizer.bin |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-15000/scheduler.bin |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-15000/sampler.bin |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-15000/sampler_1.bin |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-15000/scaler.pt |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-15000/random_states_0.pkl |
| 11/29/2025 18:15:11 - INFO - __main__ - Saved state to output-controlnet/checkpoint-15000 |
| 11/29/2025 18:15:12 - INFO - __main__ - Epoch 0 | Global Step 15000 |
| 11/29/2025 19:32:33 - INFO - __main__ - Running validation... |
| 11/29/2025 22:08:22 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-30000 |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-30000/optimizer.bin |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-30000/scheduler.bin |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-30000/sampler.bin |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-30000/sampler_1.bin |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-30000/scaler.pt |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-30000/random_states_0.pkl |
| 11/29/2025 22:08:29 - INFO - __main__ - Saved state to output-controlnet/checkpoint-30000 |
| 11/29/2025 22:08:29 - INFO - __main__ - Running validation... |
| 11/29/2025 22:08:32 - INFO - __main__ - Epoch 1 | Global Step 30000 |
| 11/30/2025 00:43:43 - INFO - __main__ - Running validation... |
| 11/30/2025 02:01:21 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-45000 |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-45000/optimizer.bin |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-45000/scheduler.bin |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-45000/sampler.bin |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-45000/sampler_1.bin |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-45000/scaler.pt |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-45000/random_states_0.pkl |
| 11/30/2025 02:01:28 - INFO - __main__ - Saved state to output-controlnet/checkpoint-45000 |
| 11/30/2025 02:01:28 - INFO - __main__ - Epoch 2 | Global Step 45000 |
| 11/30/2025 02:01:34 - INFO - __main__ - Running validation... |
| 11/30/2025 02:01:37 - INFO - __main__ - Finished! |