/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
  import pynvml  # type: ignore[import]
The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `6`
		More than one GPU was found, enabling multi-GPU training.
		If this was unintended please pass in `--num_processes=1`.
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
wandb: Currently logged in as: 850587960 (850587960-tsinghua-university) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: setting up run 8hhtbq9v
wandb: Tracking run with wandb version 0.23.1
wandb: Run data is saved locally in /data/rczhang/PencilFolder/DiffSynth-Studio/wandb/run-20251218_060313-8hhtbq9v
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run Wan2.1-1.3b-mc-lora
wandb: ⭐️ View project at https://wandb.ai/850587960-tsinghua-university/WanLoRA-Diffsyn
wandb: 🚀 View run at https://wandb.ai/850587960-tsinghua-university/WanLoRA-Diffsyn/runs/8hhtbq9v
Loading models from: "/data/rczhang/PencilFolder/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors"
Loaded model: {
    "model_name": "wan_video_dit",
    "model_class": "diffsynth.models.wan_video_dit.WanModel",
    "extra_kwargs": {
        "has_image_input": false,
        "patch_size": [
            1,
            2,
            2
        ],
        "in_dim": 16,
        "dim": 1536,
        "ffn_dim": 8960,
        "freq_dim": 256,
        "text_dim": 4096,
        "out_dim": 16,
        "num_heads": 12,
        "num_layers": 30,
        "eps": 1e-06
    }
}
Loading models from: "/data/rczhang/PencilFolder/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth"
Detected non-safetensors files, which may cause slower loading. It's recommended to convert it to a safetensors file.
Loaded model: {
    "model_name": "wan_video_text_encoder",
    "model_class": "diffsynth.models.wan_video_text_encoder.WanTextEncoder",
    "extra_kwargs": null
}
Loading models from: "/data/rczhang/PencilFolder/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth"
[rank1]: Traceback (most recent call last):
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/examples/wanvideo/model_training/train_mc_lora.py", line 215, in <module>
[rank1]:     model = WanTrainingModule(
[rank1]:             ^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/examples/wanvideo/model_training/train_mc_lora.py", line 50, in __init__
[rank1]:     self.pipe = WanVideoPipeline.from_pretrained(
[rank1]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/pipelines/wan_video.py", line 130, in from_pretrained
[rank1]:     model_pool = pipe.download_and_load_models(model_configs, vram_limit)
[rank1]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/diffusion/base_pipeline.py", line 287, in download_and_load_models
[rank1]:     model_pool.auto_load_model(
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/models/model_loader.py", line 66, in auto_load_model
[rank1]:     model_hash = hash_model_file(path)
[rank1]:                  ^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/core/loader/file.py", line 118, in hash_model_file
[rank1]:     keys_dict = load_keys_dict(path)
[rank1]:                 ^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/core/loader/file.py", line 74, in load_keys_dict
[rank1]:     return load_keys_dict_from_bin(file_path)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/core/loader/file.py", line 96, in load_keys_dict_from_bin
[rank1]:     state_dict = load_state_dict_from_bin(file_path)
[rank1]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/core/loader/file.py", line 28, in load_state_dict_from_bin
[rank1]:     state_dict = torch.load(file_path, map_location=device, weights_only=True)
[rank1]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/serialization.py", line 1484, in load
[rank1]:     with _open_file_like(f, "rb") as opened_file:
[rank1]:          ^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/serialization.py", line 759, in _open_file_like
[rank1]:     return _open_file(name_or_buffer, mode)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/serialization.py", line 740, in __init__
[rank1]:     super().__init__(open(name, mode))
[rank1]:                      ^^^^^^^^^^^^^^^^
[rank1]: FileNotFoundError: [Errno 2] No such file or directory: '/data/rczhang/PencilFolder/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth'
W1218 06:03:36.164000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434589 closing signal SIGTERM
W1218 06:03:36.165000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434607 closing signal SIGTERM
W1218 06:03:36.165000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434608 closing signal SIGTERM
W1218 06:03:36.166000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434609 closing signal SIGTERM
W1218 06:03:36.166000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434610 closing signal SIGTERM
E1218 06:03:37.036000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:882] failed (exitcode: 1) local_rank: 1 (pid: 2434590) of binary: /home/rczhang/miniconda3/envs/diffsyn/bin/python3.12
Traceback (most recent call last):
  File "/home/rczhang/miniconda3/envs/diffsyn/bin/accelerate", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/accelerate/commands/accelerate_cli.py", line 50, in main
    args.func(args)
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/accelerate/commands/launch.py", line 1272, in launch_command
    multi_gpu_launcher(args)
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/accelerate/commands/launch.py", line 899, in multi_gpu_launcher
    distrib_run.run(args)
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/distributed/run.py", line 927, in run
    elastic_launch(
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 156, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 293, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
examples/wanvideo/model_training/train_mc_lora.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-12-18_06:03:36
  host      : bm-9103581
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 2434590)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
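
Note: the rank 1 traceback above ends in a FileNotFoundError for Wan2.1_VAE.pth, raised only after the DiT and text encoder weights had already been loaded. A minimal pre-flight sketch follows, assuming the three file names printed in the "Loading models from:" lines are the only ones this run needs. The helper name find_missing_files is hypothetical and is not part of DiffSynth-Studio or accelerate.

```python
from pathlib import Path

# File names copied from the "Loading models from:" lines in the log.
# Assumption: these three are the only weights the run needs.
EXPECTED_FILES = (
    "diffusion_pytorch_model.safetensors",
    "models_t5_umt5-xxl-enc-bf16.pth",
    "Wan2.1_VAE.pth",
)


def find_missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not regular files in model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling find_missing_files on the model directory before running `accelerate launch` would report a missing file immediately, rather than after every process has started loading weights.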