/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
  import pynvml  # type: ignore[import]
The following values were not passed to `accelerate launch` and had defaults used instead:
    `--num_processes` was set to a value of `6`
        More than one GPU was found, enabling multi-GPU training.
        If this was unintended please pass in `--num_processes=1`.
    `--num_machines` was set to a value of `1`
    `--mixed_precision` was set to a value of `'no'`
    `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
wandb: Currently logged in as: 850587960 (850587960-tsinghua-university) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: setting up run 8hhtbq9v
wandb: Tracking run with wandb version 0.23.1
wandb: Run data is saved locally in /data/rczhang/PencilFolder/DiffSynth-Studio/wandb/run-20251218_060313-8hhtbq9v
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run Wan2.1-1.3b-mc-lora
wandb: ⭐️ View project at https://wandb.ai/850587960-tsinghua-university/WanLoRA-Diffsyn
wandb: 🚀 View run at https://wandb.ai/850587960-tsinghua-university/WanLoRA-Diffsyn/runs/8hhtbq9v
Loading models from: "/data/rczhang/PencilFolder/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors"
Loaded model: {
    "model_name": "wan_video_dit",
    "model_class": "diffsynth.models.wan_video_dit.WanModel",
    "extra_kwargs": {
        "has_image_input": false,
        "patch_size": [
            1,
            2,
            2
        ],
        "in_dim": 16,
        "dim": 1536,
        "ffn_dim": 8960,
        "freq_dim": 256,
        "text_dim": 4096,
        "out_dim": 16,
        "num_heads": 12,
        "num_layers": 30,
        "eps": 1e-06
    }
}
Loading models from: "/data/rczhang/PencilFolder/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth"
Detected non-safetensors files, which may cause slower loading. It's recommended to convert it to a safetensors file.
Loaded model: {
    "model_name": "wan_video_text_encoder",
    "model_class": "diffsynth.models.wan_video_text_encoder.WanTextEncoder",
    "extra_kwargs": null
}
Loading models from: "/data/rczhang/PencilFolder/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth"
[rank1]: Traceback (most recent call last):
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/examples/wanvideo/model_training/train_mc_lora.py", line 215, in <module>
[rank1]:     model = WanTrainingModule(
[rank1]:             ^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/examples/wanvideo/model_training/train_mc_lora.py", line 50, in __init__
[rank1]:     self.pipe = WanVideoPipeline.from_pretrained(
[rank1]:                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/pipelines/wan_video.py", line 130, in from_pretrained
[rank1]:     model_pool = pipe.download_and_load_models(model_configs, vram_limit)
[rank1]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/diffusion/base_pipeline.py", line 287, in download_and_load_models
[rank1]:     model_pool.auto_load_model(
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/models/model_loader.py", line 66, in auto_load_model
[rank1]:     model_hash = hash_model_file(path)
[rank1]:                  ^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/core/loader/file.py", line 118, in hash_model_file
[rank1]:     keys_dict = load_keys_dict(path)
[rank1]:                 ^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/core/loader/file.py", line 74, in load_keys_dict
[rank1]:     return load_keys_dict_from_bin(file_path)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/core/loader/file.py", line 96, in load_keys_dict_from_bin
[rank1]:     state_dict = load_state_dict_from_bin(file_path)
[rank1]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/data/rczhang/PencilFolder/DiffSynth-Studio/diffsynth/core/loader/file.py", line 28, in load_state_dict_from_bin
[rank1]:     state_dict = torch.load(file_path, map_location=device, weights_only=True)
[rank1]:                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/serialization.py", line 1484, in load
[rank1]:     with _open_file_like(f, "rb") as opened_file:
[rank1]:          ^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/serialization.py", line 759, in _open_file_like
[rank1]:     return _open_file(name_or_buffer, mode)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/serialization.py", line 740, in __init__
[rank1]:     super().__init__(open(name, mode))
[rank1]:                      ^^^^^^^^^^^^^^^^
[rank1]: FileNotFoundError: [Errno 2] No such file or directory: '/data/rczhang/PencilFolder/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth'
W1218 06:03:36.164000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434589 closing signal SIGTERM
W1218 06:03:36.165000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434607 closing signal SIGTERM
W1218 06:03:36.165000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434608 closing signal SIGTERM
W1218 06:03:36.166000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434609 closing signal SIGTERM
W1218 06:03:36.166000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 2434610 closing signal SIGTERM
E1218 06:03:37.036000 2434465 site-packages/torch/distributed/elastic/multiprocessing/api.py:882] failed (exitcode: 1) local_rank: 1 (pid: 2434590) of binary: /home/rczhang/miniconda3/envs/diffsyn/bin/python3.12
Traceback (most recent call last):
  File "/home/rczhang/miniconda3/envs/diffsyn/bin/accelerate", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/accelerate/commands/accelerate_cli.py", line 50, in main
    args.func(args)
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/accelerate/commands/launch.py", line 1272, in launch_command
    multi_gpu_launcher(args)
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/accelerate/commands/launch.py", line 899, in multi_gpu_launcher
    distrib_run.run(args)
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/distributed/run.py", line 927, in run
    elastic_launch(
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 156, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rczhang/miniconda3/envs/diffsyn/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 293, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
examples/wanvideo/model_training/train_mc_lora.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-12-18_06:03:36
  host      : bm-9103581
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 2434590)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================