# Multi View

Multi-view face images as the condition for human-centric video generation.

## ✔️ TODO List

- [x] Base code for single-shot video generation
- [x] Splitting RoPE for video and reference images
- [ ] Face-selecting router
- [ ] Support for multi-shot video generation
- [ ] Four-level shot RoPE
- [ ] Inter-shot self-attention and frame-pack-based intra-shot attention

## 🚀 Training

```bash
bash train.sh
```

## 🚀 Inference

```bash
bash test.sh
```

## ⚙️ Configuration

```yaml
train_args:
  max_checkpoints_to_keep: 3
  resume_from_checkpoint: True
  seed: 42
  save_steps: 150
  save_epoches: 1
  batch_size: 8
  visual_log_project_name: Wan2.2_5B-Multi_view-normal_rope_384_640-3ref
  output_path: /root/paddlejob/workspace/qizipeng/baidu/personal-code/Multi-view/multi_view/ckpts
  local_model_path: /root/paddlejob/workspace/qizipeng/wanx_pretrainedmodels
  zero_face_ratio: 0.1
  split_rope: False
  split1: False
  split2: False
  split3: False

infer_args:
  infer_step: 1350
  epoch_id: 17

dataset_args:
  base_path: /root/paddlejob/workspace/qizipeng/baidu/personal-code/Multi-view/multi_view/datasets/merged_wangpan_artgrid_taobao_visionchina_123rf_nasuyun_xinpianchang_disk.json
  height: 384
  width: 640
  num_frames: 81
  ref_num: 3
```

---
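
Before launching a run, it can help to sanity-check a few of the configuration values listed above. The following is a minimal sketch, assuming the configuration has already been parsed into a nested dict. `check_config` is a hypothetical helper and not part of this repository. The rule that `zero_face_ratio` must lie in [0, 1] is an assumption based on its name.

```python
from typing import Any, Dict, List


def check_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in a parsed config.

    Hypothetical helper for illustration; the checks are assumptions,
    not rules enforced by the training code.
    """
    problems: List[str] = []
    train = cfg.get("train_args", {})
    data = cfg.get("dataset_args", {})

    # Assumption: zero_face_ratio is a fraction, so it should lie in [0, 1].
    ratio = train.get("zero_face_ratio")
    if (
        not isinstance(ratio, (int, float))
        or isinstance(ratio, bool)
        or not 0.0 <= ratio <= 1.0
    ):
        problems.append("train_args.zero_face_ratio must be a number in [0, 1]")

    # Sizes and counts should be positive integers.
    for key in ("height", "width", "num_frames", "ref_num"):
        value = data.get(key)
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            problems.append(f"dataset_args.{key} must be a positive integer")

    return problems


# Values copied from the configuration in this README.
example = {
    "train_args": {"seed": 42, "batch_size": 8, "zero_face_ratio": 0.1},
    "dataset_args": {"height": 384, "width": 640, "num_frames": 81, "ref_num": 3},
}
print(check_config(example))  # prints []
```

An empty list means none of the checks failed; any other result lists the offending keys so they can be fixed before `train.sh` is started.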