nohup: ignoring input bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by bash) W0211 11:26:05.180000 2473 site-packages/torch/distributed/run.py:803] W0211 11:26:05.180000 2473 site-packages/torch/distributed/run.py:803] ***************************************** W0211 11:26:05.180000 2473 site-packages/torch/distributed/run.py:803] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. W0211 11:26:05.180000 2473 site-packages/torch/distributed/run.py:803] ***************************************** /workspace/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext` warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/SpecForge-ext/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( /workspace/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext` warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/SpecForge-ext/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( /workspace/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. 
We recommend installing via `pip install torch-c-dlpack-ext` warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/SpecForge-ext/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( /workspace/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext` warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/SpecForge-ext/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( Set draft model tie_word_embeddings to False /workspace/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext` warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/SpecForge-ext/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( Set draft model tie_word_embeddings to False /workspace/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext` warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/SpecForge-ext/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. 
Please install flash_attn if you want to use the flash attention backend. warnings.warn( Set draft model tie_word_embeddings to False /workspace/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext` warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/SpecForge-ext/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( Set draft model tie_word_embeddings to False /workspace/specforge/lib/python3.11/site-packages/tvm_ffi/_optional_torch_c_dlpack.py:174: UserWarning: Failed to JIT torch c dlpack extension, EnvTensorAllocator will not be enabled. We recommend installing via `pip install torch-c-dlpack-ext` warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/SpecForge-ext/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. 
warnings.warn( Set draft model tie_word_embeddings to False Set draft model tie_word_embeddings to False Set draft model tie_word_embeddings to False
-----------【args】-----------
target_model_path /workspace/Qwen3-8B
trust_remote_code ······· False
draft_model_config /workspace/hanrui/SpecForge-ext/configs/qwen3-8b-qwen3eagle-5layer.json
embedding_key model.embed_tokens.weight
lm_head_key lm_head.weight
is_vlm ······· False
target_model_backend ······ sglang
train_data_path /workspace/hanrui/qwen3-8b_dflash_regen/sharegpt_train_regenerated.jsonl
train_hidden_states_path ········ None
eval_hidden_states_path ········ None
eval_data_path ········ None
chat_template ········ qwen
is_preformatted ······· False
train_only_last_turn ······· False
build_dataset_num_proc ··········· 8
dataloader_num_workers ··········· 4
num_epochs ·········· 10
max_num_steps ········ None
batch_size ··········· 8
learning_rate ······ 0.0001
max_length ········ 2048
warmup_ratio ······· 0.015
total_steps ········ None
max_grad_norm ········· 0.5
ttt_length ··········· 7
resume ······· False
ckpt_dir ········ None
eval_interval ········ 5000
save_interval ········ 5000
log_interval ········· 100
seed ··········· 0
draft_accumulation_steps ··········· 1
tp_size ··········· 1
sp_ulysses_size ··········· 1
sp_ring_size ··········· 1
attention_backend flex_attention
cache_key ········ None
cache_dir /workspace/hanrui/SpecForge-ext/cache
output_dir /workspace/hanrui/SpecForge-ext/outputs/qwen3-8b-qwen3eagle-5layer
verbose ······· False
dist_timeout ·········· 20
model_download_dir ········ None
min_pixels ······· 50176
max_pixels ······ 802816
profile ······· False
profile_start_step ·········· 30
profile_num_steps ··········· 4
profile_record_shapes ······· False
sglang_attention_backend ·· flashinfer
sglang_mem_fraction_static ········· 0.4
sglang_context_length ········ None
sglang_enable_nccl_nvls ······· False
sglang_enable_symm_mem ······· False
sglang_enable_torch_compile ······· False
sglang_enable_dp_attention ······· False
sglang_enable_dp_lm_head ······· False
sglang_enable_piecewise_cuda_graph ······· False
sglang_piecewise_cuda_graph_max_tokens ········ 4096
sglang_piecewise_cuda_graph_tokens ········ None
sglang_ep_size ··········· 1
report_to ········ none
wandb_project ········ None
wandb_name ········ None
wandb_key ········ None
swanlab_project ········ None
swanlab_name ········ None
swanlab_key ········ None
mlflow_tracking_uri ········ None
mlflow_experiment_name ········ None
mlflow_run_name ········ None
dp_size ··········· 8
target_batch_size ··········· 8
Set draft model tie_word_embeddings to False [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 /bin/bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by /bin/bash) WARNING:sglang.srt.models.registry:Ignore import error when loading sglang.srt.models.mindspore: name 'ms' is not defined [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 /bin/bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by /bin/bash) [Gloo] Rank 0 is connected to 0 peer ranks.
Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 WARNING:sglang.srt.models.registry:Ignore import error when loading sglang.srt.models.mindspore: name 'ms' is not defined /bin/bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by /bin/bash) WARNING:sglang.srt.models.registry:Ignore import error when loading sglang.srt.models.mindspore: name 'ms' is not defined [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 /bin/bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by /bin/bash) WARNING:sglang.srt.models.registry:Ignore import error when loading sglang.srt.models.mindspore: name 'ms' is not defined [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. 
Expected number of connected peer ranks is : 0 /bin/bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by /bin/bash) [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 WARNING:sglang.srt.models.registry:Ignore import error when loading sglang.srt.models.mindspore: name 'ms' is not defined [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 /bin/bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by /bin/bash) /bin/bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by /bin/bash) [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. Expected number of connected peer ranks is : 0 [Gloo] Rank 0 is connected to 0 peer ranks. 
Expected number of connected peer ranks is : 0 WARNING:sglang.srt.models.registry:Ignore import error when loading sglang.srt.models.mindspore: name 'ms' is not defined /bin/bash: /workspace/specforge/lib/libtinfo.so.6: no version information available (required by /bin/bash) WARNING:sglang.srt.models.registry:Ignore import error when loading sglang.srt.models.mindspore: name 'ms' is not defined WARNING:sglang.srt.models.registry:Ignore import error when loading sglang.srt.models.mindspore: name 'ms' is not defined Loading safetensors checkpoint shards: 0% Completed | 0/5 [00:00