/home/v-menggao/miniconda3/envs/VLMtoVec/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'repr' attribute with value False was provided to the `Field()` function, which has no effect in the context it was used. 'repr' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
  warnings.warn(
/home/v-menggao/miniconda3/envs/VLMtoVec/lib/python3.10/site-packages/pydantic/_internal/_generate_schema.py:2249: UnsupportedFieldAttributeWarning: The 'frozen' attribute with value True was provided to the `Field()` function, which has no effect in the context it was used. 'frozen' is field-specific metadata, and can only be attached to a model field using `Annotated` metadata or by assignment. This may have happened because an `Annotated` type alias using the `type` statement was used, or if the `Field()` function was attached to a single member of a union type.
  warnings.warn(
[2026-01-02 19:23:26,531] DEBUG [git.cmd:1270] Popen(['git', 'version'], cwd=/home/v-menggao/code/VLM2Vec, stdin=None, shell=False, universal_newlines=False)
[2026-01-02 19:23:26,533] DEBUG [git.cmd:1270] Popen(['git', 'version'], cwd=/home/v-menggao/code/VLM2Vec, stdin=None, shell=False, universal_newlines=False)
/home/v-menggao/miniconda3/envs/VLMtoVec/lib/python3.10/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
  warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
DropoutAddRMSNorm of flash_attn is not installed!!!
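The two pydantic warnings above say that `repr=False` and `frozen=True` were passed to `Field()` in a position where pydantic ignores them, and that field-specific metadata must instead be attached with `Annotated` or by direct assignment. A minimal sketch of the fix the warning suggests (the `Settings` model and `api_key` field here are hypothetical, not from the VLM2Vec code):

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class Settings(BaseModel):
    # Field-specific metadata attached the supported way, via Annotated:
    # repr=False hides the field from repr(), frozen=True forbids reassignment.
    api_key: Annotated[str, Field(repr=False, frozen=True)] = "hidden"


s = Settings()
print(repr(s))           # api_key is excluded from the repr
try:
    s.api_key = "other"  # frozen=True makes assignment a ValidationError
except ValidationError:
    print("field is frozen")
```

Attaching `Field(...)` inside `Annotated[...]` puts the metadata on the field itself, which is what silences `UnsupportedFieldAttributeWarning`; passing the same `Field(...)` to one member of a union, or through a `type`-statement alias, leaves it in a context pydantic cannot apply.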
/home/v-menggao/code/VLM2Vec/src/model/baseline_backbone/internvideo2/modeling_internvideo2.py:539: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
  @torch.cuda.amp.autocast(enabled=False)
[2026-01-02 19:23:27,382] INFO [src.utils:11] rank0: Forced ddp_find_unused_parameters to: True
[2026-01-02 19:23:27,382] INFO [src.utils:11] rank0: === Early Exit Classifier Training ===
[2026-01-02 19:23:27,382] INFO [src.utils:11] rank0: Target Layer: 12
[2026-01-02 19:23:27,382] INFO [src.utils:11] rank0: Distributed Training: Rank 0
[2026-01-02 19:23:27,382] INFO [src.utils:11] rank0: Loading Backbone Model...
[2026-01-02 19:23:27,645] INFO [src.utils:20] Loading backbone [qwen2_5_vl] from Qwen/Qwen2.5-VL-7B-Instruct
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Loading checkpoint shards: 0%| | 0/5 [00:00
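The `FutureWarning` at `modeling_internvideo2.py:539` comes from the deprecated decorator form `@torch.cuda.amp.autocast(enabled=False)`. A minimal sketch of the replacement the warning itself recommends (the decorated function here is illustrative, not the actual InternVideo2 code):

```python
import torch

# Deprecated form, emits the FutureWarning seen in the log:
#   @torch.cuda.amp.autocast(enabled=False)
#
# Replacement suggested by the warning: pass the device type explicitly.
@torch.amp.autocast('cuda', enabled=False)
def scale_fp32(x):
    # enabled=False keeps this computation out of autocast, so it runs
    # in the tensor's own (full) precision, as the original decorator did.
    return x * 2
```

The new `torch.amp.autocast(device_type, ...)` API is device-agnostic, so the same decorator spelling works for `'cuda'` and `'cpu'`; behavior with `enabled=False` is unchanged from the old `torch.cuda.amp.autocast`.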