Why is the model dtype in the config fp32?

#1
by Rayzl - opened

I run inference with the transformers class Qwen2_5_VLForConditionalGeneration:

```python
from transformers import Qwen2_5_VLForConditionalGeneration

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Xiaomi-MiMo-VL-Miloco-7B", dtype="auto"
).to("cuda")
```

When dtype is set to torch.bfloat16, the model's thinking never stops (generation does not terminate). With dtype set to "auto" it works well.
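For what it's worth, one plausible factor (an assumption, not a confirmed diagnosis) is precision loss: since the config declares fp32 weights, dtype="auto" loads them in fp32, while casting to bfloat16 keeps only about 8 bits of mantissa, which can shift logits enough to change when a stop token wins. The sketch below simulates bfloat16 by truncating a float32's low 16 mantissa bits (real hardware rounds to nearest, so this is a simplification) to show how nearby values collapse together:

```python
import struct

def to_bfloat16(x: float) -> float:
    # bfloat16 keeps float32's sign and 8-bit exponent but only the top
    # 7 mantissa bits; simulate it by zeroing the low 16 bits.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

# Two logits that differ in float32 become identical in bfloat16:
print(to_bfloat16(1.0001), to_bfloat16(1.0002))  # both collapse to 1.0
```

So small score differences that decide "emit end-of-thinking token vs. keep generating" in fp32 can disappear under bf16, which would match the looping behavior you saw.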
