Why model dtype in config is fp32?
#1
by Rayzl - opened
I run inference with transformers' Qwen2_5_VLForConditionalGeneration:
model = Qwen2_5_VLForConditionalGeneration.from_pretrained("Xiaomi-MiMo-VL-Miloco-7B", dtype="auto").to("cuda")
When dtype is set to torch.bfloat16, the model does not stop thinking (it generates endlessly). With dtype set to "auto" it works fine.
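For context on why the two dtypes can behave differently: bfloat16 keeps float32's exponent range but only 8 significand bits, so casting fp32 weights down to bf16 loses precision, which can shift logits enough to change sampling behavior. A minimal sketch of that truncation in pure Python (no torch; bfloat16 is the top 16 bits of an IEEE-754 float32):

```python
import struct

def fp32_to_bf16(x: float) -> float:
    """Truncate a float32 to bfloat16 by dropping the low 16 mantissa bits,
    then widen back to float32 so the values can be compared directly."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    bf16_bits = bits & 0xFFFF0000  # keep sign, 8 exponent bits, top 7 mantissa bits
    (y,) = struct.unpack("<f", struct.pack("<I", bf16_bits))
    return y

w = 0.3141592653589793
print(fp32_to_bf16(w))  # 0.3125 — the weight already differs in the 3rd decimal
```

This is only an illustration of the precision gap, not a claim about what exactly goes wrong in this model; but it shows why weights stored as fp32 in the config are not guaranteed to behave identically after a bf16 cast.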