There is a conflict between autoawq and the transformers library of Qwen with this version
#5
by xt2019 - opened
xt2019 changed discussion title from autoawq is to There is a conflict between autoawq and the transformers library of Qwen with this version
I managed to get it running with autoawq-0.2.7.post3 and transformers-4.51.0.dev0
There is another problem when enabling flash_attention_2:
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != c10::BFloat16
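The Half vs BFloat16 error means two matmul operands arrived in different 16-bit dtypes, e.g. quantized weights in float16 while some activations flow in bfloat16. A minimal sketch of that class of failure and the cast that avoids it (plain torch, illustrative only; the layer and dtypes here are placeholders, not Qwen's actual modules, and the dtypes are swapped relative to the traceback so the example runs on CPU):

```python
import torch

# Illustrative only: a linear layer whose weights are in one 16-bit dtype
# while the incoming activations are in the other, mirroring the
# Half != BFloat16 mismatch from the traceback.
layer = torch.nn.Linear(8, 8).to(torch.bfloat16)
x = torch.randn(2, 8, dtype=torch.float16)

# layer(x) would raise:
#   RuntimeError: expected mat1 and mat2 to have the same dtype ...
# Casting the activations to the weight dtype avoids the mismatch:
y = layer(x.to(layer.weight.dtype))
print(y.dtype)  # torch.bfloat16
```

In transformers, the usual way to keep everything in one dtype is to pass an explicit `torch_dtype` (e.g. `torch.float16`) to `from_pretrained` so the attention implementation and the quantized weights agree.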
You can try this; it runs successfully on my machine:
pip install transformers==4.49.0
I didn't encounter this problem when running with the vLLM framework.
xt2019 changed discussion status to closed
