Some problems

#1
by bdsqlsz - opened
  1. This project does not seem to support FlashAttention 2; we hope support can be added.
  2. There currently appear to be compatibility issues with transformers > 5; it's best to pin it to < 5.
bdsqlsz changed discussion status to closed
  1. Thanks for raising this. Our project does support FlashAttention 2; we use it for both training and evaluation internally.
    If the issue you are seeing is about inference speed, we are currently planning to release a vLLM-based version, which should help with inference performance.

  2. Thanks for the feedback. We use transformers==4.57 for training. We have not tested the project with Transformers 5.x yet, so there may indeed be some compatibility issues with versions above 5.

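Since the project is only tested against transformers==4.57, one way to catch an untested 5.x install early is a simple major-version guard before loading the model. This is a hypothetical sketch, not part of the project; the helper name and the `< 5` cutoff are assumptions based on the reply above:

```python
def transformers_version_supported(version_str: str) -> bool:
    """Return True if the Transformers major version is below 5.

    Hypothetical guard: the project reports testing only against
    transformers==4.57, so 5.x installs are treated as unsupported.
    """
    major = int(version_str.split(".")[0])
    return major < 5

print(transformers_version_supported("4.57.0"))  # True
print(transformers_version_supported("5.0.1"))   # False
```

In practice one would pass `transformers.__version__` to this check, or simply pin the dependency at install time, e.g. `pip install "transformers>=4.57,<5"`.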
Thanks for your reply. The problem I encountered is that it doesn't support FlashAttention 2 through the standard Transformers loading API:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "cslys1999/Eureka-Audio-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    attn_implementation="flash_attention_2",  # standard kwarg name (not attn_impl)
)
```
This is the usual usage. I just discovered that this project hard-codes the attention implementation to use...
