
Remove flash-attn from requirements and GPU inference example

#2
by YingxuHe - opened

Remove flash-attn as a required dependency and drop `attn_implementation="flash_attention_2"` from the GPU inference example.

The model works with PyTorch's built-in SDPA attention, which transformers auto-selects when flash-attn is not installed.
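The fallback can also be made explicit at load time instead of relying on auto-selection. A minimal sketch (the helper name `pick_attn_implementation` is hypothetical, not part of this repo):

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Return "flash_attention_2" if the optional flash-attn package is
    importable, otherwise fall back to PyTorch's built-in SDPA attention,
    matching what transformers auto-selects when flash-attn is absent."""
    if importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    return "sdpa"


# Hypothetical usage when loading the model:
# model = AutoModel.from_pretrained(
#     repo_id,
#     trust_remote_code=True,
#     attn_implementation=pick_attn_implementation(),
# )
```

Passing the result via `attn_implementation` keeps the same code path working on machines with and without flash-attn installed.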

YingxuHe changed pull request status to merged
