
Remove flash-attn from requirements and GPU inference example

Discussion #1, opened by YingxuHe (MERaLiON org)

This change removes flash-attn from the required dependencies and drops attn_implementation="flash_attention_2" from the GPU inference example.
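For illustration, the revised loading call would simply omit the attention kwarg. The repo id placeholder and the remaining kwargs below are assumptions for the sketch, not taken from the model card; the point is only the absence of attn_implementation="flash_attention_2":

```python
# Hypothetical sketch of the GPU inference call after the change.
# "<repo-id>" is a placeholder; trust_remote_code reflects the model's
# custom code, and torch_dtype is passed as a string (accepted by
# recent transformers versions).
load_kwargs = {
    "trust_remote_code": True,
    "torch_dtype": "bfloat16",
}
# model = AutoModel.from_pretrained("MERaLiON/<repo-id>", **load_kwargs).to("cuda")
```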

The model works with PyTorch's built-in SDPA attention, which transformers selects automatically when flash-attn is not installed.
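The fallback can be sketched as follows. This is a simplified illustration of the selection logic, not transformers' actual code; the function name is made up for the example:

```python
import importlib.util

def pick_attn_implementation(requested=None):
    """Simplified sketch of attention-backend selection.

    Illustrates why dropping flash-attn is safe: flash_attention_2 is
    only used when explicitly requested, and the default is PyTorch's
    built-in scaled_dot_product_attention (SDPA), which needs no extra
    package.
    """
    if requested == "flash_attention_2" and importlib.util.find_spec("flash_attn") is None:
        raise ImportError("flash_attention_2 requested but flash-attn is not installed")
    # With no explicit request, fall back to SDPA, shipped with PyTorch >= 2.0.
    return requested or "sdpa"
```

So with the kwarg removed, users without flash-attn get SDPA automatically, and users who still want flash-attn can pass the kwarg themselves.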

MERaLiON org

Closing: opened as a discussion instead of a PR by mistake.

YingxuHe changed discussion status to closed
