Use default attention implementation with option to override
#2
by nvidia-oliver-holworthy - opened
No description provided.
Added code for enabling sdpa and eager here: https://huggingface.co/nvidia/llama-nemotron-rerank-vl-1b-v2/discussions/3
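For context, a minimal sketch of what "default with option to override" could look like on the caller side. This is not the code from either PR: `build_load_kwargs` and `ALLOWED_ATTN` are hypothetical names, and the only library detail assumed is that transformers' `from_pretrained` accepts an `attn_implementation` keyword with values such as `"eager"` and `"sdpa"`.

```python
from typing import Optional

# Backends this sketch allows; "eager" and "sdpa" are the two named in this thread.
ALLOWED_ATTN = ("eager", "sdpa")


def build_load_kwargs(attn_implementation: Optional[str] = None) -> dict:
    """Hypothetical helper: build keyword arguments for from_pretrained.

    With None, no attention keyword is passed, so the library picks its
    default. With a string, that backend is requested explicitly.
    """
    kwargs: dict = {}
    if attn_implementation is None:
        return kwargs
    if attn_implementation not in ALLOWED_ATTN:
        raise ValueError(
            f"unsupported attn_implementation: {attn_implementation!r}; "
            f"expected one of {ALLOWED_ATTN}"
        )
    kwargs["attn_implementation"] = attn_implementation
    return kwargs


# Assumed usage (not run here; the model may need further arguments):
# from transformers import AutoModel
# model = AutoModel.from_pretrained(
#     "nvidia/llama-nemotron-rerank-vl-1b-v2", **build_load_kwargs("sdpa")
# )
```

Leaving the keyword out entirely in the default case, rather than hard-coding a backend, is what lets the library choose based on the installed environment.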
Replaced by #4
nvidia-oliver-holworthy changed pull request status to closed