Softpick: No Attention Sink, No Massive Activations with Rectified Softmax
Paper: arXiv:2504.20966
Code: https://github.com/zaydzuhri/softpick-attention

This model can only be used through these repositories:
- https://github.com/zaydzuhri/flash-linear-attention/tree/softpick-attention
- https://github.com/zaydzuhri/flame/tree/softpick-attention
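For orientation, here is a minimal, unvetted sketch of a rectified-softmax ("softpick") scoring function in pure Python. It assumes the form softpick(x)_i = relu(exp(x_i) − 1) / (ε + Σ_j |exp(x_j) − 1|); the exact formulation, constants, and numerically stable kernel are defined in the paper and the repositories above, not here.

```python
import math

def softpick(logits, eps=1e-8):
    """Sketch of a rectified softmax (assumed form, see lead-in).

    Unlike softmax, the outputs need not sum to one and can be
    exactly zero, which is the property tied to removing the
    attention sink and massive activations.
    """
    # exp(x) - 1 is zero at x = 0, so zero logits get zero weight.
    shifted = [math.exp(x) - 1.0 for x in logits]
    # Denominator uses absolute values so negative logits still
    # contribute mass; eps guards against division by zero.
    denom = eps + sum(abs(s) for s in shifted)
    # Rectify: only positive shifted scores receive weight.
    return [max(s, 0.0) / denom for s in shifted]

weights = softpick([0.0, 2.0, -3.0])
```

Note that a real attention kernel would subtract a running maximum and fuse this into a FlashAttention-style loop for numerical stability and speed; this sketch only illustrates the rectification idea.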