Softpick
Pretrained models from the paper "Softpick: No Attention Sink, No Massive Activations with Rectified Softmax".
See code: https://github.com/zaydzuhri/softpick-attention
This model is only usable through these repositories:
https://github.com/zaydzuhri/flash-linear-attention/tree/softpick-attention
https://github.com/zaydzuhri/flame/tree/softpick-attention