xLSTM Scaling Laws: Competitive Performance with Linear Time-Complexity
Paper: http://arxiv.org/abs/2510.02228 (Accepted at ICLR 2026)
Code: https://github.com/NX-AI/xlstm_scaling_laws
Authors: Maximilian Beck, Kajetan Schweighofer, Sebastian Böck, Sebastian Lehner, Sepp Hochreiter
This repository contains the xLSTM checkpoints for the token/param scaling law configuration from our xLSTM scaling law analysis.
To run inference with these checkpoints, we use the xLSTM-7B implementation in Hugging Face transformers.
For more details, see the Readme.md of our xLSTM scaling law GitHub repository (linked above under Code).
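As a rough illustration of the loading step, the sketch below pulls one checkpoint subfolder through the generic `AutoModelForCausalLM.from_pretrained` call with its `subfolder` argument. It rests on several assumptions that this card does not confirm: that each checkpoint directory is stored in a format transformers can load directly (with a `config.json`), that the installed transformers version includes xLSTM support, and that `REPO_ID` is replaced by the real Hub ID of this repository (the value shown is a hypothetical placeholder). The helper name `load_scaling_law_checkpoint` is also made up for this example. Tokenizer setup is not covered here; the Readme.md of the GitHub repository is the authoritative source for the full procedure.

```python
def load_scaling_law_checkpoint(repo_id: str, subfolder: str, device: str = "cpu"):
    """Load one checkpoint subfolder from a Hub repository (illustrative helper).

    Assumes the subfolder holds a transformers-compatible checkpoint.
    """
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder=subfolder)
    return model.to(device).eval()


if __name__ == "__main__":
    # Hypothetical placeholder: replace with the actual Hub ID of this repository.
    REPO_ID = "<org>/<this-repo>"
    # Smallest checkpoint from the listing in this card.
    SUBFOLDER = "mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-3.67B--id-i5dhe6am"

    model = load_scaling_law_checkpoint(REPO_ID, SUBFOLDER)
    print(sum(p.numel() for p in model.parameters()))
```

Starting with the smallest checkpoint keeps the download to a few hundred megabytes while checking that the environment is set up correctly.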
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-3.67B--id-i5dhe6am
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-5.24B--id-5q50n1xs
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-7.34B--id-9fda092j
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-8.39B--id-cy3br36c
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-18.87B--id-hs5khr4o
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-37.75B--id-uju2tyxb
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-91.23B--id-vufn7vk6
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-181.4B--id-5c16ap2i
315M ./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-361.76B--id-y5s6gd5v
779M ./mlstm_v1--tokenparam--ctx-8192--params-406.86M--tokens-10.49B--id-xx828oas
779M ./mlstm_v1--tokenparam--ctx-8192--params-406.86M--tokens-18.87B--id-sbug1edk
779M ./mlstm_v1--tokenparam--ctx-8192--params-406.86M--tokens-48.23B--id-6ugtb2jc
779M ./mlstm_v1--tokenparam--ctx-8192--params-406.86M--tokens-91.23B--id-rthncln6
779M ./mlstm_v1--tokenparam--ctx-8192--params-406.86M--tokens-225.44B--id-swn8e6ti-1o4g3ovh
779M ./mlstm_v1--tokenparam--ctx-8192--params-406.86M--tokens-447.74B--id-zxcdigqn
1.6G ./mlstm_v1--tokenparam--ctx-8192--params-841.5M--tokens-20.97B--id-ds89wtop
1.6G ./mlstm_v1--tokenparam--ctx-8192--params-841.5M--tokens-37.75B--id-5m5wsr4o
1.6G ./mlstm_v1--tokenparam--ctx-8192--params-841.5M--tokens-96.47B--id-6k64ww8i
1.6G ./mlstm_v1--tokenparam--ctx-8192--params-841.5M--tokens-188.74B--id-g0zpitjh
1.6G ./mlstm_v1--tokenparam--ctx-8192--params-841.5M--tokens-461.37B--id-zn5ec7lj
1.6G ./mlstm_v1--tokenparam--ctx-8192--params-841.5M--tokens-926.94B--id-m97a5wxv
2.7G ./mlstm_v1--tokenparam--ctx-8192--params-1.42B--tokens-33.55B--id-z87rj8uv
2.7G ./mlstm_v1--tokenparam--ctx-8192--params-1.42B--tokens-65.01B--id-exazu89h
2.7G ./mlstm_v1--tokenparam--ctx-8192--params-1.42B--tokens-159.38B--id-1zy4evam
2.7G ./mlstm_v1--tokenparam--ctx-8192--params-1.42B--tokens-314.57B--id-bwgc7tr8
2.7G ./mlstm_v1--tokenparam--ctx-8192--params-1.42B--tokens-786.43B--id-on5wxj1y
2.7G ./mlstm_v1--tokenparam--ctx-8192--params-1.42B--tokens-1.56T--id-7wwry7p1
5.2G ./mlstm_v1--tokenparam--ctx-8192--params-2.78B--tokens-67.11B--id-vpes3xm6
5.2G ./mlstm_v1--tokenparam--ctx-8192--params-2.78B--tokens-130.02B--id-5pxfizmt
5.2G ./mlstm_v1--tokenparam--ctx-8192--params-2.78B--tokens-318.77B--id-qvdid4dr
5.2G ./mlstm_v1--tokenparam--ctx-8192--params-2.78B--tokens-612.37B--id-b6nbl9yz
13G ./mlstm_v1--tokenparam--ctx-8192--params-6.87B--tokens-306.18B--id-roo8xyr6-cb4q3k1y
13G ./mlstm_v1--tokenparam--ctx-8192--params-6.87B--tokens-159.38B--id-8egjdt0c
13G ./mlstm_v1--tokenparam--ctx-8192--params-6.87B--tokens-759.17B--id-ui1zi0hi
92G .
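The directory names in the listing above follow a regular pattern: fields are separated by `--`, and the `ctx`, `params`, `tokens`, and `id` fields carry their value after the first `-`. The sketch below parses such a name and derives the token-to-parameter ratio. The `Checkpoint` class and the helper names are introduced here only for illustration, and reading `M`, `B`, and `T` as 10^6, 10^9, and 10^12 is an assumption about the naming scheme rather than something the card states.

```python
from dataclasses import dataclass

# Assumed meaning of the size suffixes used in the directory names.
_SUFFIX = {"M": 1e6, "B": 1e9, "T": 1e12}


def parse_count(text: str) -> float:
    """Convert a string such as '164.11M' or '1.56T' into a plain number."""
    return float(text[:-1]) * _SUFFIX[text[-1]]


@dataclass(frozen=True)
class Checkpoint:
    arch: str
    config: str
    context_length: int
    params: float
    tokens: float
    run_id: str

    @property
    def tokens_per_param(self) -> float:
        return self.tokens / self.params


def parse_checkpoint_name(name: str) -> Checkpoint:
    """Parse a checkpoint directory name as it appears in the listing."""
    name = name.strip()
    if name.startswith("./"):
        name = name[2:]
    arch, config, *fields = name.split("--")
    # Split on the first '-' only, because run IDs may themselves contain '-'.
    kv = dict(field.split("-", 1) for field in fields)
    return Checkpoint(
        arch=arch,
        config=config,
        context_length=int(kv["ctx"]),
        params=parse_count(kv["params"]),
        tokens=parse_count(kv["tokens"]),
        run_id=kv["id"],
    )


if __name__ == "__main__":
    ckpt = parse_checkpoint_name(
        "./mlstm_v1--tokenparam--ctx-8192--params-164.11M--tokens-3.67B--id-i5dhe6am"
    )
    print(ckpt.run_id, round(ckpt.tokens_per_param, 2))
```

Parsing the names this way makes it easy to sort or filter the checkpoints by model size or token budget before downloading any of them.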