Effective Distillation to Hybrid xLSTM Architectures Paper • 2603.15590 • Published about 1 month ago • 33
Olmo 3 Post-training Collection All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated Dec 23, 2025 • 51
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 1 day ago • 137
Article xLSTM-based time series model TiRex significantly outperforms competing models in forecasting accuracy Jun 4, 2025 • 12