- FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference (arXiv:2502.20766, published Feb 28, 2025)
- Model Merging in Pre-training of Large Language Models (arXiv:2505.12082, published May 17, 2025)