Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference Paper • 2604.07394 • Published 6 days ago
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers Paper • 2601.17367 • Published Jan 24
LongRM: Revealing and Unlocking the Context Boundary of Reward Modeling Paper • 2510.06915 • Published Oct 8, 2025
LOOM-Scope: A Comprehensive and Efficient LOng-cOntext Model Evaluation Framework Paper • 2507.04723 • Published Jul 7, 2025