Flux Attention: Context-Aware Hybrid Attention for Efficient LLMs Inference • Paper • 2604.07394 • Published 12 days ago
QQTang1223/full_streaming_Llama-3.1-8B-Instruct • Text Generation • 8B • Updated 9 days ago