Efficient Training on Multiple Consumer GPUs with RoundPipe • Paper • 2604.27085 • Published 9 days ago • 38
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression • Paper • 2604.04921 • Published Apr 6 • 112