Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics Paper • 2512.12602 • Published Dec 14, 2025 • 44
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published Oct 29, 2025 • 80
XQuant: Achieving Ultra-Low Bit KV Cache Quantization with Cross-Layer Compression Paper • 2510.11236 • Published Oct 13, 2025 • 1
Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers Paper • 2601.04890 • Published Jan 8 • 44
Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation Paper • 2601.22813 • Published Jan 30 • 60