BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models Paper • 2602.04163 • Published Feb 4 • 10
On Surprising Effectiveness of Masking Updates in Adaptive Optimizers Paper • 2602.15322 • Published Feb 17 • 10
Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models Paper • 2602.15772 • Published Feb 17 • 7
Embed-RL: Reinforcement Learning for Reasoning-Driven Multimodal Embeddings Paper • 2602.13823 • Published Feb 14 • 9
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published Feb 15 • 53