DeltaKV: Residual-Based KV Cache Compression via Long-Range Similarity Paper • 2602.08005 • Published Feb 8 • 1
MOVA: Towards Scalable and Synchronized Video-Audio Generation Paper • 2602.08794 • Published Feb 9 • 159
A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone Paper • 2505.12781 • Published May 19, 2025 • 2
Low-Rank Clone (LRC) Collection Model checkpoints for the paper "A Token is Worth over 1,000 Tokens: Efficient Knowledge Distillation through Low-Rank Clone". • 7 items • Updated Mar 2 • 1
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 41 items • Updated Mar 2 • 150