Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models • Paper • 2502.08130 • Published Feb 12, 2025 • 9
Granite 2.0 Code Models • Collection • Code models for generation, understanding, and instruction-following tasks. • 22 items • Updated 14 days ago • 202
Saving Memory Using Padding-Free Transformer Layers during Finetuning • Article • Jun 11, 2024 • 21