SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published Jan 28, 2025 • 125
ModernBERT workhorses • Collection • A collection of powerful but light models to annotate data. • 4 items • Updated Sep 23, 2025 • 1