SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training • Paper • 2501.17161 • Published Jan 28, 2025 • 125
Systran/faster-whisper-large-v3 • Automatic Speech Recognition • Updated Nov 23, 2023 • 741k • 557
deepseek-ai/DeepSeek-R1-0528 • Text Generation • 685B • Updated May 29, 2025 • 769k • 2.42k