liumy2010's Collections

UFT

UFT: Unifying Supervised and Reinforcement Fine-Tuning