Commit History

Upload PPO-aligned TinyLlama-1.1B model using MARS reward model on PKUSafeRLHF
8f21213
verified

payelb committed on