Consider using RL after SFT

#9
by BryanADA - opened

I think RL could further improve Qwopus's capabilities after SFT. Would you consider trying it in the next version of Qwopus?

BryanADA changed discussion status to closed
BryanADA changed discussion status to open
