Consider using RL after SFT

by BryanADA - opened 14 days ago

Discussion

BryanADA

14 days ago

•

edited 14 days ago

I think RL could further improve Gemopus's capabilities after SFT. Would you consider trying it in the next version of Gemopus?

Jackrong

Owner 14 days ago

Thanks for the suggestion! I've actually been exploring RL training on Qwopus, but some challenges aren't fully resolved yet, especially around multimodal compatibility and algorithm optimization.

That said, I definitely plan to incorporate RL in future training once these issues are addressed.
