Code

#2
by chandanwaa - opened

Could you share the complete training pipeline used to obtain this model?

Hey @chandanwaa!
You can find both the native TRL-based GRPO pipeline and the custom GRPO implementation written from scratch here.
Link: https://huggingface.co/blog/prithivMLmods/smollm-grpo-ft

prithivMLmods changed discussion status to closed
