Code
#2
by chandanwaa - opened
Could you share the complete training pipeline used to obtain this model?
Hey @chandanwaa!
You can find both the native TRL-based GRPO setup and a custom GRPO implementation written from scratch at the link below.
Link: https://huggingface.co/blog/prithivMLmods/smollm-grpo-ft
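For a quick sense of what GRPO does before reading the post, here is a minimal sketch of the group-relative advantage step, which is the core idea of the method: several completions are sampled for one prompt, each gets a reward, and the rewards are normalized within that group. This is an illustration only and is not taken from the linked code; the function name `group_relative_advantages`, the `eps` value, and the use of the population standard deviation are assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper for illustration; `eps` avoids division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one prompt, scored by some reward function.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above the group mean get a positive advantage and are reinforced; those below the mean get a negative one. If all completions receive the same reward, every advantage is zero and the group contributes no learning signal.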
prithivMLmods changed discussion status to closed