GemmaThink
Collection
A collection of Gemma3-1b-it models that we post-trained with SFT and GRPO, using Google's new Tunix library, to enhance their reasoning capabilities.
This model was trained using SFT (Supervised Fine-Tuning) to generate structured reasoning traces in the following format:
<reasoning>step-by-step thinking process</reasoning>
<answer>final answer</answer>
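
The tagged format above lends itself to simple post-processing. The following is a minimal sketch of how a completion might be split into its reasoning and answer parts. The helper name `parse_trace` is hypothetical and not part of this model or of Tunix, and the sketch assumes the output contains exactly one `<reasoning>` block followed by one `<answer>` block.

```python
import re
from typing import Optional, Tuple

# Assumption: one <reasoning> block followed by one <answer> block,
# optionally separated by whitespace. re.DOTALL lets "." span newlines,
# since reasoning traces are usually multi-line.
_TRACE_PATTERN = re.compile(
    r"<reasoning>(?P<reasoning>.*?)</reasoning>\s*<answer>(?P<answer>.*?)</answer>",
    re.DOTALL,
)


def parse_trace(text: str) -> Optional[Tuple[str, str]]:
    """Return (reasoning, answer) from a completion, or None if the tags are missing.

    Hypothetical helper for illustration only.
    """
    match = _TRACE_PATTERN.search(text)
    if match is None:
        return None
    return match.group("reasoning").strip(), match.group("answer").strip()
```

Returning `None` for malformed output, rather than raising, makes it easy to count how often a generation fails to follow the format.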