Finetuning `chandra`
#1
by johnlockejrr - opened
Hi! I saw you finetuned chandra with GRPO. I was thinking of trying to finetune it with unsloth + LoRA and SFT, or of adding it to LLaMA-Factory, keeping in mind it is a Qwen3VLForConditionalGeneration model. How did you do it? There's no info on finetuning chandra from the authors. Thank you!
Hi, thanks for your query; I haven't gotten around to adding the model cards yet. I used GRPOTrainer from trl, with vllm for generations. I've been thinking about moving over to other frameworks, as the trl support is not amazing. Thanks for the ideas on some alternatives!
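For anyone following along: GRPOTrainer optimizes against scalar rewards returned by user-supplied reward functions, which receive the generated completions plus any extra dataset columns as keyword arguments. The post doesn't say which reward was used, so this is only a hypothetical sketch — a character-error-rate-style reward that would suit an OCR model, assuming plain-string completions and an assumed `reference` dataset column:

```python
# Hypothetical CER-based reward for GRPO on OCR-style outputs.
# trl's GRPOTrainer calls reward functions with the completions and
# any extra dataset columns as kwargs; "reference" is an assumption.

def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def ocr_reward(completions, reference, **kwargs):
    """Reward in [0, 1]: 1.0 for an exact match, decreasing with CER."""
    rewards = []
    for pred, ref in zip(completions, reference):
        cer = levenshtein(pred, ref) / max(len(ref), 1)
        rewards.append(max(0.0, 1.0 - cer))
    return rewards
```

Such a function would be passed via `GRPOTrainer(reward_funcs=ocr_reward, ...)`; the exact reward shaping used for chandra is not stated in this thread.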
I just successfully finetuned it with unsloth + LoRA + SFT. Still distilling the method, but it works: the model learns very well and fast.
https://github.com/johnlockejrr/chandra_finetune
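See the repo for the actual code; as a rough sketch of the usual shape of SFT data for a Qwen-VL-family model, each OCR sample is typically wrapped as a chat-style conversation pairing the page image with the target transcription. The instruction text and field names below are assumptions following the common Qwen2/3-VL message convention, not necessarily what the repo does:

```python
# Hypothetical converter from (image, transcription) pairs to the
# chat-message format commonly consumed by SFT trainers for
# Qwen-VL-family models. The prompt wording is an assumption.

INSTRUCTION = "Transcribe the text in this image."

def to_conversation(sample: dict) -> dict:
    """Wrap one OCR sample as a user/assistant message pair."""
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image", "image": sample["image"]},
                    {"type": "text", "text": INSTRUCTION},
                ],
            },
            {
                "role": "assistant",
                "content": [{"type": "text", "text": sample["text"]}],
            },
        ]
    }
```

Mapping a dataset through a converter like this, then training with a LoRA-wrapped model and an SFT trainer, is the standard unsloth workflow for vision-language models.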