Question about SFT hyperparameters and dataset
#1
by Tangchiu - opened
I'm really impressed by this fine-tuned Qwen3-VL model. I am also working on sub-models based on this architecture and would love to know more about your SFT hyperparameters (e.g., learning rate, epochs) and the dataset composition.
Would you mind sharing some insights here, or perhaps providing an email address for a more detailed technical discussion?
Thanks for your contribution to the community!