Query about model training details
I'm really impressed by your fine-tuned NANONET_CORRECT_V1. I am currently conducting research on model lineage and sub-model relationships based on this architecture, and your work provides an excellent case study.
I would love to gain more insights into your training recipe:
FT Details: Could you share the key hyperparameters (e.g., learning rate, scheduler, and total epochs) and the dataset composition/size?
Version Iteration: Regarding the relationship among V1, V2, and V3, were they fine-tuned independently from the same base model, or were they developed sequentially (i.e., V2 is a further fine-tune of V1)?
Your insights would be invaluable for my research. We can discuss here, or if you prefer a more detailed technical exchange, I’d be happy to follow up via email.
Thanks for your great contribution to the community!
Thanks, but what did you test this model on?
Thanks for your response! I would like to test it primarily on OCR post-processing tasks
Could you share a bit about the hyperparameters, dataset, and how V1/V2/V3 relate? Even a high-level overview would help me. Appreciate your time!
Mostly this was SFT'd on a private dataset, which I can't disclose as it's under NDA. The hyperparams were mostly experimental and I don't remember them tbh, sorry!
Thanks for the clarification! Regarding the model lineage, I just want to confirm the high-level fine-tuning path for my research. Was it:
A) Parallel: Base -> V1, Base -> V2, Base -> V3 (independent runs)
B) Sequential: Base -> V1 -> V2 -> V3 (iterative fine-tuning)
Thanks again for your time!