Question about finetuning dataset

#1
by xiahao2 - opened

Hello, may I ask what dataset you used for fine-tuning, what its license is, and how much data was used?

Best regards

Hi! I used the FrancophonIA/English-French dataset. I assume the license is CC0: Public Domain, since the dataset was extracted from Kaggle. I used the whole dataset for fine-tuning.
This model was an exercise from Section 7.3 of the Hugging Face LLM Course.

I understand. Thank you very much for your explanation.

xiahao2 changed discussion status to closed
