French prompts datasets Collection French prompt datasets developed when I worked at CATIE (https://hf.co/CATIE-AQ). Over 90,000 downloads. • 5 items • Updated Feb 8
French embedding datasets Collection French datasets to train or evaluate embedding models. • 3 items • Updated Feb 8
French NER Collection NER models & datasets developed when I worked at CATIE (https://hf.co/CATIE-AQ). Over 185,000 downloads. • 11 items • Updated Feb 8
French QA Collection QA models & datasets developed when I worked at CATIE (https://hf.co/CATIE-AQ). Over 160,000 downloads. • 6 items • Updated Feb 8
French VQA datasets Collection VQA datasets I cleaned, each example consisting of an image, a question and an answer. Can be used to train VLMs. • 18 items • Updated Feb 8
French caption datasets Collection Datasets I cleaned, each example consisting of an image, a prompt question (like "describe this image") and an answer. Can be used to train VLMs. • 9 items • Updated Feb 8
French OCR datasets Collection Datasets I cleaned, each example consisting of an image, a prompt question (like "transcribe the text in this image") and an answer. Can be used to train VLMs. • 3 items • Updated Feb 8
French retriever datasets Collection Datasets I cleaned, each example consisting of an image and a question. Can be used to train visual retrievers (ColPali and co.). • 4 items • Updated Feb 8
French table-to-text datasets Collection In 2021, before the release of LoRA, I was interested in prefix-tuning, which I wanted to apply to French, so I had to translate table-to-text data. • 3 items • Updated Feb 8
French audio datasets (pretraining) Collection Around 117K hours of French audio for research purposes. • 51 items • Updated Mar 2 • 1