metadata

title: README
emoji: 👁
colorFrom: purple
colorTo: gray
sdk: static
pinned: false

🩹 MedInjection-FR

A French biomedical instruction dataset and model suite for studying how data provenance (native, synthetic, or translated) affects the instruction tuning of LLMs.

📊 Dataset Stats

Total size: 571,436 instruction–response pairs

Components:

Native: 77,247
Synthetic: 76,506
Translated: 417,674
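
To make the balance between the three provenance types easier to see, the counts above can be turned into shares. This is a minimal sketch, not part of the released code: the dictionary keys and variable names are illustrative, and the shares are computed against the sum of the three listed components.

```python
# Counts copied from the component list above; the key names are illustrative.
components = {
    "native": 77_247,
    "synthetic": 76_506,
    "translated": 417_674,
}

# Shares are taken relative to the sum of the three listed components.
component_sum = sum(components.values())
shares = {name: count / component_sum for name, count in components.items()}

# Print one row per provenance type: name, raw count, percentage share.
for name, count in components.items():
    print(f"{name:<10} {count:>8,} {shares[name]:6.1%}")
```

Under these counts, translated data makes up roughly 73% of the pairs, with native and synthetic data at about 13% each.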

Tasks:

MCQU: multiple-choice questions with a single correct answer
MCQ: multiple-choice questions with multiple correct answers
OEQ: open-ended questions

Paper

@misc{belmadani2026medinjectionfrexploringrolenative,
      title={MedInjection-FR: Exploring the Role of Native, Synthetic, and Translated Data in Biomedical Instruction Tuning}, 
      author={Ikram Belmadani and Oumaima El Khettari and Pacôme Constant dit Beaufils and Benoit Favre and Richard Dufour},
      year={2026},
      eprint={2603.06905},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.06905}, 
}