Based on the method from *Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch* (arXiv:2311.03099).
This is a merge of pre-trained language models created using mergekit.
This model was merged with the DARE TIES merge method, using aaditya/Llama3-OpenBioLLM-70B as the base model.
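A DARE TIES merge like this is normally driven by a mergekit YAML config. A minimal sketch is below; the second model name, `density`, and `weight` values are illustrative assumptions, not the settings actually used for this card:

```yaml
# Hypothetical mergekit config for a DARE TIES merge.
# The non-base model and its parameters are placeholders.
models:
  - model: aaditya/Llama3-OpenBioLLM-70B
    # the base model needs no parameters block
  - model: some-org/llama3-70b-finetune   # hypothetical donor model
    parameters:
      density: 0.5   # fraction of delta weights kept after DARE drop
      weight: 0.5    # scale of this model's contribution
merge_method: dare_ties
base_model: aaditya/Llama3-OpenBioLLM-70B
dtype: bfloat16
```

Running `mergekit-yaml config.yml ./output-dir` with such a config produces the merged checkpoint.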
| Tasks | Version | Filter | n-shot | Metric | Value | ± | Stderr |
|---|---|---|---|---|---|---|---|
| pubmedqa | 1 | none | 0 | acc | 0.7820 | ± | 0.0185 |
| professional_medicine | 0 | none | 0 | acc | 0.9375 | ± | 0.0147 |
| medical_genetics | 0 | none | 0 | acc | 0.9300 | ± | 0.0256 |
| college_medicine | 0 | none | 0 | acc | 0.8555 | ± | 0.0268 |
| college_biology | 0 | none | 0 | acc | 0.9375 | ± | 0.0202 |
| clinical_knowledge | 0 | none | 0 | acc | 0.9283 | ± | 0.0159 |
| anatomy | 0 | none | 0 | acc | 0.8444 | ± | 0.0313 |
| medqa_4options | Yaml | none | 0 | acc | 0.7777 | ± | 0.0117 |
| | | none | 0 | acc_norm | 0.7777 | ± | 0.0117 |
| medmcqa | Yaml | none | 0 | acc | 0.7423 | ± | 0.0068 |
| | | none | 0 | acc_norm | 0.7423 | ± | 0.0068 |
Average accuracy across the nine tasks: 0.8594
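The average appears to be the unweighted mean of the nine `acc` rows in the table above, which can be checked directly:

```python
# Unweighted mean of the per-task acc values from the results table
accs = [0.7820, 0.9375, 0.9300, 0.8555, 0.9375,
        0.9283, 0.8444, 0.7777, 0.7423]
average = sum(accs) / len(accs)
# average is 0.85946..., which the card reports (truncated) as 0.8594
print(average)
```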
Base model: meta-llama/Meta-Llama-3-70B