Llama3-8B-merge-biomed-wizard (MindNLP Wizard Reproduction)

This is a reproduction of a DARE-TIES merge of meta-llama/Meta-Llama-3-8B-Instruct, NousResearch/Hermes-2-Pro-Llama-3-8B, and aaditya/Llama3-OpenBioLLM-8B.

The overall merge recipe and benchmark setup follow lighteternal/Llama3-merge-biomed-8b, while the actual merge implementation is performed with MindNLP Wizard on MindSpore/Ascend.

Implementation Statement

  • Merge engine: MindNLP Wizard
  • Runtime stack: MindSpore + Ascend
  • Output dtype: bfloat16

Usage

The recommended prompt template remains the Llama3 chat format: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/
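As a quick illustration, a minimal sketch of assembling a single-turn prompt in the Llama3 chat format (in practice, the tokenizer's chat template would normally do this for you; the function name and example strings here are purely illustrative):

```python
def format_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt using the Llama3 chat special tokens."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt(
    "You are a helpful biomedical assistant.",
    "What is the function of hemoglobin?",
)
print(prompt)
```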

Leaderboard Metrics (Open LLM Leaderboard style)

| Task | Metric | Ours (Wizard, %) | Llama3-8B-Instruct (%) | OpenBioLLM-8B (%) |
|---|---|---|---|---|
| ARC Challenge | Accuracy | 59.73 | 57.17 | 55.38 |
| ARC Challenge | Normalized Accuracy | 64.59 | 60.75 | 58.62 |
| HellaSwag | Accuracy | 62.26 | 62.59 | 61.83 |
| HellaSwag | Normalized Accuracy | 81.35 | 81.53 | 80.76 |
| Winogrande | Accuracy | 76.01 | 74.51 | 70.88 |
| GSM8K | Accuracy | 70.81 | 68.69 | 10.15 |
| MMLU-Anatomy | Accuracy | 71.11 | 72.59 | 69.62 |
| MMLU-Clinical Knowledge | Accuracy | 77.74 | 77.83 | 60.38 |
| MMLU-College Biology | Accuracy | 80.56 | 81.94 | 79.86 |
| MMLU-College Medicine | Accuracy | 68.21 | 63.58 | 70.52 |
| MMLU-Medical Genetics | Accuracy | 82.00 | 80.00 | 80.00 |
| MMLU-Professional Medicine | Accuracy | 77.57 | 71.69 | 77.94 |

Merge Details

Merge Method

This model was merged using the DARE-TIES method, with meta-llama/Meta-Llama-3-8B-Instruct as the base model.
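To make the method concrete, here is an illustrative pure-Python sketch of the two stages applied to toy parameter deltas (donor weights minus base weights): DARE's drop-and-rescale, followed by a TIES-style per-parameter sign election. This is a simplified toy model of the technique, not the Wizard implementation; all names and values are illustrative.

```python
import random

def dare_drop(delta, density, rng):
    # DARE: keep each delta entry with probability `density`,
    # rescaling survivors by 1/density so the expected value is unchanged.
    return [d / density if rng.random() < density else 0.0 for d in delta]

def ties_merge(deltas, weights):
    # TIES-style sign election per parameter: the elected sign is the
    # sign of the weighted sum; entries disagreeing with it are dropped
    # before the remaining contributions are combined.
    merged = []
    for column in zip(*deltas):
        scaled = [w * d for w, d in zip(weights, column)]
        elected = 1.0 if sum(scaled) >= 0 else -1.0
        merged.append(sum(s for s in scaled if s * elected > 0))
    return merged

rng = random.Random(0)
base = [0.0] * 4  # stand-in for base-model parameters
deltas = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]  # donor - base
densities, weights = [0.60, 0.55, 0.55], [0.5, 0.1, 0.4]
dropped = [dare_drop(d, p, rng) for d, p in zip(deltas, densities)]
merged = [b + m for b, m in zip(base, ties_merge(dropped, weights))]
print(merged)
```

The densities and weights mirror the YAML recipe below; real merges apply the same logic tensor-by-tensor across the full checkpoint.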

Models Merged

The following donor models are included in the merge:

  • NousResearch/Hermes-2-Pro-Llama-3-8B
  • aaditya/Llama3-OpenBioLLM-8B

Configuration

The following YAML configuration is used:

models:
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    # Base model providing a general foundation without specific parameters

  - model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.5

  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.55
      weight: 0.1

  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.55
      weight: 0.4

merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B-Instruct
parameters:
  int8_mask: true
dtype: bfloat16
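As a sanity check on the recipe, the donor weights sum to 1.0, so the merged deltas form a convex combination. A minimal sketch, with the YAML mirrored as a plain Python dict (the Wizard configuration API itself is not shown here):

```python
# The YAML recipe above, mirrored as a plain Python dict for checking.
config = {
    "merge_method": "dare_ties",
    "base_model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "models": [
        {"model": "meta-llama/Meta-Llama-3-8B-Instruct",
         "parameters": {"density": 0.60, "weight": 0.5}},
        {"model": "NousResearch/Hermes-2-Pro-Llama-3-8B",
         "parameters": {"density": 0.55, "weight": 0.1}},
        {"model": "aaditya/Llama3-OpenBioLLM-8B",
         "parameters": {"density": 0.55, "weight": 0.4}},
    ],
}

total = sum(m["parameters"]["weight"] for m in config["models"])
assert abs(total - 1.0) < 1e-9  # weights form a convex combination
print(total)
```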

Reproducibility Notes

  • Few-shot settings:
    • ARC Challenge: 25-shot
    • HellaSwag: 10-shot
    • Winogrande: 5-shot
    • GSM8K: 5-shot
    • MMLU-* subsets: 5-shot
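The few-shot counts above mean each evaluation question is prefixed with k worked examples before the model answers. A generic sketch of building such a k-shot prompt (the function name and Q/A layout are illustrative, not the evaluation harness's actual API):

```python
def build_fewshot_prompt(examples, question, k):
    # Prefix the target question with the first k (question, answer)
    # exemplars, mimicking the standard k-shot evaluation setup.
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples[:k])
    return f"{shots}\n\nQ: {question}\nA:"

exemplars = [("2 + 2 = ?", "4"), ("3 * 3 = ?", "9")]
print(build_fewshot_prompt(exemplars, "5 - 1 = ?", k=2))
```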

Environment (Inference / Evaluation)

  • Accelerator: Ascend 910B2
  • MindSpore: 2.7.1
