Model Stock: All we need is just a few fine-tuned models (arXiv:2403.19522)
Part of a multi-merge experiment. The idea is to create three individual models:
The three models above will then be combined into:
I will be using different merge methods for these merges in an attempt to find the best combinations. The Alpha, Beta and Delta tags you will see on each model represent the different merge methods.
This is a merge of pre-trained language models created using mergekit.
This model was merged using the Model Stock merge method using meta-llama/Llama-3.3-70B-Instruct as a base.
The following models were included in the merge:
TareksLab/Pathos-Alpha-LLaMa-70B
TareksLab/Ethos-Alpha-LLaMa-70B
TareksLab/Logos-Alpha-LLaMa-70B
The following YAML configuration was used to produce this model:
models:
  - model: TareksLab/Pathos-Alpha-LLaMa-70B
  - model: TareksLab/Ethos-Alpha-LLaMa-70B
  - model: TareksLab/Logos-Alpha-LLaMa-70B
base_model: meta-llama/Llama-3.3-70B-Instruct
merge_method: model_stock
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: TareksLab/Pathos-Alpha-LLaMa-70B
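
For readers unfamiliar with the `model_stock` method named in the configuration above, here is a minimal sketch of the interpolation rule described in the Model Stock paper, applied to a single flattened weight tensor. It is an illustration only and not mergekit's actual implementation: the function name `model_stock_merge` is hypothetical, plain Python lists stand in for tensors, and the angle between fine-tuned models is assumed to be estimated as the mean pairwise cosine similarity of their offsets from the base.

```python
import math


def model_stock_merge(base, finetuned):
    """Merge one flattened weight tensor with the Model Stock rule (illustrative sketch)."""
    n = len(finetuned)
    if n < 2:
        raise ValueError("need at least two fine-tuned models")

    # Offset of each fine-tuned model from the base weights.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm > 0 else 0.0

    # Assumption: estimate cos(theta) as the mean over all pairs of offsets.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    cos_theta = sum(cosine(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)

    # Interpolation ratio from the paper: t = N cos(theta) / (1 + (N - 1) cos(theta)).
    t = n * cos_theta / (1 + (n - 1) * cos_theta)

    # Pull the average of the fine-tuned weights back toward the base by (1 - t).
    avg = [sum(col) / n for col in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]
```

When the fine-tuned offsets point in the same direction, `t` approaches 1 and the result is the plain average of the fine-tuned weights; when they are orthogonal, `t` is 0 and the result falls back to the base weights. In this merge the base would be meta-llama/Llama-3.3-70B-Instruct and the three fine-tuned inputs the Pathos, Ethos and Logos Alpha models.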