# hermite_sakana-3models-7b-hermite-optimal

A model merge whose mixing weights λ were optimized via Hermite interpolation.

## Merge Configuration

| Parameter | Value |
|---|---|
| Method | Hermite interpolation (Phase 2 optimized) |
| λ | [0.266233, 0.422976, 0.310791] |
| dtype | torch.float16 |

- Model 0 (augmxnt/shisa-gamma-7b-v1): λ = 0.266233
- Model 1 (WizardLMTeam/WizardMath-7B-V1.1): λ = 0.422976
- Model 2 (GAIR/Abel-7B-002): λ = 0.310791
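
The merged checkpoint should load like any other 7B causal LM. A minimal sketch, assuming the transformers library (the repo id and dtype are taken from this card; everything else is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "lejelly/hermite_sakana-3models-7b-hermite-optimal"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # matches the dtype listed above
)
```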

## Tokenizer

Union tokenizer (mergekit-style): vocab size = 32032
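
A union tokenizer merges the vocabularies of the source models. A conceptual sketch, assuming the transformers library; mergekit's actual union logic also reconciles special tokens and remaps token ids, which is omitted here:

```python
from transformers import AutoTokenizer

sources = [
    "augmxnt/shisa-gamma-7b-v1",
    "WizardLMTeam/WizardMath-7B-V1.1",
    "GAIR/Abel-7B-002",
]

# Take the set union of the three source vocabularies.
vocab_union = set()
for name in sources:
    tok = AutoTokenizer.from_pretrained(name)
    vocab_union |= set(tok.get_vocab().keys())

print(len(vocab_union))  # should be close to the 32032 reported above
```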

## Formula

θ* = Σ_k λ_k θ_k

The mixing weights λ were optimized by minimizing the Hermite polynomial approximation of the loss function (see Phase 2).
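
Concretely, the merge is a weighted sum over the three state dicts. A minimal sketch, assuming the transformers library and that all tensors already share shapes; in practice the tokenizer-union step requires resizing the embedding and lm_head matrices first, and loading three 7B models at once needs considerable RAM:

```python
import torch
from transformers import AutoModelForCausalLM

# λ values from the configuration above.
lambdas = {
    "augmxnt/shisa-gamma-7b-v1": 0.266233,
    "WizardLMTeam/WizardMath-7B-V1.1": 0.422976,
    "GAIR/Abel-7B-002": 0.310791,
}

merged = None
for name, lam in lambdas.items():
    # Accumulate in float32 for numerical stability.
    sd = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.float32
    ).state_dict()
    if merged is None:
        merged = {k: lam * v for k, v in sd.items()}
    else:
        for k, v in sd.items():
            merged[k] += lam * v  # θ* = Σ_k λ_k θ_k, tensor by tensor

# Cast to float16 to match the published checkpoint.
merged = {k: v.half() for k, v in merged.items()}
```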

