# hermite_sakana-3models-7b-hermite-optimal

A model merge using λ values optimized via Hermite interpolation.

## Merge Configuration
| Parameter | Value |
|---|---|
| Method | Hermite interpolation (Phase 2 optimized) |
| λ | [0.266233, 0.422976, 0.310791] |
| dtype | torch.float16 |
- Model 0 (augmxnt/shisa-gamma-7b-v1): λ = 0.266233
- Model 1 (WizardLMTeam/WizardMath-7B-V1.1): λ = 0.422976
- Model 2 (GAIR/Abel-7B-002): λ = 0.310791
## Tokenizer
Union tokenizer (mergekit-style): vocab size = 32032
## Formula
θ* = Σ_k λ_k θ_k
The mixing weights λ were optimized by minimizing a Hermite polynomial approximation of the loss function (see Phase 2).
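A minimal sketch of the merge formula above, assuming each model is available as a PyTorch state dict; `merge_state_dicts` is a hypothetical helper for illustration, not the actual tooling (the card indicates a mergekit-style pipeline) used to produce this model:

```python
import torch

def merge_state_dicts(state_dicts, lambdas):
    """Compute theta* = sum_k lambda_k * theta_k over matching parameters.

    Accumulates in float32 and casts the result to float16, matching the
    dtype listed in the Merge Configuration table.
    """
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(
            lam * sd[name].float() for lam, sd in zip(lambdas, state_dicts)
        ).to(torch.float16)
    return merged

# Toy demonstration with single-tensor "models" and the optimized weights.
lambdas = [0.266233, 0.422976, 0.310791]
sds = [{"w": torch.full((2, 2), float(i + 1))} for i in range(3)]
merged = merge_state_dicts(sds, lambdas)
```

Note that the optimized weights sum to 1.0, so the merge is a convex combination of the three parameter sets.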