Harnessing Optimization Dynamics for Curvature-Informed Model Merging
Paper: arXiv:2509.11167
Llama-3.1-8B SFT checkpoints for mathematical reasoning, released as artifacts of https://arxiv.org/abs/2509.11167.
This repository includes export files for state averaging and other advanced techniques.
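As a rough illustration of what averaging over saved states can look like, the sketch below computes an element-wise (optionally weighted) mean across several checkpoints. It assumes that "state averaging" means averaging matching parameter entries across checkpoints, which this repository does not spell out. The name `average_states` and the use of plain Python lists in place of real tensors are hypothetical choices made for the example; none of this is the paper's method or this repository's file layout.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical stand-in for a tensor state dict: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def average_states(
    states: Sequence[StateDict],
    weights: Optional[Sequence[float]] = None,
) -> StateDict:
    """Element-wise weighted mean of matching entries across checkpoints."""
    if not states:
        raise ValueError("need at least one state dict")
    if weights is None:
        # Default to a uniform average.
        weights = [1.0] * len(states)
    if len(weights) != len(states):
        raise ValueError("one weight per state dict is required")
    total = float(sum(weights))
    if total == 0.0:
        raise ValueError("weights must not sum to zero")

    # All checkpoints must expose the same parameter names.
    keys = set(states[0])
    for state in states[1:]:
        if set(state) != keys:
            raise ValueError("state dicts have mismatched parameter names")

    merged: StateDict = {}
    for name in states[0]:
        length = len(states[0][name])
        # All checkpoints must agree on the size of each parameter.
        if any(len(state[name]) != length for state in states):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            sum(w * state[name][i] for w, state in zip(weights, states)) / total
            for i in range(length)
        ]
    return merged


if __name__ == "__main__":
    ckpt_a = {"layer.weight": [1.0, 3.0]}
    ckpt_b = {"layer.weight": [3.0, 5.0]}
    print(average_states([ckpt_a, ckpt_b]))
```

With real checkpoints the same loop would run over tensors loaded from disk rather than Python lists, and the per-checkpoint weights could be replaced by whatever weighting a given merging method prescribes.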
Base model: meta-llama/Llama-3.1-8B