SFT-GRPO Data Overlap as a Post-Training Hyperparameter for Autoformalization
Paper β’ 2604.13515 β’ Published
YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
Experimental checkpoint from "Data Overlap as a Post-Training Hyperparameter for Autoformalization." This is the SFT-only variant (Qwen3-8B, thinking disabled) trained on 20K heterogeneous (natural-language, Lean 4) pairs. See the paper repo for details, results, and all artifacts.
This model is part of the experiments in:
SFT-GRPO Data Overlap as a Post-Training Hyperparameter for Autoformalization
Xiaole Su, Kasey Zhang, Andy Lyu
https://arxiv.org/abs/2604.13515