SATURN: SAT-based Reinforcement Learning to Unleash Language Model Reasoning
Paper: arXiv:2505.16368
Saturn-1.5B and Saturn-7B are part of the SATURN framework, which utilizes Boolean Satisfiability (SAT) problems to continuously improve language model reasoning through a curriculum learning pipeline.
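To make the task concrete, here is a small illustrative sketch (not SATURN's own code) of what a SAT instance looks like: a formula in conjunctive normal form is a list of clauses, each clause a list of non-zero integers where `k` means variable x_k and `-k` its negation (the DIMACS convention). The brute-force solver below is for illustration only; real SAT solvers are far more sophisticated.

```python
from itertools import product

def satisfies(clauses, assignment):
    """Check whether a truth assignment {var: bool} satisfies every clause."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def brute_force_sat(clauses, n_vars):
    """Try all 2^n assignments; return a satisfying one, or None if unsatisfiable."""
    for bits in product([False, True], repeat=n_vars):
        assignment = {i + 1: bits[i] for i in range(n_vars)}
        if satisfies(clauses, assignment):
            return assignment
    return None

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
clauses = [[1, -2], [2, 3], [-1, -3]]
print(brute_force_sat(clauses, 3))
```

Because instance difficulty can be controlled (number of variables, clauses, clause length), SAT problems lend themselves to the curriculum-style training that SATURN uses.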
To use these models, load them with Hugging Face's `transformers` library, as shown below:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pick either checkpoint from the Hugging Face Hub.
model_name = "gtxygyzb/Saturn-7B"  # or "gtxygyzb/Saturn-1.5B"

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
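Once loaded, the models can be queried through the standard `generate` API. The sketch below uses the smaller 1.5B checkpoint and a plain-text prompt; the exact prompt format the Saturn models were trained with is not specified here, so treat the prompt as a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gtxygyzb/Saturn-1.5B"  # smaller checkpoint; "gtxygyzb/Saturn-7B" also works
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Hypothetical prompt; the models' expected prompt format may differ.
prompt = "Is the formula (x1 OR NOT x2) AND (x2 OR x3) satisfiable?"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens, not the echoed prompt.
completion = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(completion)
```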
For more detailed information, please refer to the SATURN paper (arXiv:2505.16368).
If you use Saturn-1.5B or Saturn-7B in your research, please cite our work:
```bibtex
@article{saturn2025,
  author  = {Huanyu Liu and Jia Li and Hao Zhu and Kechi Zhang and Yihong Dong and Ge Li},
  title   = {SATURN: SAT-based Reinforcement Learning to Unleash Language Model Reasoning},
  journal = {CoRR},
  volume  = {abs/2505.16368},
  year    = {2025},
}
```