FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents

FoldAct is a framework for training long-horizon reinforcement learning (RL) agents with context folding. This model is a 7B-parameter checkpoint fine-tuned from Qwen2.5-7B-Instruct using the FoldAct framework.

Overview

Long-horizon RL for large language models faces scalability challenges due to unbounded context growth. FoldAct addresses these challenges by compressing interaction history through context folding while maintaining training stability. The framework introduces three key innovations:

  1. Separated Loss Computation: Provides independent gradient signals for summary and action tokens to prevent gradient dilution.
  2. Full Context Consistency Loss: Reduces distribution shifts between folded and full contexts to maintain policy stability.
  3. Selective Segment Training: Improves training efficiency, achieving up to a 5.19x speedup in long-horizon tasks.
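To make the first two ideas concrete, here is a minimal sketch of how separated loss computation and a consistency penalty could be wired together. This is illustrative only: the function names, the masks, and the MSE-style consistency term are assumptions for exposition, not the actual FoldAct implementation (see the official repository for that).

```python
import numpy as np

def masked_nll(logp, targets, mask):
    # Negative log-likelihood averaged over masked tokens only, so each
    # token group gets its own normalization (independent gradient scale).
    tok_nll = -logp[np.arange(len(targets)), targets]
    return (tok_nll * mask).sum() / max(mask.sum(), 1)

def foldact_loss(logp_folded, logp_full, targets,
                 summary_mask, action_mask, lam=0.1):
    # Separated losses: summary and action tokens are each normalized
    # over their own token counts, so the (usually much longer) action
    # span cannot dilute the summary gradient, or vice versa.
    l_summary = masked_nll(logp_folded, targets, summary_mask)
    l_action = masked_nll(logp_folded, targets, action_mask)
    # Consistency term (assumed MSE here): penalize divergence between
    # token log-probs under the folded context and the full context.
    consistency = np.mean((logp_folded - logp_full) ** 2)
    return l_summary + l_action + lam * consistency

# Toy example: 6 tokens, vocab of 4; first 2 tokens are "summary".
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 4))
logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
targets = np.array([0, 1, 2, 3, 0, 1])
summary_mask = np.array([1, 1, 0, 0, 0, 0], dtype=float)
action_mask = np.array([0, 0, 1, 1, 1, 1], dtype=float)
loss = foldact_loss(logp, logp.copy(), targets, summary_mask, action_mask)
```

When the folded-context and full-context log-probs agree exactly, the consistency term vanishes and only the two supervised terms remain.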

Training

Detailed instructions for training agents using the FoldAct framework can be found in the official GitHub repository.
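As an illustration of the selective-segment idea mentioned above, training might subsample a fixed number of trajectory segments per update instead of backpropagating through all of them. The sketch below is hypothetical (the actual selection policy is defined in the official repository); it shows only the subsampling skeleton from which the efficiency gain would come.

```python
import random

def select_segments(segments, k, seed=None):
    # Train on at most k of the trajectory's segments per update;
    # the remaining segments are skipped for this step, which is
    # where a large fraction of the compute savings comes from.
    rng = random.Random(seed)
    if len(segments) <= k:
        return list(segments)
    return rng.sample(segments, k)

# Example: a 20-segment trajectory, training on 4 segments per step.
segments = list(range(20))
picked = select_segments(segments, 4, seed=0)
```

In the extreme, training on 4 of 20 segments per step touches a fifth of the tokens, which is the rough mechanism behind speedups of the magnitude reported above.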

Citation

@article{foldact2025,
  title={FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents},
  author={Anonymous},
  journal={arXiv preprint arXiv:2512.22733},
  year={2025}
}