FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents

FoldAct is a framework for training long-horizon reinforcement learning (RL) agents with context folding. This model is a 7B-parameter checkpoint fine-tuned from Qwen2.5-7B-Instruct using the FoldAct framework.

Overview

Long-horizon RL for large language models faces scalability challenges due to unbounded context growth. FoldAct addresses these challenges by compressing interaction history through context folding while maintaining training stability. The framework introduces three key innovations:

  1. Separated Loss Computation: Provides independent gradient signals for summary and action tokens to prevent gradient dilution.
  2. Full Context Consistency Loss: Reduces distribution shifts between folded and full contexts to maintain policy stability.
  3. Selective Segment Training: Improves training efficiency, achieving up to a 5.19x speedup in long-horizon tasks.
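To make the first two ideas concrete, here is a minimal sketch of how separated loss computation and a consistency penalty could be wired together. This is illustrative only: the function names, the masks, and the MSE-style consistency term are assumptions for exposition, not the actual FoldAct implementation (see the official repository for that).

```python
import numpy as np

def masked_nll(logp, targets, mask):
    # Negative log-likelihood averaged over masked tokens only, so each
    # token group gets its own normalization (independent gradient scale).
    tok_nll = -logp[np.arange(len(targets)), targets]
    return (tok_nll * mask).sum() / max(mask.sum(), 1)

def foldact_loss(logp_folded, logp_full, targets,
                 summary_mask, action_mask, lam=0.1):
    # Separated losses: summary and action tokens are each normalized
    # over their own token counts, so the (usually much longer) action
    # span cannot dilute the summary gradient, or vice versa.
    l_summary = masked_nll(logp_folded, targets, summary_mask)
    l_action = masked_nll(logp_folded, targets, action_mask)
    # Consistency term (assumed MSE here): penalize divergence between
    # token log-probs under the folded context and the full context.
    consistency = np.mean((logp_folded - logp_full) ** 2)
    return l_summary + l_action + lam * consistency

# Toy example: 6 tokens, vocab of 4; first 2 tokens are "summary".
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 4))
logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
targets = np.array([0, 1, 2, 3, 0, 1])
summary_mask = np.array([1, 1, 0, 0, 0, 0], dtype=float)
action_mask = np.array([0, 0, 1, 1, 1, 1], dtype=float)
loss = foldact_loss(logp, logp.copy(), targets, summary_mask, action_mask)
```

When the folded-context and full-context log-probs agree exactly, the consistency term vanishes and only the two supervised terms remain.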

Training

Detailed instructions for training agents using the FoldAct framework can be found in the official GitHub repository.
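As an illustration of the selective-segment idea mentioned above, training might subsample a fixed number of trajectory segments per update instead of backpropagating through all of them. The sketch below is hypothetical (the actual selection policy is defined in the official repository); it shows only the subsampling skeleton from which the efficiency gain would come.

```python
import random

def select_segments(segments, k, seed=None):
    # Train on at most k of the trajectory's segments per update;
    # the remaining segments are skipped for this step, which is
    # where a large fraction of the compute savings comes from.
    rng = random.Random(seed)
    if len(segments) <= k:
        return list(segments)
    return rng.sample(segments, k)

# Example: a 20-segment trajectory, training on 4 segments per step.
segments = list(range(20))
picked = select_segments(segments, 4, seed=0)
```

In the extreme, training on 4 of 20 segments per step touches a fifth of the tokens, which is the rough mechanism behind speedups of the magnitude reported above.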

Citation

@article{foldact2025,
  title={FoldAct: Efficient and Stable Context Folding for Long-Horizon Search Agents},
  author={Anonymous},
  journal={arXiv preprint arXiv:2512.22733},
  year={2025}
}