---
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct/blob/main/LICENSE
language:
- en
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-7B
tags:
- chat
---
# MemAgent (`agentic/memagent/`)

Code: https://github.com/LMIS-ORG/slime-agentic
Reproduces the core idea of MemAgent: compressing arbitrarily long documents into a fixed-size recurrent memory via a chunk-by-chunk LLM update loop, then answering questions from memory alone. RL (GRPO) is applied to all memory-update turns using a Multi-Conversation training objective, so the model learns to retain what matters across chunks without ever seeing the full context at once.
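The chunk-by-chunk update loop described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: `call_llm` is a hypothetical stand-in for a chat-completion client, the prompts are illustrative rather than the trained templates, and the whitespace splitter is a proxy for a real tokenizer-based one.

```python
# Minimal sketch of the MemAgent chunked-memory loop (inference side).
# Assumptions: `call_llm(prompt) -> str` is supplied by the caller;
# prompt wording and chunking are illustrative only.

def split_into_chunks(text: str, chunk_tokens: int = 4096) -> list[str]:
    # Whitespace-token proxy for a tokenizer-based splitter.
    words = text.split()
    return [" ".join(words[i:i + chunk_tokens])
            for i in range(0, len(words), chunk_tokens)]

def answer_long_document(question: str, document: str, call_llm) -> str:
    memory = "No previous memory"
    for chunk in split_into_chunks(document):
        # Each turn overwrites the fixed-size memory; the full
        # document is never in context at once.
        memory = call_llm(
            f"Question: {question}\nMemory: {memory}\nChunk: {chunk}\n"
            "Update the memory with information relevant to the question."
        )
    # The final answer is produced from the memory alone.
    return call_llm(
        f"Question: {question}\nMemory: {memory}\n"
        "Answer the question, putting the final answer in \\boxed{}."
    )
```

Because memory size is fixed, per-step context cost is constant regardless of document length; only the number of update turns grows.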
## Architecture

```
Input: question + long document
    │
    ▼
memory = "No previous memory"
    │
┌──►for chunk in split(document, chunk_tokens):
│       │
└────── LLM(problem, memory, chunk) → updated memory   (loss_mask=1)
    │
    ▼
LLM(problem, memory) → final answer in \boxed{}        (loss_mask=0)
    │
    ▼
Reward: exact-match / F1 against ground truth
        (distributed evenly across all memory-update turns)
```
Each memory-update turn becomes an independent training sequence. The reward is amortised evenly across all update turns in the conversation (via `custom_convert`), matching the Multi-Conv RL objective in the MemAgent paper.
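The reward split above can be sketched in a few lines. This is an assumption-laden illustration of the Multi-Conv objective, not the actual `custom_convert` code; the `Turn` record and `to_training_samples` name are hypothetical.

```python
# Sketch of the Multi-Conv reward split: a trajectory-level reward is
# divided evenly over all memory-update turns (loss_mask=1), each of
# which is treated as an independent training sample. Names here are
# illustrative, not the repository's `custom_convert` implementation.

from dataclasses import dataclass

@dataclass
class Turn:
    tokens: list          # prompt + response tokens for this turn
    loss_mask: int        # 1 for memory-update turns, 0 for the answer turn

def to_training_samples(turns: list, reward: float):
    """Return (tokens, per_turn_reward) for every loss-masked turn."""
    update_turns = [t for t in turns if t.loss_mask == 1]
    per_turn = reward / len(update_turns) if update_turns else 0.0
    return [(t.tokens, per_turn) for t in update_turns]
```

Note that the final answer turn (loss_mask=0) contributes no training sample: it only determines the reward that is shared among the update turns.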
## Results
Evaluated on RULER-HQA across context lengths from 7K to 448K tokens (5 runs, best score reported):
| Model | 7K | 14K | 28K | 56K | 112K | 224K | 448K |
|---|---|---|---|---|---|---|---|
| MemAgent (ours) | 78.12 | 76.56 | 75.78 | 74.22 | 77.34 | 72.66 | 69.53 |
| QwenLong-L1-32B | 72.66 | 75.00 | 72.66 | 60.94 | 31.25 | 17.19 | 13.28 |
| Qwen2.5-Instruct-14B-1M | 60.16 | 60.94 | 50.00 | 57.03 | 50.00 | 37.50 | 8.59 |
| Qwen2.5-Instruct-7B-1M | 61.72 | 56.25 | 53.91 | 55.47 | 51.56 | 33.59 | 12.50 |
| DS-Distill-Qwen-32B | 70.31 | 66.41 | 65.62 | 46.88 | 23.44 | 13.28 | 7.81 |
| DS-Distill-Qwen-14B | 64.06 | 64.84 | 57.03 | 40.62 | 14.84 | 8.59 | 3.12 |
| DS-Distill-Qwen-7B | 30.47 | 12.50 | 3.12 | 0.00 | 0.00 | 0.78 | 0.00 |
MemAgent (ours) is trained from a 7B base model and consistently outperforms all baselines, including much larger models, at every context length.