
---
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct/blob/main/LICENSE
language:
  - en
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-7B
tags:
  - chat
---

# MemAgent (`agentic/memagent/`)

Code: https://github.com/LMIS-ORG/slime-agentic

Reproduces the core idea of MemAgent: compressing arbitrarily long documents into a fixed-size recurrent memory via a chunk-by-chunk LLM update loop, then answering questions from memory alone. RL (GRPO) is applied to all memory-update turns using a Multi-Conversation training objective, so the model learns to retain what matters across chunks without ever seeing the full context at once.

## Architecture

```
Input: question + long document
  │
  ▼
memory = "No previous memory"
  │
  └─► for chunk in split(document, chunk_tokens):
        │
        └─ LLM(problem, memory, chunk) → updated memory   (loss_mask=1)
  │
  ▼
LLM(problem, memory) → final answer in \boxed{}           (loss_mask=0)
  │
  ▼
Reward: exact-match / F1 against ground truth
        (distributed evenly across all memory-update turns)
```
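The inference loop above can be sketched in a few lines of Python. This is a minimal illustration only: the `llm` callable, the prompt templates, and the `chunk_tokens` default are assumptions for readability, not the actual interface in `agentic/memagent/`.

```python
from typing import Callable, List


def split_into_chunks(tokens: List[int], chunk_tokens: int) -> List[List[int]]:
    """Split a token list into fixed-size chunks (last chunk may be shorter)."""
    return [tokens[i:i + chunk_tokens] for i in range(0, len(tokens), chunk_tokens)]


def answer_long_document(llm: Callable[[str], str],
                         question: str,
                         document_tokens: List[int],
                         decode: Callable[[List[int]], str],
                         chunk_tokens: int = 4096) -> str:
    """Compress a long document into a fixed-size memory, then answer from it."""
    memory = "No previous memory"
    for chunk in split_into_chunks(document_tokens, chunk_tokens):
        # Each update turn sees only (question, memory, chunk) -- never the
        # full document -- and must overwrite memory with what matters.
        memory = llm(
            f"Question: {question}\nMemory: {memory}\n"
            f"Chunk: {decode(chunk)}\nUpdate the memory."
        )
    # The final answer is produced from the memory alone.
    return llm(f"Question: {question}\nMemory: {memory}\nAnswer in \\boxed{{}}.")
```

Because the memory has a fixed token budget, total LLM input per turn is bounded regardless of document length, which is what lets the same 7B model scale to 448K-token contexts.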

Each memory-update turn is an independent training sequence. The reward is evenly amortised across all turns in the conversation (via `custom_convert`), matching the Multi-Conversation RL objective in the MemAgent paper.
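A hedged sketch of how that conversion might look, assuming "evenly amortised" means each memory-update turn receives an equal share of the trajectory-level reward. All names here (`Turn`, `Sample`, `convert_trajectory`) are illustrative stand-ins for the actual `custom_convert` hook, not its real signature.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    prompt: str
    response: str
    loss_mask: int  # 1 for memory-update turns, 0 for the final answer turn


@dataclass
class Sample:
    prompt: str
    response: str
    reward: float


def convert_trajectory(turns: List[Turn], reward: float) -> List[Sample]:
    """Split one multi-turn conversation into independent GRPO samples.

    Only turns with loss_mask == 1 (memory updates) become training
    sequences; the trajectory-level reward is divided evenly among them.
    The final-answer turn (loss_mask == 0) contributes no gradient.
    """
    update_turns = [t for t in turns if t.loss_mask == 1]
    if not update_turns:
        return []
    per_turn_reward = reward / len(update_turns)
    return [Sample(t.prompt, t.response, per_turn_reward) for t in update_turns]
```

Treating each turn as its own sequence keeps every training sample within the fixed per-turn context budget, so RL never needs the full document in one window.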

## Results

Evaluated on RULER-HQA across context lengths from 7K to 448K tokens (5 runs, best score reported):

| Model | 7K | 14K | 28K | 56K | 112K | 224K | 448K |
|---|---|---|---|---|---|---|---|
| MemAgent (ours) | 78.12 | 76.56 | 75.78 | 74.22 | 77.34 | 72.66 | 69.53 |
| QwenLong-L1-32B | 72.66 | 75.00 | 72.66 | 60.94 | 31.25 | 17.19 | 13.28 |
| Qwen2.5-Instruct-14B-1M | 60.16 | 60.94 | 50.00 | 57.03 | 50.00 | 37.50 | 8.59 |
| Qwen2.5-Instruct-7B-1M | 61.72 | 56.25 | 53.91 | 55.47 | 51.56 | 33.59 | 12.50 |
| DS-Distill-Qwen-32B | 70.31 | 66.41 | 65.62 | 46.88 | 23.44 | 13.28 | 7.81 |
| DS-Distill-Qwen-14B | 64.06 | 64.84 | 57.03 | 40.62 | 14.84 | 8.59 | 3.12 |
| DS-Distill-Qwen-7B | 30.47 | 12.50 | 3.12 | 0.00 | 0.00 | 0.78 | 0.00 |

MemAgent (ours) is trained on a 7B base model and consistently outperforms all baselines, including much larger models, across all context lengths.

Model size: 8B parameters (BF16, Safetensors)
