---
base_model:
- Qwen3-4B
language:
- en
- zh
license: cc-by-nc-nd-4.0
pipeline_tag: text-generation
library_name: transformers
tags:
- privacy
- privacy-detection
- memory
- personalized-memory
- memory-system
- memory-management
- agent
- agent-memory
- information-security
- information-extraction
- edge-cloud
inference: false
---
# 🛡️ MemPrivacy-4B-RL
MemPrivacy-4B-RL is a lightweight, privacy-preserving model developed from the Qwen3-4B base model and further optimized through reinforcement learning. It was introduced in the paper [MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents](https://huggingface.co/papers/2605.09530).
It is designed specifically for personalized memory management in edge-cloud agents, enabling more reliable, adaptive, and privacy-aware memory operations. The model serves as the core local extraction engine of the **MemPrivacy framework**: instead of relying on aggressive masking that destroys task-relevant semantics, it accurately identifies privacy-sensitive spans on edge devices, classifies them under a four-level privacy taxonomy, and replaces them with semantically structured, type-aware placeholders before any data is transmitted to the cloud.
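To make the placeholder idea concrete, here is a minimal sketch of the edge-side substitution step. The `<Privacy Type>` placeholder format used below is an illustrative assumption, not necessarily the exact format used by the framework:
```py
def mask_with_placeholders(text: str, instances: list) -> str:
    """Replace extracted privacy spans with type-aware placeholders.

    `instances` is the model's output: dicts with `original_text`,
    `privacy_type`, and `privacy_level`. The `<...>` placeholder format
    is illustrative only.
    """
    for item in instances:
        text = text.replace(item["original_text"], f"<{item['privacy_type']}>")
    return text

# e.g. "my mobile number is 13800138000" -> "my mobile number is <Phone Number>"
```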
---
## ✨ Key Features & Capabilities
* **High-Precision Privacy Extraction**: Achieves state-of-the-art performance in privacy information extraction, substantially surpassing strong general-purpose reasoning models like GPT-5.2 and Gemini-3.1-Pro.
* **Four-Level Privacy Taxonomy (PL1-PL4)**: Capable of identifying and classifying privacy-relevant content based on identifiability, expected harm, and operational exploitability.
* **Semantic Utility Preservation**: Typed placeholders decouple privacy protection from semantic destruction, so cloud agents retain the relational and semantic cues required for effective memory formation and retrieval.
* **Edge-Optimized Efficiency**: Designed for resource-constrained local deployment, maintaining high accuracy while significantly reducing inference latency compared to massive general-purpose LLMs.
---
## 🚀 Usage Example
The model accepts conversational text alongside basic user identifiers and extracts a structured list of privacy instances, detailing the original text, the specific privacy type, and its corresponding privacy level.
**Input:**
```text
User Name: Zhang San
Dialogue Text: Hello, my name is Zhang San, and my mobile number is 13800138000. I've been having insomnia recently, and the doctor diagnosed me with mild depression. Here is a photo of my prescription. Also, I just received a verification code 89757, please fill it in for me. By the way, I like spicy food and I speak quite directly.
```
**Output:**
```json
[
  {
    "original_text": "Zhang San",
    "privacy_type": "Real Name",
    "privacy_level": "PL2"
  },
  {
    "original_text": "13800138000",
    "privacy_type": "Phone Number",
    "privacy_level": "PL2"
  },
  {
    "original_text": "mild depression",
    "privacy_type": "Medical Health",
    "privacy_level": "PL3"
  },
  {
    "original_text": "89757",
    "privacy_type": "Verification Code",
    "privacy_level": "PL4"
  }
]
```
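Since the response is a plain JSON array, downstream edge-side code can parse and sanity-check it before acting on it. Below is a minimal sketch of such a check (the helper name `parse_privacy_instances` and the validation logic are illustrative choices, not part of the released framework):
```py
import json

REQUIRED_KEYS = {"original_text", "privacy_type", "privacy_level"}
ALLOWED_LEVELS = {"PL1", "PL2", "PL3", "PL4"}

def parse_privacy_instances(raw_output: str) -> list:
    """Parse the model's JSON response and verify its basic shape."""
    instances = json.loads(raw_output)
    if not isinstance(instances, list):
        raise ValueError("expected a JSON array of privacy instances")
    for item in instances:
        if not REQUIRED_KEYS <= item.keys():
            raise ValueError(f"missing keys in {item}")
        if item["privacy_level"] not in ALLOWED_LEVELS:
            raise ValueError(f"unknown privacy level in {item}")
    return instances
```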
### 📌 Structured Privacy Extraction with vLLM
This example shows how to use vLLM to perform structured privacy information extraction from user-AI dialogues, constrained by a JSON schema.
```py
import json

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from vllm.sampling_params import StructuredOutputsParams

# JSON schema that constrains the output to a list of privacy instances.
privacy_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "original_text": {"type": "string"},
            "privacy_type": {"type": "string"},
            "privacy_level": {
                "type": "string",
                "enum": ["PL1", "PL2", "PL3", "PL4"]
            }
        },
        "required": ["original_text", "privacy_type", "privacy_level"],
        "additionalProperties": False
    }
}

model_name_or_path = "IAAR-Shanghai/MemPrivacy-4B-RL"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

# Low-temperature sampling with schema-constrained (structured) decoding.
sampling_params = SamplingParams(
    temperature=0.1,
    top_p=0.1,
    repetition_penalty=1.05,
    max_tokens=6144,
    structured_outputs=StructuredOutputsParams(json=privacy_schema)
)

model = LLM(model=model_name_or_path, dtype="float16", gpu_memory_utilization=0.9)

# Example input: the user name and the dialogue turn to screen for privacy.
name = "Zhang San"
current_input = {
    "role": "user",
    "content": "Hello, my name is Zhang San, and my mobile number is 13800138000..."
}

# For full implementation details, please refer to the GitHub repository.
```
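The exact prompt construction (system instructions and how the user name is injected) is defined in the GitHub repository. The sketch below assumes a simple chat prompt built from `name` and `current_input`, and only illustrates how generation and parsing would proceed with the objects defined above:
```py
# Hypothetical prompt assembly; the real instructions live in the official repository.
messages = [
    {"role": "system", "content": f"Extract privacy instances for user {name} and return a JSON array."},
    current_input,
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Schema-constrained generation, then parse the JSON array of privacy instances.
outputs = model.generate([prompt], sampling_params)
instances = json.loads(outputs[0].outputs[0].text)
for item in instances:
    print(item["privacy_level"], item["privacy_type"], "->", item["original_text"])
```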
---
## 📚 Citation
```bibtex
@misc{chen2026memprivacyprivacypreservingpersonalizedmemory,
      title={MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents},
      author={Yining Chen and Jihao Zhao and Bo Tang and Haofen Wang and Yue Zhang and Fei Huang and Feiyu Xiong and Zhiyu Li},
      year={2026},
      eprint={2605.09530},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2605.09530},
}
```
## Disclaimers
This project is intended for **privacy research and evaluation**. Do **not** use it to process real user secrets without proper security controls, threat modeling, and compliance review. Always follow local laws and organizational policies.