Text Generation
Transformers
Safetensors
English
Chinese
qwen3
privacy
privacy-detection
memory
personalized-memory
memory-system
memory-management
agent
agent-memory
information-security
information-extraction
edge-cloud
conversational
text-generation-inference
Instructions to use IAAR-Shanghai/MemPrivacy-1.7B-SFT with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use IAAR-Shanghai/MemPrivacy-1.7B-SFT with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="IAAR-Shanghai/MemPrivacy-1.7B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("IAAR-Shanghai/MemPrivacy-1.7B-SFT")
model = AutoModelForCausalLM.from_pretrained("IAAR-Shanghai/MemPrivacy-1.7B-SFT")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use IAAR-Shanghai/MemPrivacy-1.7B-SFT with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "IAAR-Shanghai/MemPrivacy-1.7B-SFT"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "IAAR-Shanghai/MemPrivacy-1.7B-SFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/IAAR-Shanghai/MemPrivacy-1.7B-SFT
- SGLang
How to use IAAR-Shanghai/MemPrivacy-1.7B-SFT with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "IAAR-Shanghai/MemPrivacy-1.7B-SFT" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "IAAR-Shanghai/MemPrivacy-1.7B-SFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "IAAR-Shanghai/MemPrivacy-1.7B-SFT" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "IAAR-Shanghai/MemPrivacy-1.7B-SFT",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use IAAR-Shanghai/MemPrivacy-1.7B-SFT with Docker Model Runner:
docker model run hf.co/IAAR-Shanghai/MemPrivacy-1.7B-SFT
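The curl calls shown above for the vLLM and SGLang servers can also be issued from Python. The following is a minimal sketch using only the standard library, assuming a server is already running locally on one of the ports used above (8000 for vLLM, 30000 for SGLang) and that it returns the usual OpenAI-style `choices[0].message.content` field. The helper names `build_chat_request` and `chat` are hypothetical and not part of any library.

```python
import json
import urllib.request

MODEL_ID = "IAAR-Shanghai/MemPrivacy-1.7B-SFT"


def build_chat_request(base_url, user_message, model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions.

    Hypothetical helper: mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, user_message, timeout=60):
    """Send the request and return the first choice's message text.

    Assumes the server answers in the OpenAI-compatible response format.
    """
    request = build_chat_request(base_url, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server, so it is left commented out):
# print(chat("http://localhost:8000", "What is the capital of France?"))
```

Separating request construction from sending keeps the payload easy to inspect without a live server; switching between vLLM and SGLang only changes the base URL.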
Commit History
Update README.md 443f096 verified
Update README.md badc6b4 verified
Update README.md 09d739e verified
Update README.md 95c0c4f verified
Update README.md 130e047 verified
Update README.md cc546e2 verified
Update README.md 3ab9f05 verified
Update README.md 5a144ad verified
Ding Chen committed on
Update README.md 92e6cba verified
Ding Chen committed on
Update README.md e005e60 verified
Ding Chen committed on
Update README.md ad9a252 verified
Ding Chen committed on
Update README.md bc82671 verified
Update README.md 294e7d9 verified
Ding Chen committed on