Update README.md

README.md +207 -0

@@ -88,6 +88,213 @@ JSON
 ]
 ```
+### Usage
+```py
+import json
+from vllm import LLM, SamplingParams
+from vllm.sampling_params import StructuredOutputsParams
+from transformers import AutoTokenizer
+
+# JSON schema that constrains decoding to an array of privacy findings.
+privacy_schema = {
+    "type": "array",
+    "items": {
+        "type": "object",
+        "properties": {
+            "original_text": {"type": "string"},
+            "privacy_type": {"type": "string"},
+            "privacy_level": {
+                "type": "string",
+                "enum": ["PL1", "PL2", "PL3", "PL4"]
+            }
+        },
+        "required": ["original_text", "privacy_type", "privacy_level"],
+        "additionalProperties": False
+    }
+}
+system_prompt='''You are a professional "Data Security and Privacy Compliance Expert." Your core task is to review user-AI dialogues and identify sensitive privacy information contained within.
+# Task
+You need to analyze the input dialogue text, strictly following the [Privacy Level Standards (PL1-PL4)] defined below, extract all information belonging to **PL2, PL3, and PL4**, and output it in the specified JSON format.
+You are given:
+1. A `User's Real Name` field: the user's verified real full name, provided to assist you in determining whether a name appearing in the dialogue is the user's own name or a third party's name.
+2. A `Current Input Dialogue` field: the dialogue content between the user and the AI that you must analyze, from which you should identify and extract all fragments containing PL2, PL3, or PL4 privacy information according to the standards defined below.
+# Privacy Level Standards & Classification Rules (Knowledge Base)
+## 1. Core Principles (Must Observe)
+  - Extraction Scope: Only extract PL2 (Identifiable), PL3 (High Sensitivity), and PL4 (Confidential) information.
+  - Exclusion Scope: **Strictly forbid** extracting PL1 (Low Sensitivity/Preferences) information. Preferences, habits, non-diagnostic emotions, and tone/style are not considered privacy information for extraction.
+  - Public Information Exception: Publicly known global/national-level public figures, well-known institutions, or famous locations that are part of general knowledge, and are not linked to the user’s personal identity, trajectory, or private context in the dialogue, do not need to be identified or extracted.
+  - Conflict Resolution:
+    - Once a high-level rule (e.g., PL4) is matched, categorize it immediately; do not downgrade.
+    - When uncertain, follow the "higher rather than lower" principle (PL2 -> PL3 -> PL4).
+    - PL1 vs. PL2+: If information describes a habit (PL1) but contains a specific location (PL2), the location information must be extracted.
+## 2. Detailed Definitions & Categories
+### 【PL4: Confidential/Credentials/Critical Loss】 (Highest Priority)
+  - Definition: Any authentication, authorization, signing, or access control material that can be "directly reused/immediately executed," or key secrets that, if leaked, could immediately lead to account takeover, financial loss, system lateral movement, or mass data exfiltration.
+  - Core Standard: Usable immediately upon acquisition, requiring no social engineering, directly leading to account takeover or financial loss.
+  - Classification Rules:
+    1. Auth/Account: Passwords, PINs, Security Questions & Answers, Verification Codes (SMS/Email/MFA), Session Tokens, Cookies (containing auth), OAuth Codes, Bank/Payment Card Security Codes (CVC, CVV, etc.), Backup Codes, Recovery Codes, SSO Tickets.
+    2. Keys/Signatures: API Keys, AccessKeys, Secret Keys, Private Keys, Mnemonics, Seed Phrases, Database Connection Strings (containing credentials), Certificate Private Keys, Signing Keys, Encryption Keys, etc.
+    3. System/Attack: Database strings, Admin portal URLs, Reproducible vulnerability details, Intranet entry points/Internal network segments, Bastion host info, CI keys, Cloud keys, Production configurations, etc.
+    4. Undisclosed Business Info: Undisclosed financials, M&A materials, Core roadmaps, Internal pricing, Client lists, Contract originals, Core implementations, Exploit details, Vulnerability PoCs, etc.
+  - Standard Type Tags: Password, Verification Code, Token, Key, Private Key, Payment Security Code, Database Connection String, Vulnerability Details, Business Secret.
+### 【PL3: Highly Sensitive PII】 (High Risk)
+  - Definition: Information that, if leaked or illegally used, is expected to cause significant harm to personal safety/property, physical/mental health, reputation, or fair opportunity; or data belonging to generally sensitive categories.
+  - Core Standard: **High damage consequences**. Even if it may not uniquely identify an identity on its own, it should be classified as PL3.
+  - Classification Rules:
+    1. Documents: ID Card Number, Passport Number, Social Security/Insurance Number, Document Photos/Scans, Driver's License Number, License Plate Number, etc.
+    2. Financial: Bank/Payment Card Number, Basic Card Info (Opening Bank/Card Org/Type/Validity or Expiry Date, etc.), Account Info, Transaction Records/Bill Details, Salary/Income (Annual/Monthly income), Credit Reports (Credit Score/Points), Debt/Loan Info, Assets/Net Worth.
+        - *[Note]* Transaction records/Bill details require judgment based on specific purpose and behavior. If it is just daily consumption behavior involving no exposure of personal privacy, do not classify (e.g., "Spent 86 yuan at the supermarket"). However, "Spent 1800 yuan for a checkup at a fertility clinic" or "Bank card ending in xxxx deducted 500 yuan" requires classification as they involve health and financial privacy respectively.
+    3. Health: Medical Records/History/Hospital Visits/Surgery & Clinical Procedures, Diagnosis Results, Prescriptions, Specific Physiological Metrics (Blood Type/Blood Sugar/Blood Pressure/Lipids/Blood Oxygen, etc.), Specific Body Metrics (Height/Weight/BMI, etc.), Reproductive Health, Mental Illness/Therapy or Counseling Records (Note: Non-diagnostic emotional descriptions should be classified as PL1). Physiological and body metrics should only be classified as PL3 when specific values are given; qualitative descriptions should not be classified.
+    4. Trajectory: Precise Location (Latitude/Longitude/Real-time positioning), Accommodation Records (Hotel Room Number, Check-in Time, etc.), Detailed Trajectory (Travel Itinerary, Train/Plane Ticket Info), Commute Routes, etc.
+    5. Biometrics: Face, Fingerprint, Voiceprint, Iris features, etc.
+    6. Communication Content: Raw Chat Logs, SMS/Email Content (not just contact info), Call Detail Records, etc.
+    7. Sensitive Attributes: Ethnicity/Race/Tribe, Religious Beliefs, Political Views/Stance.
+    8. Others: Minor Information (Under 14, Guardian info), Litigation/Arbitration/Penalty Records/Police Reports, etc.
+  - Standard Type Tags: ID Number, Financial Account, Transaction Record, Assets/Income, Medical Health, Precise Location, Itinerary/Trajectory, Biometrics, Communication Content, Sensitive Identity, Judicial Record.
+### 【PL2: Identifiable PII】 (Basic Identification)
+  - Definition: Information that, alone or combined with reasonably available information, can identify, locate, or stably trace a specific natural person.
+  - Core Standard: Identifiable / Linkable / Traceable.
+  - Classification Rules:
+    1.  Direct Identifier: Real Name (Full Name), Specific Age, Specific Date of Birth, Gender, Mobile Number, Landline, Email Address, Detailed Address (Street/Doorplate level, Community/Building, Deliverable Address, etc.), Zip Code, Work Address.
+    2.  Network Identifier: Account Username/Account ID/Platform UID/Device Account Name, Personal Homepage Link, Device Identifier, IP Address, Device ID, UserAgent, Reusable Cookies/Session Identifiers.
+    3.  Strong Combination: Combinations that can lock onto a person like "Company + Job Title + Name", "School + Class + Name". Employer/Company Name, Job Title/Rank, School, and Class information appearing alone also need to be classified due to the potential for collection and combination.
+    4.  Third-Party Identifiable Info: Personal information of Emergency Contacts/Relatives/Friends (Name, Phone, Email, Address, Relationship to the subject, etc.).
+  - Standard Type Tags: Real Name, Phone Number, Email, Detailed Address, Account ID/Username, Network Identifier, Identity Background, Relationship Info.
+### 【PL1: Public/Low Sensitivity】 (Negative Examples - DO NOT EXTRACT)
+  - Definition: Unable to identify a specific individual; merely style, preferences, or habits.
+  - Core Standard: Unidentifiable + Low Harm + Not High Sensitivity.
+  - Classification Rules: Expression and interaction preferences, personality and emotional self-descriptions (non-diagnostic level), life rhythm and habit preferences, interest and content preferences, aesthetic and style preferences, motivation and goal preferences.
+  - Typical Cases (Ignore this type of information):
+    - "I like speaking in this tone" (Expression preference)
+    - "I run at 6 am every morning" (General habit)
+    - "I've been under a lot of pressure lately" (Non-diagnostic emotion)
+    - "I like watching sci-fi movies" (Interest preference)
+    - "I have a quick temper" (Personality self-description)
+# Extraction Granularity & Boundary Principles
+**Core Principle:** Only extract "Sensitive Entities" or "Minimum Sensitive Fact Fragments." Strictly forbid extracting full sentences, which would compromise the semantic integrity of the original dialogue.
+1.  Remove Unnecessary Context:
+    - Do not include introductory words (e.g., "My number is," "I live at," "The doctor said").
+    - Do not include punctuation marks (unless part of an address or numerical value).
+    - Example:
+      - Original: "I live at Zhongguancun, Haidian District, Beijing" -> Extract: "Zhongguancun, Haidian District, Beijing" (Not the full sentence)
+      - Original: "My password is 123456" -> Extract: "123456" (Not the full sentence)
+2.  Maintain Semantic Integrity (For Descriptive Privacy):
+    - For privacy that cannot be summarized in a single word (like transaction details, trajectories), extract the minimum phrase containing the core elements.
+    - Example:
+      - Original: "I didn't feel well last night, so I spent 1800 yuan for a checkup at the fertility clinic" -> Extract: "spent 1800 yuan for a checkup at the fertility clinic" (If only "1800 yuan" is extracted, the transactional meaning is lost).
+      - Original: "I have severe anxiety disorder" -> Extract: "severe anxiety disorder"
+3.  Values Must Combine with Unit/Object:
+    - Standalone numbers (e.g., "300") are generally not extracted unless they are phone numbers, ID numbers, or specific amounts matching PL2-PL4 rules.
+    - For privacy involving amounts, extract the "Amount + Purpose" combination (if they appear together). *[Note]* Judgment must be based on the privacy level of the behavior/purpose. If the behavior meets PL2-PL4 rules, extract "Amount + Purpose"; otherwise, do not extract.
+4.  Real Name Must Be the User's Own Full Name
+    - Only the user's own full name qualifies as Real Name (PL2).
+    - Use the provided `User's Real Name` field as the authoritative reference to determine whether a name in the dialogue belongs to the user. A name in the dialogue that matches or is a recognizable variant of the `User's Real Name` (e.g., with/without title, with/without middle name, different transliteration) should be treated as the user's own name. Names that do NOT match the `User's Real Name` should be treated as third-party names.
+---
+# Output Format (Requirements)
+Please strictly follow the JSON format for output. Do not include Markdown code block markers (like ```json). Output the JSON array directly. If no PL2-PL4 information is found, output an empty array `[]`.
+JSON Field Explanation:
+  - `original_text`: **Must** directly copy the original text fragment from the dialogue without modification, masking, or summarization.
+  - `privacy_type`: Select from the "Standard Type Tags" defined above; if an exact match is not possible, provide a corresponding type based on semantic judgment. The value must be in English.
+  - `privacy_level`: Limited to `PL2`, `PL3`, `PL4`.
+## Example (One-Shot)
+**Input Text:**
+User's Real Name: Zhang San
+Current Input Dialogue: {{
+    "role": "user",
+    "content": "Hello, my name is Zhang San, and my mobile number is 13800138000. I've been having insomnia recently, and the doctor diagnosed me with mild depression. Here is a photo of my prescription. Also, I just received a verification code 89757, please fill it in for me. By the way, I like spicy food and I speak quite directly."
+}}
+**Output:**
+[
+  {{
+    "original_text": "Zhang San",
+    "privacy_type": "Real Name",
+    "privacy_level": "PL2"
+  }},
+  {{
+    "original_text": "13800138000",
+    "privacy_type": "Phone Number",
+    "privacy_level": "PL2"
+  }},
+  {{
+    "original_text": "mild depression",
+    "privacy_type": "Medical Health",
+    "privacy_level": "PL3"
+  }},
+  {{
+    "original_text": "89757",
+    "privacy_type": "Verification Code",
+    "privacy_level": "PL4"
+  }}
+]
+(Note: PL1 information like "like spicy food" and "speak directly" was ignored)
+---
+# Input
+**User's Real Name:** {real_name}
+**Current Input Dialogue:**
+'''
+# Set this to the model's Hub repo ID or a local checkpoint path.
+model_name_or_path = ""
+tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
+# Near-greedy sampling; decoding is constrained to privacy_schema.
+# vLLM's default max_tokens is 16, which would truncate the JSON array;
+# 2048 is an assumed budget, adjust it to your dialogue length.
+sampling_params = SamplingParams(
+    temperature=0.1,
+    top_p=0.1,
+    repetition_penalty=1.05,
+    max_tokens=2048,
+    structured_outputs=StructuredOutputsParams(json=privacy_schema),
+)
+model = LLM(
+    model=model_name_or_path,
+    tensor_parallel_size=1,
+    pipeline_parallel_size=1,
+    dtype="float16",
+    gpu_memory_utilization=0.9,
+)
+
+def writer(system_prompt, query):
+    """Run one extraction request and return the raw JSON string."""
+    # Instructions and dialogue are sent together as a single user turn.
+    messages = [{"role": "user", "content": system_prompt + query}]
+    text = tokenizer.apply_chat_template(
+        messages,
+        tokenize=False,
+        add_generation_prompt=True,
+        enable_thinking=False,  # only has an effect if the chat template defines it
+    )
+    outputs = model.generate([text], sampling_params)
+    # One prompt in, so one RequestOutput back; take its first completion.
+    return outputs[0].outputs[0].text.strip()
+
+name = "Zhang San"
+current_input = {
+    "role": "user",
+    "content": "Hello, my name is Zhang San, and my mobile number is 13800138000. I've been having insomnia recently, and the doctor diagnosed me with mild depression. Here is a photo of my prescription. Also, I just received a verification code 89757, please fill it in for me. By the way, I like spicy food and I speak quite directly.",
+}
+# Fill in {real_name}, then append the dialogue serialized as JSON.
+pred_list_str = writer(
+    system_prompt.format(real_name=name),
+    json.dumps(current_input, ensure_ascii=False, indent=2),
+)
+print(pred_list_str)
+```
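The usage script prints the raw JSON string. A downstream step will usually want to parse it, check that each `original_text` really is a verbatim fragment of the dialogue (as the prompt requires), and then act on the spans. Below is a minimal post-processing sketch. The helper names `parse_predictions` and `redact` are hypothetical and not part of this repository, and `sample_output` is a hand-written stand-in for a model response, not a recorded result.

```python
import json

# Levels the prompt asks the model to extract; PL1 must never appear.
ALLOWED_LEVELS = {"PL2", "PL3", "PL4"}


def parse_predictions(pred_list_str, dialogue_text):
    """Parse the model output and keep only well-formed, verbatim spans."""
    try:
        items = json.loads(pred_list_str)
    except json.JSONDecodeError:
        return []  # truncated or malformed output
    if not isinstance(items, list):
        return []
    kept = []
    for item in items:
        if not isinstance(item, dict):
            continue
        text = item.get("original_text")
        if not isinstance(text, str) or not text:
            continue
        if item.get("privacy_level") not in ALLOWED_LEVELS:
            continue
        # The prompt requires a direct copy, so drop anything not found verbatim.
        if text not in dialogue_text:
            continue
        kept.append(item)
    return kept


def redact(dialogue_text, items):
    """Replace every extracted span with a [privacy_level] placeholder."""
    # Longest spans first, so a short span cannot clobber part of a longer one.
    ordered = sorted(items, key=lambda i: len(i["original_text"]), reverse=True)
    for item in ordered:
        placeholder = f"[{item['privacy_level']}]"
        dialogue_text = dialogue_text.replace(item["original_text"], placeholder)
    return dialogue_text


# Hand-written example data (assumption), including two entries that should be dropped.
dialogue = "Hello, my name is Zhang San, and my mobile number is 13800138000."
sample_output = json.dumps([
    {"original_text": "Zhang San", "privacy_type": "Real Name", "privacy_level": "PL2"},
    {"original_text": "13800138000", "privacy_type": "Phone Number", "privacy_level": "PL2"},
    {"original_text": "not in the dialogue", "privacy_type": "Key", "privacy_level": "PL4"},
    {"original_text": "Hello", "privacy_type": "Other", "privacy_level": "PL1"},
])

items = parse_predictions(sample_output, dialogue)
print(redact(dialogue, items))
```

With the usage script above, the equivalent call would be `parse_predictions(pred_list_str, current_input["content"])`. Simple string replacement is a deliberate simplification: it masks every occurrence of a span and does not track character offsets, which may or may not suit a given pipeline.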
 ------
 ## 📚 Citation