---
title: GDPR Auditor
emoji: π
colorFrom: purple
colorTo: red
sdk: docker
app_port: 7860
---
# GDPR Compliance Auditor: OpenEnv Environment
**GDPR Auditor** is an OpenEnv-compatible RL environment where AI agents act as autonomous compliance officers, auditing privacy policies for GDPR/CCPA violations, detecting dark patterns, and identifying policy contradictions.
---
## The Problem It Solves
Every company needs compliance auditing to avoid massive fines:
- GDPR fines up to **€20 million** or **4% of global annual turnover**, whichever is higher
- CCPA fines up to **$7,500 per intentional violation**
- Average human compliance auditor cost: **$100,000+/year**
### The Agent's Job
1. Review privacy policy documents (single or multi-document)
2. Map data practices to stated purposes
3. Identify contradictions, missing clauses, and dark patterns
4. Report compliance violations with severity levels
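
As a rough illustration of steps 1, 3, and 4 for the simplest case (checking that mandatory clauses exist), the sketch below scans a policy for keyword hints and turns each miss into a finding message. The `CLAUSE_HINTS` table and `find_missing_clauses` helper are hypothetical and are not part of the environment or its graders; a real agent would rely on an LLM rather than substring matching.

```python
from typing import Dict, List

# Hypothetical keyword hints per requirement; NOT taken from the environment's graders.
CLAUSE_HINTS: Dict[str, List[str]] = {
    "Right to be Forgotten": ["right to be forgotten", "erasure", "delete your data"],
    "Data Portability": ["data portability", "export your data"],
    "Contact Information": ["contact us", "data protection officer"],
}


def find_missing_clauses(policy_text: str, requirements: List[str]) -> List[str]:
    """Return the requirements for which no keyword hint appears in the policy."""
    text = policy_text.lower()
    missing = []
    for requirement in requirements:
        # Fall back to the requirement name itself when no hints are defined.
        hints = CLAUSE_HINTS.get(requirement, [requirement.lower()])
        if not any(hint in text for hint in hints):
            missing.append(requirement)
    return missing


policy = "You may export your data at any time. Contact us at privacy@example.com."
missing = find_missing_clauses(policy, list(CLAUSE_HINTS))

# Each finding is phrased like the messages an agent would submit.
findings = [f"Missing {name} clause" for name in missing]
print(findings)
```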
---
## Tasks & Grading
| Task | Difficulty | Description | Hidden Issues |
|------|------------|-------------|---------------|
| `easy_clause_existence` | Easy | Verify mandatory GDPR clauses are present | 2 |
| `medium_purpose_mapping` | Medium | Match practices to purposes, find mismatches | 3 |
| `hard_dark_patterns` | Hard | Find contradictions within a single document | 5 |
| `elite_multi_doc_reasoning` | Elite | Cross-document contradiction detection | 6 |
### Reward Function
```
R = base_score + severity_bonus + multi_doc_bonus + exploration_bonus
```
- **Base Score**: `issues_found / total_issues`
- **Severity Bonus**: +0.25 for critical findings, +0.15 for high
- **Multi-Document Bonus**: +0.2 for elite task (cross-doc findings)
- **Exploration Bonus**: +0.02 per step (max 0.1)
All rewards are clamped to `[0.0, 1.0]`.
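
The formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the grader's actual code: the README does not say whether the severity bonus is applied once or per finding (per finding is assumed here), and the function name and parameters are hypothetical. With one of two issues found after a single step, the sketch gives roughly 0.52.

```python
def compute_reward(
    issues_found: int,
    total_issues: int,
    critical_found: int = 0,
    high_found: int = 0,
    is_elite: bool = False,
    cross_doc_found: bool = False,
    steps: int = 0,
) -> float:
    """Sketch of R = base_score + severity_bonus + multi_doc_bonus + exploration_bonus."""
    # Base score: fraction of hidden issues found (0 when there are no issues).
    base_score = issues_found / total_issues if total_issues else 0.0

    # Assumption: the severity bonus accumulates per critical/high finding.
    severity_bonus = 0.25 * critical_found + 0.15 * high_found

    # Multi-document bonus applies only to the elite task with cross-doc findings.
    multi_doc_bonus = 0.2 if (is_elite and cross_doc_found) else 0.0

    # Exploration bonus: +0.02 per step, capped at 0.1.
    exploration_bonus = min(0.02 * steps, 0.1)

    total = base_score + severity_bonus + multi_doc_bonus + exploration_bonus
    # Clamp the final reward to [0.0, 1.0].
    return max(0.0, min(1.0, total))


print(compute_reward(issues_found=1, total_issues=2, steps=1))
```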
---
## API Endpoints
| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check → `{"status": "ok"}` |
| `/reset?task=easy` | GET | Reset environment for a task |
| `/step` | POST | Submit a finding → `{"message": "..."}` |
| `/state` | GET | Get current episode state |
### Example Usage
```bash
# Reset environment
curl "http://localhost:7860/reset?task=easy"
# Submit a compliance finding
curl -X POST "http://localhost:7860/step" \
-H "Content-Type: application/json" \
-d '{"message": "Missing Right to be Forgotten clause"}'
# Get current state
curl "http://localhost:7860/state"
```
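
The same calls can be made from Python with only the standard library. The sketch below assumes the server is running at `http://localhost:7860` as in the curl examples; the helper names (`reset_request`, `step_request`, `send`) are hypothetical and not part of the project. Requests are built separately from sending so they can be inspected without a live server.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:7860"  # assumed local server address


def reset_request(task: str = "easy") -> urllib.request.Request:
    """Build the GET request that resets the environment for a task."""
    query = urllib.parse.urlencode({"task": task})
    return urllib.request.Request(f"{BASE_URL}/reset?{query}", method="GET")


def step_request(message: str) -> urllib.request.Request:
    """Build the POST request that submits one compliance finding."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send a request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (uncomment with the server running):
# observation = send(reset_request("easy"))
# result = send(step_request("Missing Right to be Forgotten clause"))
```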
---
## Action / Observation Spaces
### Observation (returned by reset/step)
```json
{
"task_id": "easy_clause_existence",
"task_name": "Clause Existence Check",
"difficulty": "easy",
"step": 0,
"documents": [{"id": "...", "title": "...", "content": "...", "doc_type": "policy"}],
"data_practices": [{"id": "...", "category": "...", "purpose": "...", "data_type": "...", "shared_with_third_parties": false}],
"compliance_requirements": ["Right to be Forgotten", "Data Portability", "Contact Information"],
"flagged_issues": [],
"echoed_message": "Review the privacy policy..."
}
```
### Action (sent to /step)
```json
{"message": "Missing Right to be Forgotten clause"}
```
### Reward (returned from /step)
```json
{
"value": 0.52,
"reason": "Found 1/2 issues",
"issues_found": 1,
"total_issues": 2
}
```
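
On the client side, the reward payload can be parsed into a typed object before it is used. The project's own models are Pydantic classes in `models.py`; the sketch below uses a standard-library dataclass as a stand-in, so the class and its `from_json` method are illustrative assumptions rather than the project's API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Reward:
    """Stand-in for the reward payload returned from /step."""

    value: float
    reason: str
    issues_found: int
    total_issues: int

    @classmethod
    def from_json(cls, payload: str) -> "Reward":
        data = json.loads(payload)
        reward = cls(
            value=float(data["value"]),
            reason=str(data["reason"]),
            issues_found=int(data["issues_found"]),
            total_issues=int(data["total_issues"]),
        )
        # Rewards are documented as clamped to [0.0, 1.0]; reject anything else.
        if not 0.0 <= reward.value <= 1.0:
            raise ValueError(f"reward out of range: {reward.value}")
        return reward


sample = '{"value": 0.52, "reason": "Found 1/2 issues", "issues_found": 1, "total_issues": 2}'
reward = Reward.from_json(sample)
print(f"{reward.issues_found} of {reward.total_issues} issues found")
```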
---
## Setup & Local Development
### Prerequisites
- Python 3.10+
- `uv` or `pip`
### Install & Run
```bash
# Install dependencies
pip install -e .
# Start the server
python main.py
# → Server at http://localhost:7860
```
### Run Inference
```bash
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export HF_TOKEN="your-token-here"
export SERVER_URL="http://localhost:7860"
python inference.py
```
### Docker
```bash
docker build -t gdpr-auditor .
docker run -p 7860:7860 gdpr-auditor
```
---
## Project Structure
```
├── models.py        # Pydantic typed models (Observation, Action, Reward)
├── env/
│   ├── __init__.py
│   └── core.py      # GDPRAuditorEnvironment with 4 tasks + graders
├── main.py          # FastAPI server with all endpoints
├── inference.py     # Baseline inference script (OpenAI client)
├── openenv.yaml     # OpenEnv manifest with task definitions
├── pyproject.toml   # Dependencies
├── Dockerfile       # Container configuration
└── README.md        # This file
```
---
## Environment Variables
Copy `.env.example` to `.env` and add your Hugging Face token:
```bash
cp .env.example .env
# Then edit .env with your HF_TOKEN
```
| Variable | Description | Default |
|----------|-------------|---------|
| `API_BASE_URL` | LLM API endpoint | `https://router.huggingface.co/v1` |
| `MODEL_NAME` | Model identifier | `Qwen/Qwen2.5-72B-Instruct` |
| `HF_TOKEN` | Hugging Face / API key | (required) |
| `SERVER_URL` | Environment server URL | `http://localhost:7860` |
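
A script could read these variables as sketched below, applying the defaults from the table and failing early when `HF_TOKEN` is missing. The `Settings` class and `load_settings` function are hypothetical; how `inference.py` actually loads its configuration is not shown in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_base_url: str
    model_name: str
    hf_token: str
    server_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from the environment, using the documented defaults."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN")
    if not token:
        # HF_TOKEN has no default, so stop with a clear error instead of guessing.
        raise RuntimeError("HF_TOKEN is required")
    return Settings(
        api_base_url=env.get("API_BASE_URL", "https://router.huggingface.co/v1"),
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct"),
        hf_token=token,
        server_url=env.get("SERVER_URL", "http://localhost:7860"),
    )


# Passing a plain dict makes the loader easy to try without touching os.environ.
settings = load_settings({"HF_TOKEN": "your-token-here"})
print(settings.model_name, settings.server_url)
```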
---
## License
MIT