Commit 81e328b
Initial commit: PLL Cyberattack Detection OpenEnv

Files changed:

- .gitignore (+24, -0)
- Dockerfile (+10, -0)
- README.md (+105, -0)
- inference.py (+470, -0)
- openenv.yaml (+44, -0)
- pyproject.toml (+19, -0)
- requirements.txt (+6, -0)
- server/app.py (+19, -0)
- src/__init__.py (+1, -0)
- src/api.py (+71, -0)
- src/attacks.py (+136, -0)
- src/env.py (+380, -0)
- src/graders.py (+138, -0)
- src/models.py (+74, -0)
- src/pll_sim.py (+119, -0)
- uv.lock (+0, -0)
.gitignore (ADDED, @@ -0,0 +1,24 @@)

```text
# Python
__pycache__/
*.pyc
*.pyo
*.egg-info/
dist/
build/

# Environment
.env
.venv/
venv/

# Testing
.pytest_cache/

# Data
sample_data/

# IDE
.vscode/
.idea/
*.swp
*.swo
```
Dockerfile (ADDED, @@ -0,0 +1,10 @@)

```dockerfile
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 7860
CMD ["uvicorn", "src.api:app", "--host", "0.0.0.0", "--port", "7860"]
```
README.md (ADDED, @@ -0,0 +1,105 @@)

# PLL Cyberattack Detection — OpenEnv

> AI-driven cyberattack detection on SRF Phase-Locked Loops (PLLs) in grid-connected inverters.

## Overview

Phase-Locked Loops (PLLs) are critical components in grid-connected power converters that synchronize the inverter's output with the utility grid. A Synchronous Reference Frame PLL (SRF-PLL) estimates grid frequency and phase angle — making it a high-value target for **False Data Injection (FDI)** cyberattacks.

This OpenEnv environment simulates an SRF-PLL under various cyberattack scenarios and challenges AI agents to detect, classify, and respond to attacks in real time using only time-windowed sensor observations.

## Tasks

| Task | Difficulty | Description |
|------|-----------|-------------|
| **Task 0** | Easy | Detect whether a sinusoidal FDI attack is present (binary detection) |
| **Task 1** | Medium | Detect and classify the attack type — sinusoidal, ramp, or pulse |
| **Task 2** | Hard | Detect stealthy, low-amplitude attacks before the PLL loses lock |

## Observation Space

Each step provides a JSON observation:

| Field | Shape | Description |
|-------|-------|-------------|
| `vq_window` | `[20]` | q-axis voltage error (last 20 steps) |
| `vd_window` | `[20]` | d-axis voltage (last 20 steps) |
| `omega_window` | `[20]` | Estimated frequency, normalized (last 20 steps) |
| `omega_deviation_window` | `[20]` | Frequency deviation from nominal in rad/s |
| `raw_voltages` | `[3]` | Three-phase voltages `[va, vb, vc]` at current step |
| `step` | `int` | Current simulation step |
| `task_id` | `int` | Task identifier (0, 1, or 2) |
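The fields above can be sketched as a concrete payload. This is an illustrative sketch only: every value below is a placeholder, not output from the real simulator.

```python
import json

# Illustrative observation payload; all values are made-up placeholders.
obs = {
    "vq_window": [0.001] * 20,             # ~0 while the PLL is healthy
    "vd_window": [1.0] * 20,               # d-axis carries the voltage magnitude
    "omega_window": [0.0] * 20,            # normalized frequency, nominal = 0
    "omega_deviation_window": [0.0] * 20,  # deviation from nominal, rad/s
    "raw_voltages": [0.98, -0.51, -0.47],  # [va, vb, vc] at the current step
    "step": 42,
    "task_id": 0,
}

# The JSON the agent receives each step has this shape.
payload = json.dumps(obs)
```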
## Action Space

Agents return a JSON action each step:

```json
{
  "attack_detected": true,
  "attack_type": 1,
  "confidence": 0.85,
  "protective_action": 1
}
```

| Field | Type | Range | Description |
|-------|------|-------|-------------|
| `attack_detected` | `bool` | — | Whether an attack is detected |
| `attack_type` | `int` | 0–4 | 0=none, 1=sinusoidal, 2=ramp, 3=pulse, 4=stealthy |
| `confidence` | `float` | 0.0–1.0 | Agent's confidence in its classification |
| `protective_action` | `int` | 0–3 | 0=none, 1=alert, 2=reduce power, 3=disconnect |

## API Endpoints

| Endpoint | Name | Description |
|----------|------|-------------|
| `POST /reset` | Reset | Start a new episode. Body: `{"task_id": 0}` |
| `POST /step` | Step | Submit an action and receive the next observation |
| `GET /state` | State | Get the current environment state |
| `GET /health` | Health | Health check endpoint |
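The reset/step cycle can be driven with a few lines of `requests`. A minimal sketch, assuming a server running locally on port 7860 (adjust `ENV_URL` to your deployment); the response fields follow the tables above:

```python
import requests

ENV_URL = "http://localhost:7860"  # assumption: local server; or the HF Space URL

# A benign "no attack" action, matching the action-space table.
NOOP_ACTION = {
    "attack_detected": False,
    "attack_type": 0,
    "confidence": 0.5,
    "protective_action": 0,
}


def run_one_step(task_id: int = 0) -> dict:
    """Reset the environment, submit one benign action, return the step result."""
    reset = requests.post(f"{ENV_URL}/reset", json={"task_id": task_id}, timeout=30)
    reset.raise_for_status()
    step = requests.post(f"{ENV_URL}/step", json=NOOP_ACTION, timeout=30)
    step.raise_for_status()
    # The result carries "observation", "reward", "done", and "info".
    return step.json()
```

Call `run_one_step(0)` once the server is up; loop on it until `result["done"]` is true to play a full episode.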
## Running Locally

### With Docker

```bash
docker build -t pll-cyberattack-env .
docker run -p 7860:7860 pll-cyberattack-env
```

### Without Docker

```bash
pip install -r requirements.txt
uvicorn src.api:app --host 0.0.0.0 --port 7860
```

### Running the Agent

```bash
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export HF_TOKEN="your-hf-token"
python inference.py
```

Set `USE_LLM=1` to use the LLM agent instead of the default rule-based heuristic.

## Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `API_BASE_URL` | No | `https://router.huggingface.co/v1` | LLM API endpoint |
| `MODEL_NAME` | No | `Qwen/Qwen2.5-72B-Instruct` | Model identifier |
| `HF_TOKEN` | Yes | — | HuggingFace API token |
| `ENV_URL` | No | HF Space URL | Environment server URL |
| `USE_LLM` | No | `0` | Set to `1` to use LLM agent |

## Live Demo

🚀 **HuggingFace Space**: [https://huggingface.co/spaces/krishuggingface/CyberAttack-PLL](https://huggingface.co/spaces/krishuggingface/CyberAttack-PLL)

## License

MIT
inference.py (ADDED, @@ -0,0 +1,470 @@)

```python
"""
Inference Script — PLL Cyberattack Detection OpenEnv
=====================================================
MANDATORY environment variables:
    API_BASE_URL   The API endpoint for the LLM
    MODEL_NAME     The model identifier to use
    HF_TOKEN       Your Hugging Face / API key

Uses a HYBRID approach:
- A fast rule-based heuristic agent runs by default (no LLM needed)
- The heuristic analyzes vq/omega_deviation windows to detect attacks
- Set the USE_LLM=1 env var to use the LLM instead (slower, may fail)

Must be named inference.py and placed at the project root.
Uses the OpenAI client for LLM calls when enabled.
"""

import os
import json
import time

import requests
from openai import OpenAI

API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")
HF_TOKEN = os.getenv("HF_TOKEN")
ENV_URL = os.getenv("ENV_URL", "https://krishuggingface-cyberattack-pll.hf.space")
USE_LLM = os.environ.get("USE_LLM", "0") == "1"

client = OpenAI(base_url=API_BASE_URL, api_key=HF_TOKEN)

SYSTEM_PROMPT = """You are an AI agent monitoring a power grid inverter's Phase-Locked Loop (PLL).
You receive time-windowed sensor readings each step and must detect cyberattacks.

vq_window: q-axis voltage error (should be ~0 when healthy)
vd_window: d-axis voltage
omega_window: estimated frequency (normalized, nominal=0)
omega_deviation_window: frequency deviation from nominal in rad/s (useful for detecting slow phase drift)
raw_voltages: [va, vb, vc] at current step
task_id: 0=detect only, 1=classify type, 2=detect stealthy attack

For task_id=0: Focus on detecting any attack (attack_detected=True/False).
For task_id=1: Also classify the attack type (1=sinusoidal, 2=ramp, 3=pulse).
For task_id=2: Detect very subtle attacks before the PLL loses lock. Look for slow drifts in omega_deviation and vq.

Analysis tips:
- In healthy state, vq values should be near 0 and stable.
- Sinusoidal attacks cause oscillating patterns in vq.
- Ramp attacks cause steadily increasing vq magnitude.
- Pulse attacks cause sudden step changes in vq.
- Stealthy attacks cause very slow, gradual drift in omega_deviation_window.
- Look at trends across the full window, not just the latest value.

Respond ONLY with valid JSON, no explanation:
{
  "attack_detected": <bool>,
  "attack_type": <int 0-4>,
  "confidence": <float 0.0-1.0>,
  "protective_action": <int 0-3>
}"""

TASK_NAMES = {
    0: "Sinusoidal FDI Detection (Easy)",
    1: "Multi-Attack Classification (Medium)",
    2: "Stealthy Attack Detection (Hard)",
}

DEFAULT_ACTION = {
    "attack_detected": False,
    "attack_type": 0,
    "confidence": 0.5,
    "protective_action": 0,
}


# =====================================================================
# Logging Helpers (OpenEnv compliance)
# =====================================================================

def log_start(task: str, env: str, model: str) -> None:
    print(f"[START] task={task} env={env} model={model}", flush=True)


def log_step(step: int, action: dict, reward: float, done: bool, error) -> None:
    action_str = json.dumps(action, separators=(",", ":"))
    error_val = error if error else "null"
    print(f"[STEP] step={step} action={action_str} reward={reward:.2f} done={str(done).lower()} error={error_val}", flush=True)


def log_end(success: bool, steps: int, score: float, rewards: list) -> None:
    rewards_str = ",".join(f"{r:.2f}" for r in rewards)
    print(f"[END] success={str(success).lower()} steps={steps} score={score:.3f} rewards={rewards_str}", flush=True)


# =====================================================================
# Rule-Based Heuristic Agent
# =====================================================================

class HeuristicState:
    """Tracks running state for the heuristic agent across steps."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.vq_history = []           # all vq mean(abs) values
        self.omega_dev_history = []    # all omega_dev mean(abs) values
        self.attack_detected = False   # latched detection flag
        self.predicted_type = 0        # latched classification
        self.settled_baseline = None   # omega_dev baseline when PLL settles
        self.peak_vq = 0.0             # highest vq_mean seen


_hstate = HeuristicState()


def heuristic_agent(obs: dict) -> dict:
    """
    Rule-based attack detector using cumulative state tracking.
    No LLM needed — runs instantly.

    The key insight is that the PLL's closed-loop response transforms
    attack signals, so we track statistics over time rather than
    trying to classify from a single 20-step vq window shape.
    """
    global _hstate
    vq = obs["vq_window"]
    omega_dev = obs["omega_deviation_window"]
    task_id = obs["task_id"]
    step = obs["step"]

    if step == 0:
        _hstate.reset()

    # --- Compute per-step features ---
    vq_abs = [abs(v) for v in vq]
    vq_mean = sum(vq_abs) / len(vq_abs)
    vq_max = max(vq_abs)

    omega_dev_abs = [abs(v) for v in omega_dev]
    omega_dev_mean = sum(omega_dev_abs) / len(omega_dev_abs)

    # Track history
    _hstate.vq_history.append(vq_mean)
    _hstate.omega_dev_history.append(omega_dev_mean)
    _hstate.peak_vq = max(_hstate.peak_vq, vq_mean)

    # Record baseline around step 45-50 (PLL settled)
    if step == 50:
        _hstate.settled_baseline = omega_dev_mean

    # -----------------------------------------------------------------
    # Detection: is vq significantly elevated?
    # After PLL warm-start settles (~step 20-30), healthy vq < 0.005
    # -----------------------------------------------------------------
    if step < 25:
        # PLL still settling, don't detect
        detected = False
    else:
        detected = vq_mean > 0.01 or vq_max > 0.025

    # Latch detection on
    if detected:
        _hstate.attack_detected = True

    # -----------------------------------------------------------------
    # Task 0: Binary detection only
    # -----------------------------------------------------------------
    if task_id == 0:
        return {
            "attack_detected": _hstate.attack_detected,
            "attack_type": 1 if _hstate.attack_detected else 0,
            "confidence": min(1.0, vq_mean * 50) if _hstate.attack_detected else 0.8,
            "protective_action": 1 if _hstate.attack_detected else 0,
        }

    # -----------------------------------------------------------------
    # Task 1: Classification using cumulative patterns
    # -----------------------------------------------------------------
    if task_id == 1:
        if not _hstate.attack_detected:
            return {
                "attack_detected": False,
                "attack_type": 0,
                "confidence": 0.7,
                "protective_action": 0,
            }

        # Classify using cumulative vq_history
        # Only classify after enough attack data (5+ steps of elevated vq)
        n_elevated = sum(1 for v in _hstate.vq_history if v > 0.01)

        if n_elevated < 5:
            # Not enough data yet, use simple guess
            attack_type = 1
        else:
            # Get recent vq trend (up to last 20 elevated values)
            elevated = [v for v in _hstate.vq_history if v > 0.005]
            recent = elevated[-min(20, len(elevated)):]

            # Feature 1: Is vq currently high or has it decayed?
            current_vs_peak = vq_mean / _hstate.peak_vq if _hstate.peak_vq > 0 else 0

            # Feature 2: How many zero crossings in current window
            zero_crossings = sum(1 for i in range(1, len(vq)) if vq[i] * vq[i - 1] < 0)

            # Feature 3: Is vq growing or shrinking over recent history
            if len(recent) >= 6:
                third = len(recent) // 3
                first_third = sum(recent[:third]) / third
                last_third = sum(recent[-third:]) / third
                growth = last_third / first_third if first_third > 0.001 else 1.0
            else:
                growth = 1.0

            # Classification logic:
            # Sinusoidal: persistent oscillation, zero crossings, stable amplitude
            # Ramp: growing vq over time (growth > 1)
            # Pulse: high initial vq that decays to near zero (current_vs_peak < 0.3)

            if current_vs_peak < 0.15 and _hstate.peak_vq > 0.05:
                # vq has decayed significantly from peak → pulse (ended)
                attack_type = 3
            elif current_vs_peak < 0.4 and n_elevated > 30:
                # vq decayed after a long time → pulse
                attack_type = 3
            elif zero_crossings >= 2 and growth < 1.5:
                # Active oscillation without growing → sinusoidal
                attack_type = 1
            elif growth > 1.3:
                # Growing signal → ramp
                attack_type = 2
            elif zero_crossings >= 1:
                # Some oscillation → sinusoidal
                attack_type = 1
            else:
                # Default: if mono-decrease, pulse; else sinusoidal
                vq_diffs = [vq[i] - vq[i - 1] for i in range(1, len(vq))]
                neg = sum(1 for d in vq_diffs if d < 0)
                if neg > 14:  # 14/19 = 73% decreasing
                    attack_type = 3
                else:
                    attack_type = 1

        _hstate.predicted_type = attack_type

        return {
            "attack_detected": True,
            "attack_type": _hstate.predicted_type,
            "confidence": 0.8,
            "protective_action": 1,
        }

    # -----------------------------------------------------------------
    # Task 2: Stealthy attack — detect omega_dev rising above baseline
    # -----------------------------------------------------------------
    if task_id == 2:
        drift_detected = False
        confidence = 0.3

        if step > 50 and _hstate.settled_baseline is not None:
            baseline = _hstate.settled_baseline

            # Compare current to baseline
            ratio = omega_dev_mean / baseline if baseline > 0.01 else omega_dev_mean * 100

            # Check if omega_dev is rising relative to recent history
            if len(_hstate.omega_dev_history) > 10:
                recent_10 = _hstate.omega_dev_history[-10:]
                old_10 = (_hstate.omega_dev_history[-20:-10]
                          if len(_hstate.omega_dev_history) > 20
                          else _hstate.omega_dev_history[:10])
                recent_avg = sum(recent_10) / len(recent_10)
                old_avg = sum(old_10) / len(old_10)
                rising = recent_avg > old_avg * 1.1
            else:
                rising = False

            if ratio > 2.0:
                drift_detected = True
                confidence = 0.9
            elif ratio > 1.3 and rising:
                drift_detected = True
                confidence = 0.8
            elif rising and vq_mean > 0.1:
                drift_detected = True
                confidence = 0.6
            elif vq_mean > 0.2:
                drift_detected = True
                confidence = 0.5

        if drift_detected:
            _hstate.attack_detected = True

        return {
            "attack_detected": drift_detected,
            "attack_type": 4 if drift_detected else 0,
            "confidence": confidence,
            "protective_action": 2 if drift_detected else 0,
        }

    return DEFAULT_ACTION.copy()


# =====================================================================
# LLM Agent (optional, set USE_LLM=1)
# =====================================================================

def parse_llm_response(response_text: str) -> dict:
    """Parse LLM response JSON, returning the default action on failure."""
    try:
        text = response_text.strip()
        if text.startswith("```"):
            # Strip a markdown code fence around the JSON, if present.
            lines = text.split("\n")
            json_lines = []
            in_block = False
            for line in lines:
                if line.strip().startswith("```") and not in_block:
                    in_block = True
                    continue
                elif line.strip().startswith("```") and in_block:
                    break
                elif in_block:
                    json_lines.append(line)
            text = "\n".join(json_lines)

        parsed = json.loads(text)
        # Clamp every field into its valid range.
        action = {
            "attack_detected": bool(parsed.get("attack_detected", False)),
            "attack_type": max(0, min(4, int(parsed.get("attack_type", 0)))),
            "confidence": max(0.0, min(1.0, float(parsed.get("confidence", 0.5)))),
            "protective_action": max(0, min(3, int(parsed.get("protective_action", 0)))),
        }
        return action
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return DEFAULT_ACTION.copy()


def format_observation(obs: dict) -> str:
    """Format an observation dict into a concise string for the LLM."""
    parts = [
        f"Step: {obs['step']}",
        f"Task: {obs['task_id']}",
        f"vq_window (last 20): {[round(v, 6) for v in obs['vq_window']]}",
        f"vd_window (last 20): {[round(v, 6) for v in obs['vd_window']]}",
        f"omega_window (last 20): {[round(v, 6) for v in obs['omega_window']]}",
        f"omega_deviation_window (last 20): {[round(v, 6) for v in obs['omega_deviation_window']]}",
        f"raw_voltages: {[round(v, 6) for v in obs['raw_voltages']]}",
    ]
    return "\n".join(parts)


def llm_agent(obs: dict) -> dict:
    """Call the LLM to decide an action. Falls back to the heuristic on error."""
    try:
        obs_text = format_observation(obs)
        completion = client.chat.completions.create(
            model=MODEL_NAME,
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": obs_text},
            ],
            temperature=0.1,
            max_tokens=200,
        )
        llm_response = completion.choices[0].message.content
        return parse_llm_response(llm_response)
    except Exception as e:
        print(f"  LLM error ({type(e).__name__}: {e}), falling back to heuristic")
        return heuristic_agent(obs)


# =====================================================================
# Episode Runner
# =====================================================================

def run_episode(task_id: int) -> float:
    log_start(task=TASK_NAMES[task_id], env="pll-cyberattack-detection",
              model=MODEL_NAME if USE_LLM else "rule-based-heuristic")

    print(f"\n{'='*60}")
    print(f"Task {task_id}: {TASK_NAMES[task_id]}")
    print(f"Agent: {'LLM (' + MODEL_NAME + ')' if USE_LLM else 'Rule-Based Heuristic'}")
    print(f"{'='*60}")

    step_count = 0
    grader_score = 0.0
    rewards = []

    try:
        # Reset environment
        reset_response = requests.post(
            f"{ENV_URL}/reset",
            json={"task_id": task_id},
            timeout=30,
        )
        reset_response.raise_for_status()
        obs = reset_response.json()

        done = False
        total_reward = 0.0

        while not done:
            # Choose agent
            if USE_LLM:
                action = llm_agent(obs)
            else:
                action = heuristic_agent(obs)

            # Step environment
            step_response = requests.post(
                f"{ENV_URL}/step",
                json=action,
                timeout=30,
            )
            step_response.raise_for_status()
            result = step_response.json()

            obs = result["observation"]
            reward = result["reward"]
            done = result["done"]
            info = result["info"]
            total_reward += reward["total"]
            rewards.append(reward["total"])
            log_step(step=step_count, action=action, reward=reward["total"], done=done, error=None)

            step_count += 1

            # Print progress every 50 steps
            if step_count % 50 == 0:
                print(f"  Step {step_count:3d} | Reward: {reward['total']:+.4f} | "
                      f"Cumulative: {total_reward:+.4f} | "
                      f"Detected: {action['attack_detected']} | "
                      f"Type: {action['attack_type']}")

        # Extract grader score
        grader_score = info.get("grader_score", 0.0)
        print(f"\n  Episode complete: {step_count} steps")
        print(f"  Total reward: {total_reward:+.4f}")
        print(f"  Grader score: {grader_score:.4f}")
    finally:
        log_end(success=grader_score > 0.0, steps=step_count, score=grader_score, rewards=rewards)

    return grader_score


if __name__ == "__main__":
    agent_name = f"LLM ({MODEL_NAME})" if USE_LLM else "Rule-Based Heuristic"
    print("PLL Cyberattack Detection — Agentic Inference")
    print(f"Agent: {agent_name}")
    print(f"Environment: {ENV_URL}")
    if not USE_LLM:
        print("(Set USE_LLM=1 to use LLM agent instead of heuristic)")

    start_time = time.time()
    scores = []

    for task_id in range(3):
        score = run_episode(task_id)
        print(f"Task {task_id} score: {score:.4f}")
        scores.append(score)

    elapsed = time.time() - start_time

    print(f"\n{'='*60}")
    print("FINAL RESULTS")
    print(f"{'='*60}")
    for i, score in enumerate(scores):
        print(f"  Task {i} ({TASK_NAMES[i]}): {score:.4f}")
    print(f"\n  Average score: {sum(scores)/len(scores):.4f}")
    print(f"  Total time: {elapsed:.1f}s ({elapsed/60:.1f} min)")
    print(f"{'='*60}")
```
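The window features driving `heuristic_agent`'s Task 1 classification (zero crossings for oscillation, their absence for monotone signals) can be exercised in isolation. This is a standalone sketch of that logic with synthetic windows, not part of the committed script:

```python
import math


def zero_crossings(window):
    """Count sign changes between consecutive samples, as heuristic_agent does."""
    return sum(1 for i in range(1, len(window)) if window[i] * window[i - 1] < 0)


# A sinusoidal-style vq window oscillates around zero and crosses it repeatedly...
sine_window = [0.02 * math.sin(0.8 * i) for i in range(20)]
# ...while a ramp-style window grows monotonically and never crosses zero.
ramp_window = [0.002 * i for i in range(20)]

print(zero_crossings(sine_window), zero_crossings(ramp_window))
```

In the committed heuristic, `zero_crossings >= 2` together with bounded amplitude growth maps the window to the sinusoidal class, while zero crossings near 0 with growth above 1.3 indicates a ramp.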
openenv.yaml (ADDED, @@ -0,0 +1,44 @@)

```yaml
name: pll-cyberattack-detection
version: 1.0.0
description: >
  OpenEnv environment for AI-driven cyberattack detection on SRF-based
  Phase-Locked Loops in grid-connected inverters. An agent monitors PLL
  sensor streams and detects False Data Injection attacks before they
  cause loss of grid synchronization. Real-world power systems cybersecurity.
author: Kris Keshav
tags:
  - power-systems
  - cybersecurity
  - control-systems
  - openenv
  - false-data-injection
tasks:
  - id: sinusoidal_fdi_detection
    difficulty: easy
    description: >
      Detect presence of a sinusoidal FDI attack injected on the
      grid voltage sensor. Binary detection task.
    max_steps: 500
  - id: multi_attack_classification
    difficulty: medium
    description: >
      Classify the type of ongoing attack (sinusoidal, ramp, or pulse)
      from the PLL observation window.
    max_steps: 500
  - id: stealthy_attack_detection
    difficulty: hard
    description: >
      Detect a low-amplitude stealthy attack causing slow phase drift
      before PLL loss-of-lock occurs.
    max_steps: 500
action_space:
  type: structured
  fields:
    attack_detected: bool
    attack_type: int
    confidence: float
    protective_action: int
observation_space:
  type: continuous
  dim: 103
episode_length: 500
```
pyproject.toml (ADDED, @@ -0,0 +1,19 @@)

```toml
[build-system]
requires = ["setuptools>=61.0"]
# "setuptools.backends.legacy:build" is not a valid backend; the correct
# PEP 517 backend for setuptools is build_meta.
build-backend = "setuptools.build_meta"

[project]
name = "pll-cyberattack-detection"
version = "1.0.0"
description = "OpenEnv for cyberattack detection on SRF-PLLs in grid-connected inverters"
requires-python = ">=3.10"
dependencies = [
    "fastapi",
    "uvicorn",
    "pydantic",
    "numpy",
    "openenv-core>=0.2.0",
]

[project.scripts]
server = "server.app:main"
```
requirements.txt
ADDED
@@ -0,0 +1,6 @@
fastapi
uvicorn
pydantic
numpy
openai
requests
server/app.py
ADDED
@@ -0,0 +1,19 @@
"""
server/app.py — Server entry point for openenv validate compatibility.
"""
import uvicorn
from src.api import app  # noqa: F401


def main():
    """Start the FastAPI server."""
    uvicorn.run(
        "src.api:app",
        host="0.0.0.0",
        port=7860,
        reload=False,
    )


if __name__ == "__main__":
    main()
src/__init__.py
ADDED
@@ -0,0 +1 @@
# PLL Cyberattack Detection OpenEnv
src/api.py
ADDED
@@ -0,0 +1,71 @@
"""
FastAPI application for the PLL Cyberattack Detection OpenEnv.

Exposes HTTP endpoints for environment interaction:
    POST /reset  — Reset environment with task_id
    POST /step   — Submit an action and advance one step
    GET  /state  — Get current internal state
    GET  /health — Health check (returns 200)
"""

from fastapi import FastAPI
from pydantic import BaseModel
from typing import Any, Dict, Optional

from src.models import Observation, Action, Reward, State
from src.env import PLLAttackEnv


app = FastAPI(
    title="PLL Cyberattack Detection OpenEnv",
    description="OpenEnv for AI-driven cyberattack detection on SRF-PLLs",
    version="1.0.0",
)

# Global environment instance
env = PLLAttackEnv()


class ResetRequest(BaseModel):
    """Request body for /reset endpoint."""
    task_id: int = 0
    seed: Optional[int] = None  # `int = None` would fail pydantic validation


class StepResponse(BaseModel):
    """Response body for /step endpoint."""
    observation: Observation
    reward: Reward
    done: bool
    info: Dict[str, Any]


@app.post("/reset", response_model=Observation)
async def reset(request: ResetRequest):
    """Reset the environment and return initial observation."""
    obs = env.reset(task_id=request.task_id, seed=request.seed)
    return obs


@app.post("/step", response_model=StepResponse)
async def step(action: Action):
    """Submit an action and advance the environment one step."""
    obs, reward, done, info = env.step(action)
    return StepResponse(
        observation=obs,
        reward=reward,
        done=done,
        info=info,
    )


@app.get("/state", response_model=State)
async def get_state():
    """Return the current internal state."""
    return env.get_state()


@app.get("/health")
async def health():
    """Health check endpoint."""
    return {"status": "ok"}
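A minimal client sketch for the endpoints above. The base URL assumes a local uvicorn deployment on port 7860 (as in the Dockerfile); the payload helpers only restate the `ResetRequest` and `Action` field names, so they can be checked without a running server:

```python
import json
from typing import Optional

BASE_URL = "http://localhost:7860"  # assumption: local server on port 7860


def build_reset_payload(task_id: int = 0, seed: Optional[int] = None) -> dict:
    """Body for POST /reset (ResetRequest fields)."""
    return {"task_id": task_id, "seed": seed}


def build_step_payload(attack_detected: bool, attack_type: int,
                       confidence: float, protective_action: int) -> dict:
    """Body for POST /step (Action fields from the action_space spec)."""
    return {
        "attack_detected": attack_detected,
        "attack_type": attack_type,
        "confidence": confidence,
        "protective_action": protective_action,
    }


if __name__ == "__main__":
    # Requires the server to be running: uvicorn src.api:app --port 7860
    import requests

    obs = requests.post(f"{BASE_URL}/reset",
                        json=build_reset_payload(task_id=0, seed=42)).json()
    step = requests.post(f"{BASE_URL}/step",
                         json=build_step_payload(False, 0, 0.5, 0)).json()
    print(json.dumps(step["reward"], indent=2))
```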
src/attacks.py
ADDED
@@ -0,0 +1,136 @@
"""
Attack injection logic for the PLL Cyberattack Detection OpenEnv.

Implements four attack types:
1. Sinusoidal FDI (Easy)
2. Ramp injection (Medium)
3. Pulse/step bias (Medium)
4. Stealthy low-and-slow phase drift (Hard)
"""

import math
import numpy as np
from typing import Dict, Any


def sample_sinusoidal_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a sinusoidal FDI attack."""
    return {
        "type": "sinusoidal",
        "amplitude": float(rng.uniform(0.05, 0.20)),
        "freq": float(rng.uniform(5.0, 20.0)),
        "phase": float(rng.uniform(0.0, 2.0 * math.pi)),
    }


def sample_ramp_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a ramp injection attack."""
    return {
        "type": "ramp",
        "rate": float(rng.uniform(0.0002, 0.001)),
    }


def sample_pulse_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a pulse/step bias attack."""
    return {
        "type": "pulse",
        "magnitude": float(rng.uniform(0.1, 0.3)),
        "duration": int(rng.integers(20, 81)),  # 20 to 80 steps inclusive
    }


def sample_stealthy_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a stealthy low-and-slow attack."""
    return {
        "type": "stealthy",
        "amplitude": 0.03,
        "drift_rate": float(rng.uniform(0.05, 0.2)),
    }


def sample_attack_start(rng: np.random.Generator) -> int:
    """Sample a random attack start step between 30 and 80 inclusive."""
    return int(rng.integers(30, 81))


class AttackGenerator:
    """Generates attack signals given parameters and current simulation state."""

    def __init__(self, attack_params: Dict[str, Any], attack_start_step: int):
        self.params = attack_params
        self.attack_start_step = attack_start_step
        self.attack_type_str = attack_params.get("type", "none")

        # For stealthy attack: track cumulative phase drift
        self.delta = 0.0

    def get_signal(self, current_step: int, sim_time: float) -> float:
        """
        Compute the attack signal value at the given step.

        Args:
            current_step: Current environment step (0-indexed).
            sim_time: Current simulation time in seconds.

        Returns:
            Attack signal value (pu). Returns 0.0 if attack not yet started.
        """
        if current_step < self.attack_start_step:
            return 0.0

        steps_since_start = current_step - self.attack_start_step
        dt = 1e-3  # time step

        if self.attack_type_str == "sinusoidal":
            A = self.params["amplitude"]
            fa = self.params["freq"]
            phi = self.params["phase"]
            return A * math.sin(2.0 * math.pi * fa * sim_time + phi)

        elif self.attack_type_str == "ramp":
            rate = self.params["rate"]
            return rate * steps_since_start

        elif self.attack_type_str == "pulse":
            mag = self.params["magnitude"]
            dur = self.params["duration"]
            if steps_since_start < dur:
                return mag
            else:
                return 0.0

        elif self.attack_type_str == "stealthy":
            A_s = self.params["amplitude"]
            drift_rate = self.params["drift_rate"]
            # δ(t) = δ(t-1) + drift_rate * Δt — accumulated each call
            self.delta += drift_rate * dt
            f0 = 50.0
            return A_s * math.sin(2.0 * math.pi * f0 * sim_time + self.delta)

        return 0.0

    def is_active(self, current_step: int) -> bool:
        """Check if the attack is currently active at this step."""
        if current_step < self.attack_start_step:
            return False

        # Pulse attacks end after duration
        if self.attack_type_str == "pulse":
            steps_since_start = current_step - self.attack_start_step
            dur = self.params["duration"]
            return steps_since_start < dur

        return True


def get_attack_type_id(attack_type_str: str) -> int:
    """Map attack type string to integer ID."""
    mapping = {
        "none": 0,
        "sinusoidal": 1,
        "ramp": 2,
        "pulse": 3,
        "stealthy": 4,
    }
    return mapping.get(attack_type_str, 0)
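The signal arithmetic in `AttackGenerator.get_signal()` can be spot-checked in isolation. This sketch re-derives the sinusoidal and pulse branches as pure functions (the parameter values are arbitrary examples, not sampled from the environment's distributions):

```python
import math


def sinusoidal_signal(t: float, amplitude: float, freq: float, phase: float) -> float:
    """Same formula as the 'sinusoidal' branch: A * sin(2*pi*f*t + phi)."""
    return amplitude * math.sin(2.0 * math.pi * freq * t + phase)


def pulse_signal(steps_since_start: int, magnitude: float, duration: int) -> float:
    """Same logic as the 'pulse' branch: constant bias that ends after `duration` steps."""
    return magnitude if steps_since_start < duration else 0.0


# A 0.1 pu sine at 10 Hz with zero phase peaks at t = 25 ms (quarter period).
peak = sinusoidal_signal(0.025, amplitude=0.1, freq=10.0, phase=0.0)  # -> 0.1

# A pulse is active for the first `duration` steps, then drops to zero.
on = pulse_signal(19, magnitude=0.2, duration=20)   # -> 0.2
off = pulse_signal(20, magnitude=0.2, duration=20)  # -> 0.0
```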
src/env.py
ADDED
@@ -0,0 +1,380 @@
"""
Main environment class for the PLL Cyberattack Detection OpenEnv.

Implements step(), reset(), get_state(), and compute_reward().
Manages the PLL simulation, attack injection, observation windowing,
episode history, and grading.

Fixes applied vs previous version:
1. grade_task_easy() now receives attack_start_step (was missing, causing
   TypeError at episode end for task_id=0).
2. attack_active is set from AttackGenerator.is_active() at the same point
   the attack signal is computed — a single source of truth that prevents
   signal/label divergence (pulse attacks deactivate after their duration).
3. Lock-loss check guarded by step_count > attack_start_step — prevents
   spurious lock-loss from PLL transient on step 0.
4. Task 3 early termination added: done=True when lock_lost, not just at
   step 500. Avoids 200+ meaningless steps after failure.
5. _get_observation() updated to remove theta_err_window (ground-truth
   leak) and add omega_deviation_window (raw omega deviation in rad/s),
   matching the corrected Observation model.
6. theta_err_window deque removed from instance state.
7. Initial raw_voltages fixed: the PLL is warm-started with WINDOW_SIZE
   silent steps so the observation windows are full and va_m/vb_m/vc_m
   are non-zero at reset() return.
8. omega_deviation_window deque added for the new Observation field.
"""

import uuid
import numpy as np
from typing import Tuple, Dict, Any, List, Optional
from collections import deque

from src.models import Observation, Action, Reward, State
from src.pll_sim import SRFPLLSimulator, OMEGA0
from src.attacks import (
    AttackGenerator,
    sample_sinusoidal_params,
    sample_ramp_params,
    sample_pulse_params,
    sample_stealthy_params,
    sample_attack_start,
    get_attack_type_id,
)
from src.graders import grade_task_easy, grade_task_medium, grade_task_hard


WINDOW_SIZE = 20
MAX_STEPS = 500
LOCK_LOSS_THRESHOLD = 0.0873  # 5 degrees in radians


class PLLAttackEnv:
    """OpenEnv-compliant PLL cyberattack detection environment."""

    def __init__(self):
        self.pll = SRFPLLSimulator()
        self.rng: Optional[np.random.Generator] = None
        self.task_id = 0
        self.step_count = 0
        self.episode_id = ""
        self.done = False

        # Attack state
        self.attack_generator: Optional[AttackGenerator] = None
        self.attack_active = False
        self.attack_type = 0
        self.attack_params: Dict[str, Any] = {}
        self.attack_start_step = 0
        self.true_attack_type = 0

        # Detection tracking
        self.first_detection_recorded = False
        self.first_detection_step = 0

        # Lock loss tracking (Task 2 / hard)
        self.lock_lost = False
        self.lock_loss_step: Optional[int] = None
        self.lock_loss_penalized = False

        # Observation windows (Fix 6: theta_err_window removed)
        self.vq_window: deque = deque(maxlen=WINDOW_SIZE)
        self.vd_window: deque = deque(maxlen=WINDOW_SIZE)
        self.omega_window: deque = deque(maxlen=WINDOW_SIZE)
        self.omega_deviation_window: deque = deque(maxlen=WINDOW_SIZE)  # Fix 8

        # Episode history for grading
        self.history: List[Dict[str, Any]] = []

    # ------------------------------------------------------------------
    # Public API
    # ------------------------------------------------------------------

    def reset(self, task_id: int = 0, seed: Optional[int] = None) -> Observation:
        """
        Reset the environment for a new episode.

        Args:
            task_id: 0=easy (sinusoidal), 1=medium (multi-type),
                     2=hard (stealthy).
            seed: Optional RNG seed for reproducibility.

        Returns:
            Initial Observation with non-zero raw_voltages.
        """
        self.rng = np.random.default_rng(seed)  # seed=None → random

        self.task_id = task_id
        self.step_count = 0
        self.episode_id = str(uuid.uuid4())
        self.done = False

        # Reset PLL simulator
        self.pll.reset()

        # Reset detection tracking
        self.first_detection_recorded = False
        self.first_detection_step = 0

        # Reset lock-loss tracking
        self.lock_lost = False
        self.lock_loss_step = None
        self.lock_loss_penalized = False

        # Reset history
        self.history = []

        # Reset observation windows (Fix 6: no theta_err_window)
        self.vq_window = deque(maxlen=WINDOW_SIZE)
        self.vd_window = deque(maxlen=WINDOW_SIZE)
        self.omega_window = deque(maxlen=WINDOW_SIZE)
        self.omega_deviation_window = deque(maxlen=WINDOW_SIZE)

        # Sample attack for this episode
        self._setup_attack()

        # Fix 7: warm-start PLL with WINDOW_SIZE silent steps so that
        # windows contain realistic (non-zero) PLL-settled values and
        # raw_voltages are non-zero on the first observation.
        for _ in range(WINDOW_SIZE):
            pll_out = self.pll.step(0.0)  # no attack during warm-up
            omega_norm = (pll_out["omega_hat"] - OMEGA0) / OMEGA0
            omega_dev = pll_out["omega_hat"] - OMEGA0
            self.vq_window.append(pll_out["vq"])
            self.vd_window.append(pll_out["vd"])
            self.omega_window.append(omega_norm)
            self.omega_deviation_window.append(omega_dev)
        # step_count stays at 0 — warm-up steps are invisible to the agent

        return self._get_observation()

    def step(self, action: Action) -> Tuple[Observation, Reward, bool, Dict[str, Any]]:
        """
        Advance the environment by one step.

        Args:
            action: Agent's Action for this step.

        Returns:
            (observation, reward, done, info)
        """
        if self.done:
            return (
                self._get_observation(),
                Reward(
                    total=0.0, detection_reward=0.0, classification_bonus=0.0,
                    early_detection_bonus=0.0, false_alarm_penalty=0.0,
                    lock_loss_penalty=0.0,
                ),
                True,
                {"message": "Episode already done. Call /reset to start a new episode."},
            )

        # --- Attack signal ------------------------------------------------
        # Fix 2: attack_active is set from is_active() right where the
        # attack signal is computed — one source of truth, so the label
        # tracks the attack schedule (pulses deactivate after their duration).
        attack_signal = self.attack_generator.get_signal(self.step_count, self.pll.t)
        self.attack_active = self.attack_generator.is_active(self.step_count)

        # --- Advance PLL --------------------------------------------------
        pll_out = self.pll.step(attack_signal)

        # --- Update observation windows -----------------------------------
        omega_norm = (pll_out["omega_hat"] - OMEGA0) / OMEGA0
        omega_dev = pll_out["omega_hat"] - OMEGA0  # raw deviation (rad/s)
        self.vq_window.append(pll_out["vq"])
        self.vd_window.append(pll_out["vd"])
        self.omega_window.append(omega_norm)
        self.omega_deviation_window.append(omega_dev)

        # --- Lock-loss check (Task 2 / hard only) -------------------------
        PLL_CONVERGENCE_STEPS = 60  # PLL transient settles by ~step 50, use 60 for margin
        if (
            self.task_id == 2
            and not self.lock_lost
            and self.step_count > self.attack_start_step
            and self.step_count > PLL_CONVERGENCE_STEPS  # ← guard against startup transient
        ):
            if abs(pll_out["theta_err"]) > LOCK_LOSS_THRESHOLD:
                self.lock_lost = True
                self.lock_loss_step = self.step_count

        # --- Reward -------------------------------------------------------
        reward = self.compute_reward(action)

        # --- Record history entry for graders ----------------------------
        self.history.append({
            "step": self.step_count,
            "attack_active": self.attack_active,
            "attack_detected": action.attack_detected,
            "true_attack_type": self.true_attack_type,
            "agent_attack_type": action.attack_type,
            "theta_err": pll_out["theta_err"],
        })

        # --- Advance step counter ----------------------------------------
        self.step_count += 1

        # --- Episode termination -----------------------------------------
        # Fix 4: Task 2 terminates early on lock-loss, not just at MAX_STEPS
        if self.step_count >= MAX_STEPS:
            self.done = True
        elif self.task_id == 2 and self.lock_lost:
            self.done = True  # early termination — no point continuing

        # --- Build info --------------------------------------------------
        info: Dict[str, Any] = {}
        if self.done:
            info["grader_score"] = self._compute_grader_score()
            info["episode_id"] = self.episode_id
            info["total_steps"] = self.step_count
            info["lock_lost"] = self.lock_lost

        return self._get_observation(), reward, self.done, info

    def compute_reward(self, action: Action) -> Reward:
        """
        Compute the dense reward signal for the current step.

        Reward components:
            detection_reward:      +0.10 true positive (per step)
                                   +0.05 true negative (per step)
                                   -0.05 missed detection (per step)
            false_alarm_penalty:   -0.20 per false-positive step
            classification_bonus:  +0.05 per step correct type (task 1 only)
            early_detection_bonus: one-time sparse, scaled by detection speed
            lock_loss_penalty:     -2.00 one-time on lock loss (task 2 only)
        """
        detection_reward = 0.0
        false_alarm_penalty = 0.0
        classification_bonus = 0.0
        early_detection_bonus = 0.0
        lock_loss_penalty = 0.0

        if self.attack_active:
            if action.attack_detected:
                detection_reward = 0.1
                # One-time early detection bonus on first correct detection
                if not self.first_detection_recorded:
                    self.first_detection_step = self.step_count
                    self.first_detection_recorded = True
                    # Relative steps since attack started
                    t = self.first_detection_step - self.attack_start_step
                    early_detection_bonus = max(0.0, 1.0 - t / 100.0)
            else:
                detection_reward = -0.05  # missed detection
        else:
            if action.attack_detected:
                false_alarm_penalty = -0.2  # false alarm
            else:
                detection_reward = 0.05  # correct true negative

        # Task 1 (medium): per-step classification bonus
        if self.task_id == 1 and self.attack_active:
            if action.attack_type == self.true_attack_type:
                classification_bonus = 0.05

        # Task 2 (hard): one-time lock-loss penalty
        if self.task_id == 2 and self.lock_lost and not self.lock_loss_penalized:
            lock_loss_penalty = -2.0
            self.lock_loss_penalized = True

        total = (
            detection_reward
            + false_alarm_penalty
            + classification_bonus
            + early_detection_bonus
            + lock_loss_penalty
        )

        return Reward(
            total=total,
            detection_reward=detection_reward,
            classification_bonus=classification_bonus,
            early_detection_bonus=early_detection_bonus,
            false_alarm_penalty=false_alarm_penalty,
            lock_loss_penalty=lock_loss_penalty,
        )

    def get_state(self) -> State:
        """Return full internal state for debugging / GET /state endpoint."""
        return State(
            theta_true=self.pll.theta_true,
            theta_hat=self.pll.theta_hat,
            omega_hat=self.pll.omega_hat,
            vq_integral=self.pll.vq_integral,
            attack_active=self.attack_active,
            attack_type=self.attack_type,
            attack_params=self.attack_params,
            attack_start_step=self.attack_start_step,
            lock_lost=self.lock_lost,
            step=self.step_count,
            episode_id=self.episode_id,
            task_id=self.task_id,
        )

    # ------------------------------------------------------------------
    # Private helpers
    # ------------------------------------------------------------------

    def _setup_attack(self) -> None:
        """Sample attack type and parameters based on current task_id."""
        self.attack_start_step = sample_attack_start(self.rng)

        if self.task_id == 0:
            # Easy: sinusoidal FDI only
            self.attack_params = sample_sinusoidal_params(self.rng)
            self.true_attack_type = 1

        elif self.task_id == 1:
            # Medium: random choice of sinusoidal / ramp / pulse
            choice = int(self.rng.integers(0, 3))
            if choice == 0:
                self.attack_params = sample_sinusoidal_params(self.rng)
                self.true_attack_type = 1
            elif choice == 1:
                self.attack_params = sample_ramp_params(self.rng)
                self.true_attack_type = 2
            else:
                self.attack_params = sample_pulse_params(self.rng)
                self.true_attack_type = 3

        elif self.task_id == 2:
            # Hard: stealthy low-and-slow
            self.attack_params = sample_stealthy_params(self.rng)
            self.true_attack_type = 4

        self.attack_type = get_attack_type_id(self.attack_params.get("type", "none"))
        self.attack_generator = AttackGenerator(self.attack_params, self.attack_start_step)

    def _get_observation(self) -> Observation:
        """
        Build the current Observation from internal windows.

        Fix 5: theta_err_window replaced with omega_deviation_window.
        theta_err requires knowing theta_true (not observable in a real
        inverter) and leaked ground truth directly to the agent.
        omega_deviation (omega_hat - OMEGA0 in rad/s) is a realistic proxy
        that correlates with phase drift under stealthy attacks.
        """
        return Observation(
            vq_window=list(self.vq_window),
            vd_window=list(self.vd_window),
            omega_window=list(self.omega_window),
            omega_deviation_window=list(self.omega_deviation_window),  # Fix 5
            raw_voltages=[self.pll.va_m, self.pll.vb_m, self.pll.vc_m],
            task_id=self.task_id,
            step=self.step_count,
        )

    def _compute_grader_score(self) -> float:
        """Run the appropriate grader at episode end."""
        if self.task_id == 0:
            return grade_task_easy(self.history, self.attack_start_step)
        elif self.task_id == 1:
            return grade_task_medium(self.history, self.attack_start_step)
        elif self.task_id == 2:
            return grade_task_hard(
                self.history,
                self.lock_loss_step,
                self.attack_start_step,
            )
        return 0.0
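The one-time `early_detection_bonus` in `compute_reward()` is `max(0, 1 - t/100)`, where `t` is the number of steps between attack onset and the first correct detection. A quick numeric check of that formula in isolation:

```python
def early_detection_bonus(first_detection_step: int, attack_start_step: int) -> float:
    """Mirrors the bonus formula in compute_reward(): max(0, 1 - t/100)."""
    t = first_detection_step - attack_start_step
    return max(0.0, 1.0 - t / 100.0)


# Detect immediately -> full bonus; 50 steps late -> half; >=100 steps late -> none.
immediate = early_detection_bonus(40, 40)   # -> 1.0
halfway = early_detection_bonus(90, 40)     # -> 0.5
too_late = early_detection_bonus(150, 40)   # -> 0.0
```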
src/graders.py
ADDED
@@ -0,0 +1,138 @@
"""
Per-task deterministic graders for the PLL Cyberattack Detection OpenEnv.

Each grader takes an episode history and returns a score in [0.0, 1.0].
Graders are deterministic given the same episode data.
"""

from typing import List, Dict, Any, Optional


def grade_task_easy(history: List[Dict[str, Any]], attack_start_step: int) -> float:
    """
    Task 1 — Sinusoidal FDI Detection (Easy).

    Grader logic (relative to attack onset):
        delay = first_correct_detection_step - attack_start_step
        if delay <= 20:    score = 1.0
        elif delay <= 100: score = linear decay from 1.0 to 0.5
        elif delay <= 420: score = 0.2
        else (never detected): score = 0.0
    """
    first_correct_detection_step = None

    for entry in history:
        step = entry["step"]
        attack_active = entry["attack_active"]
        attack_detected = entry["attack_detected"]

        if attack_active and attack_detected:
            first_correct_detection_step = step
            break

    if first_correct_detection_step is None:
        return 0.0

    delay = first_correct_detection_step - attack_start_step

    if delay <= 20:
        return 1.0
    elif delay <= 100:
        # Linear decay from 1.0 at delay=20 to 0.5 at delay=100
        return 1.0 - 0.5 * (delay - 20) / 80.0
    elif delay <= 420:
        return 0.2
    else:
        return 0.0
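The easy grader's score is piecewise in the detection delay. A standalone sketch of the same curve, useful for sanity-checking the breakpoints at delays 20, 100, and 420:

```python
def easy_score(delay: float) -> float:
    """Piecewise score from grade_task_easy as a function of detection delay."""
    if delay <= 20:
        return 1.0
    elif delay <= 100:
        return 1.0 - 0.5 * (delay - 20) / 80.0  # linear: 1.0 at 20 -> 0.5 at 100
    elif delay <= 420:
        return 0.2
    return 0.0


scores = [easy_score(d) for d in (0, 20, 60, 100, 200, 500)]
# -> [1.0, 1.0, 0.75, 0.5, 0.2, 0.0]
```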
| 47 |
+
|
| 48 |
+
|
| 49 |
+
def grade_task_medium(history: List[Dict[str, Any]], attack_start_step: int) -> float:
|
| 50 |
+
"""
|
| 51 |
+
Task 2 — Multi-Attack Classification (Medium).
|
| 52 |
+
|
| 53 |
+
Grader logic:
|
| 54 |
+
base_score = fraction of steps (after attack_start) where attack_type is correctly classified
|
| 55 |
+
early_bonus = 0.4 * max(0, 1 - first_correct_classification_step / 100)
|
| 56 |
+
score = min(1.0, base_score * 0.6 + early_bonus)
|
| 57 |
+
"""
|
| 58 |
+
steps_after_attack = 0
|
| 59 |
+
correct_classifications = 0
|
| 60 |
+
first_correct_classification_step = None
|
| 61 |
+
|
| 62 |
+
for entry in history:
|
| 63 |
+
step = entry["step"]
|
| 64 |
+
if step < attack_start_step:
|
| 65 |
+
continue
|
| 66 |
+
|
| 67 |
+
steps_after_attack += 1
|
| 68 |
+
true_type = entry["true_attack_type"]
|
| 69 |
+
agent_type = entry["agent_attack_type"]
|
| 70 |
+
|
| 71 |
+
if agent_type == true_type:
|
| 72 |
+
correct_classifications += 1
|
| 73 |
+
if first_correct_classification_step is None:
|
| 74 |
+
first_correct_classification_step = step
|
| 75 |
+
|
| 76 |
+
if steps_after_attack == 0:
|
| 77 |
+
return 0.0
|
| 78 |
+
|
| 79 |
+
base_score = correct_classifications / steps_after_attack
|
| 80 |
+
|
| 81 |
+
if first_correct_classification_step is not None:
|
| 82 |
+
early_bonus = 0.4 * max(0.0, 1.0 - first_correct_classification_step / 100.0)
|
| 83 |
+
else:
|
| 84 |
+
early_bonus = 0.0
|
| 85 |
+
|
| 86 |
+
score = min(1.0, base_score * 0.6 + early_bonus)
|
| 87 |
+
return max(0.0, score)
|
| 88 |
+
|
| 89 |
+
|
| 90 |
+
def grade_task_hard(
|
| 91 |
+
history: List[Dict[str, Any]],
|
| 92 |
+
loss_of_lock_step: Optional[int],
|
| 93 |
+
attack_start_step: int,
|
| 94 |
+
) -> float:
|
| 95 |
+
"""
|
| 96 |
+
Task 3 — Stealthy Low-and-Slow Attack (Hard).
|
| 97 |
+
|
| 98 |
+
Grader logic:
|
| 99 |
+
if detected before loss_of_lock_step:
|
| 100 |
+
score = 1.0 * (1 - first_detection_step / loss_of_lock_step)
|
| 101 |
+
elif detected after loss_of_lock but before episode end:
|
| 102 |
+
score = 0.3
|
| 103 |
+
else (never detected):
|
| 104 |
+
score = 0.0
|
| 105 |
+
false_alarm_penalty = 0.2 per false alarm before attack starts
|
| 106 |
+
(capped at reducing score to 0.0 minimum)
|
| 107 |
+
"""
|
| 108 |
+
first_detection_step = None
|
| 109 |
+
false_alarm_count = 0
|
| 110 |
+
|
| 111 |
+
for entry in history:
|
| 112 |
+
step = entry["step"]
|
| 113 |
+
attack_active = entry["attack_active"]
|
| 114 |
+
attack_detected = entry["attack_detected"]
|
| 115 |
+
|
| 116 |
+
# Only count false alarms before the attack starts
|
| 117 |
+
if attack_detected and not attack_active and step < attack_start_step:
|
| 118 |
+
false_alarm_count += 1
|
| 119 |
+
|
| 120 |
+
if attack_detected and attack_active and first_detection_step is None:
|
| 121 |
+
first_detection_step = step
|
| 122 |
+
|
| 123 |
+
# Compute base score
|
| 124 |
+
if first_detection_step is None:
|
| 125 |
+
score = 0.0
|
| 126 |
+
elif loss_of_lock_step is not None and first_detection_step < loss_of_lock_step:
|
| 127 |
+
score = 1.0 * (1.0 - first_detection_step / loss_of_lock_step)
|
| 128 |
+
elif loss_of_lock_step is not None and first_detection_step >= loss_of_lock_step:
|
| 129 |
+
score = 0.3
|
| 130 |
+
else:
|
| 131 |
+
# No loss of lock occurred but attack was detected
|
| 132 |
+
score = 0.3
|
| 133 |
+
|
| 134 |
+
# Apply false alarm penalty
|
| 135 |
+
penalty = 0.2 * false_alarm_count
|
| 136 |
+
score = max(0.0, score - penalty)
|
| 137 |
+
|
| 138 |
+
return min(1.0, score)
|
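As a quick sanity check, the Task 1 tiers can be exercised against a synthetic episode history. The sketch below re-implements the same piecewise formula inline (it does not import src.graders); the entry keys match what the graders read, and the step values are illustrative.

```python
# A minimal sketch exercising the Task 1 scoring tiers against a
# synthetic history. score_from_delay mirrors grade_task_easy's tiers.

def score_from_delay(delay):
    if delay <= 20:
        return 1.0
    elif delay <= 100:
        return 1.0 - 0.5 * (delay - 20) / 80.0  # linear decay 1.0 -> 0.5
    elif delay <= 420:
        return 0.2
    return 0.0

# Attack starts at step 50; the agent first flags it at step 90.
attack_start = 50
history = [
    {"step": s, "attack_active": s >= attack_start, "attack_detected": s >= 90}
    for s in range(200)
]

# First step where the attack is active AND flagged (same scan the grader does)
first = next(e["step"] for e in history
             if e["attack_active"] and e["attack_detected"])
print(score_from_delay(first - attack_start))  # delay 40 -> 0.875
```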
src/models.py ADDED
@@ -0,0 +1,74 @@
"""
Pydantic models for the PLL Cyberattack Detection OpenEnv.
Defines Observation, Action, Reward, and State schemas.
"""
import numpy as np
from typing import Annotated, Any, Dict, List
from pydantic import BaseModel, Field, model_validator

# Exactly 20 floats — enforced at validation time, not just documented.
WindowList = Annotated[List[float], Field(min_length=20, max_length=20)]

# Exactly 3 floats for [va, vb, vc].
VoltageList = Annotated[List[float], Field(min_length=3, max_length=3)]


class Observation(BaseModel):
    vq_window: WindowList
    vd_window: WindowList
    omega_window: WindowList
    omega_deviation_window: WindowList
    raw_voltages: VoltageList
    task_id: int = Field(ge=0, le=2)
    step: int = Field(ge=0)


class Action(BaseModel):
    attack_detected: bool
    attack_type: int = Field(ge=0, le=4)
    confidence: float = Field(ge=0.0, le=1.0)
    protective_action: int = Field(ge=0, le=3)


class Reward(BaseModel):
    total: float
    detection_reward: float
    classification_bonus: float
    early_detection_bonus: float
    false_alarm_penalty: float
    lock_loss_penalty: float


class State(BaseModel):
    theta_true: float
    theta_hat: float
    omega_hat: float
    vq_integral: float
    attack_active: bool
    attack_type: int  # Integer ID of the current attack: 0=none, 1=sinusoidal, 2=ramp, 3=pulse, 4=stealthy.
    attack_params: Dict[str, Any]
    attack_start_step: int
    lock_lost: bool  # Whether the PLL has lost lock (|theta_err| > 5°). Task 2 only.
    step: int = Field(ge=0)
    episode_id: str
    task_id: int = Field(ge=0, le=2)

    @model_validator(mode="before")
    @classmethod
    def coerce_attack_params(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        """
        Coerce numpy scalar types inside attack_params to native Python types.
        sample_*_params() casts with float()/int(), but a future contributor
        may forget. This validator ensures JSON serialization never fails due
        to np.float32 / np.int64 / np.bool_ leaking into the params dict.
        """
        params = values.get("attack_params", {})
        if isinstance(params, dict):
            coerced = {}
            for k, v in params.items():
                if isinstance(v, np.floating):
                    coerced[k] = float(v)
                elif isinstance(v, np.integer):
                    coerced[k] = int(v)
                elif isinstance(v, np.bool_):
                    coerced[k] = bool(v)
                else:
                    coerced[k] = v
            values["attack_params"] = coerced
        return values
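The coercion that the State validator performs can be sketched standalone, without pydantic, to show why it matters: np.float32 and np.int64 scalars are not JSON-serializable by the stdlib, while their coerced native counterparts are. The function and parameter names below are illustrative, not part of the repo's API.

```python
import json
import numpy as np

# Standalone sketch of the narrowing done by State.coerce_attack_params:
# numpy scalars become native Python types so json.dumps never fails.

def coerce_params(params: dict) -> dict:
    coerced = {}
    for k, v in params.items():
        if isinstance(v, np.floating):
            coerced[k] = float(v)
        elif isinstance(v, np.integer):
            coerced[k] = int(v)
        elif isinstance(v, np.bool_):
            coerced[k] = bool(v)
        else:
            coerced[k] = v
    return coerced

# Hypothetical params dict with numpy scalars, as a simulator might produce.
raw = {"amplitude": np.float32(0.25), "period": np.int64(40), "on": np.bool_(True)}
print(json.dumps(coerce_params(raw)))  # serializes cleanly
```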
src/pll_sim.py ADDED
@@ -0,0 +1,119 @@
"""
SRF-PLL Discrete-Time Simulation.

Implements the Synchronous Reference Frame Phase-Locked Loop used in
grid-connected inverters. Discrete time step Δt = 1 ms.

Steps:
    1. Generate true 3-phase grid voltages (50 Hz, 1.0 pu)
    2. Apply attack injection on va
    3. Clarke transform (αβ)
    4. Park transform (dq) using estimated angle θ̂
    5. PI controller to update ω̂ and θ̂
    6. Compute phase error
"""

import math


# Constants
V_NOM = 1.0                  # Nominal voltage (pu)
F0 = 50.0                    # Grid frequency (Hz)
OMEGA0 = 2.0 * math.pi * F0  # Nominal angular freq (rad/s)
DT = 1e-3                    # Time step (1 ms)
KP = 50.0                    # PI proportional gain
KI = 1500.0                  # PI integral gain


def wrap_angle(angle: float) -> float:
    """Wrap angle to [-π, π]."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


class SRFPLLSimulator:
    """Discrete-time SRF-PLL simulator."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Reset PLL state to initial conditions."""
        self.t = 0.0             # Simulation time (s)
        self.theta_true = 0.0    # True grid angle (rad)
        self.theta_hat = 0.0     # Estimated angle (rad)
        self.omega_hat = OMEGA0  # Estimated angular freq (rad/s)
        self.vq_integral = 0.0   # Integral of vq for PI controller

        # Current signal values
        self.vd = 0.0
        self.vq = 0.0
        self.va_m = 0.0
        self.vb_m = 0.0
        self.vc_m = 0.0
        self.theta_err = 0.0

    def step(self, attack_signal: float = 0.0):
        """
        Advance the PLL by one time step.

        Args:
            attack_signal: Attack injection added to va (pu).

        Returns:
            dict with vd, vq, omega_hat, theta_err, va_m, vb_m, vc_m, theta_true, theta_hat
        """
        # Step 1 — True three-phase grid voltages. Cosine-referenced so the
        # cos-based Park transform below yields vd ≈ 1 pu and vq ≈ 0 at lock
        # (sin-referenced phases would lock the loop 90° off).
        va = V_NOM * math.cos(self.theta_true)
        vb = V_NOM * math.cos(self.theta_true - 2.0 * math.pi / 3.0)
        vc = V_NOM * math.cos(self.theta_true + 2.0 * math.pi / 3.0)

        # Step 2 — Apply attack injection on va
        va_m = va + attack_signal
        vb_m = vb
        vc_m = vc

        # Step 3 — Clarke transform (αβ), two-measurement form:
        # exact when va + vb + vc == 0, so the attack term leaks into v_beta
        v_alpha = va_m
        v_beta = (va_m + 2.0 * vb_m) / math.sqrt(3.0)

        # Step 4 — Park transform (dq) using estimated angle θ̂
        cos_th = math.cos(self.theta_hat)
        sin_th = math.sin(self.theta_hat)
        vd = v_alpha * cos_th + v_beta * sin_th
        vq = -v_alpha * sin_th + v_beta * cos_th

        # Step 5 — PI controller; vq is the phase-error signal
        self.vq_integral += vq * DT
        omega_hat = OMEGA0 + KP * vq + KI * self.vq_integral
        self.theta_hat += omega_hat * DT

        # Advance true angle
        self.theta_true += OMEGA0 * DT

        # Step 6 — Phase error wrapped to [-π, π]
        theta_err = wrap_angle(self.theta_hat - self.theta_true)

        # Update time
        self.t += DT

        # Store current values
        self.vd = vd
        self.vq = vq
        self.omega_hat = omega_hat
        self.va_m = va_m
        self.vb_m = vb_m
        self.vc_m = vc_m
        self.theta_err = theta_err

        return {
            "vd": vd,
            "vq": vq,
            "omega_hat": omega_hat,
            "theta_err": theta_err,
            "va_m": va_m,
            "vb_m": vb_m,
            "vc_m": vc_m,
            "theta_true": self.theta_true,
            "theta_hat": self.theta_hat,
        }
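The Clarke/Park pipeline can be checked in isolation: assuming a balanced, cosine-referenced three-phase set and a perfect angle estimate, the dq projection gives vd ≈ 1 pu and vq ≈ 0 (vq being the PI error signal), and a lagging estimate gives a positive vq that pushes ω̂ up. The helper below inlines the same two transforms; the names are illustrative.

```python
import math

# Inline check of the Clarke/Park math: at lock (theta_hat == theta_true)
# the dq frame shows vd ~= 1.0 pu and vq ~= 0.

def abc_to_dq(va, vb, theta_hat):
    # Two-measurement Clarke (exact for a balanced set, va + vb + vc == 0)
    v_alpha = va
    v_beta = (va + 2.0 * vb) / math.sqrt(3.0)
    # Park with the estimated angle
    vd = v_alpha * math.cos(theta_hat) + v_beta * math.sin(theta_hat)
    vq = -v_alpha * math.sin(theta_hat) + v_beta * math.cos(theta_hat)
    return vd, vq

theta = 0.7  # arbitrary true grid angle (rad)
va = math.cos(theta)
vb = math.cos(theta - 2.0 * math.pi / 3.0)

vd, vq = abc_to_dq(va, vb, theta)  # perfect estimate
print(vd, vq)  # close to 1.0 and 0.0

# A lagging estimate produces vq > 0, which drives omega_hat upward.
_, vq_lag = abc_to_dq(va, vb, theta - 0.1)
print(vq_lag > 0)
```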
uv.lock ADDED
The diff for this file is too large to render. See raw diff.