Commit 81e328b by krishuggingface (0 parents)

Initial commit: PLL Cyberattack Detection OpenEnv

Files changed (16)
  1. .gitignore +24 -0
  2. Dockerfile +10 -0
  3. README.md +105 -0
  4. inference.py +470 -0
  5. openenv.yaml +44 -0
  6. pyproject.toml +19 -0
  7. requirements.txt +6 -0
  8. server/app.py +19 -0
  9. src/__init__.py +1 -0
  10. src/api.py +71 -0
  11. src/attacks.py +136 -0
  12. src/env.py +380 -0
  13. src/graders.py +138 -0
  14. src/models.py +74 -0
  15. src/pll_sim.py +119 -0
  16. uv.lock +0 -0
.gitignore ADDED
@@ -0,0 +1,24 @@
+ # Python
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.egg-info/
+ dist/
+ build/
+
+ # Environment
+ .env
+ .venv/
+ venv/
+
+ # Testing
+ .pytest_cache/
+
+ # Data
+ sample_data/
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
Dockerfile ADDED
@@ -0,0 +1,10 @@
+ FROM python:3.10-slim
+
+ WORKDIR /app
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ EXPOSE 7860
+ CMD ["uvicorn", "src.api:app", "--host", "0.0.0.0", "--port", "7860"]
README.md ADDED
@@ -0,0 +1,105 @@
+ # PLL Cyberattack Detection — OpenEnv
+
+ > AI-driven cyberattack detection on SRF Phase-Locked Loops (PLLs) in grid-connected inverters.
+
+ ## Overview
+
+ Phase-Locked Loops (PLLs) are critical components in grid-connected power converters that synchronize the inverter's output with the utility grid. A Synchronous Reference Frame PLL (SRF-PLL) estimates grid frequency and phase angle — making it a high-value target for **False Data Injection (FDI)** cyberattacks.
+
+ This OpenEnv environment simulates an SRF-PLL under various cyberattack scenarios and challenges AI agents to detect, classify, and respond to attacks in real time using only time-windowed sensor observations.
+
+ ## Tasks
+
+ | Task | Difficulty | Description |
+ |------|-----------|-------------|
+ | **Task 0** | Easy | Detect whether a sinusoidal FDI attack is present (binary detection) |
+ | **Task 1** | Medium | Detect and classify the attack type — sinusoidal, ramp, or pulse |
+ | **Task 2** | Hard | Detect stealthy, low-amplitude attacks before the PLL loses lock |
+
+ ## Observation Space
+
+ Each step provides a JSON observation:
+
+ | Field | Shape | Description |
+ |-------|-------|-------------|
+ | `vq_window` | `[20]` | q-axis voltage error (last 20 steps) |
+ | `vd_window` | `[20]` | d-axis voltage (last 20 steps) |
+ | `omega_window` | `[20]` | Estimated frequency, normalized (last 20 steps) |
+ | `omega_deviation_window` | `[20]` | Frequency deviation from nominal in rad/s |
+ | `raw_voltages` | `[3]` | Three-phase voltages `[va, vb, vc]` at current step |
+ | `step` | `int` | Current simulation step |
+ | `task_id` | `int` | Task identifier (0, 1, or 2) |
+
+ ## Action Space
+
+ Agents return a JSON action each step:
+
+ ```json
+ {
+   "attack_detected": true,
+   "attack_type": 1,
+   "confidence": 0.85,
+   "protective_action": 1
+ }
+ ```
+
+ | Field | Type | Range | Description |
+ |-------|------|-------|-------------|
+ | `attack_detected` | `bool` | — | Whether an attack is detected |
+ | `attack_type` | `int` | 0–4 | 0=none, 1=sinusoidal, 2=ramp, 3=pulse, 4=stealthy |
+ | `confidence` | `float` | 0.0–1.0 | Agent's confidence in its classification |
+ | `protective_action` | `int` | 0–3 | 0=none, 1=alert, 2=reduce power, 3=disconnect |
+
+ ## API Endpoints
+
+ | Endpoint | Method | Description |
+ |----------|--------|-------------|
+ | `/reset` | POST | Start a new episode. Body: `{"task_id": 0}` |
+ | `/step` | POST | Submit an action and receive the next observation |
+ | `/state` | GET | Get the current environment state |
+ | `/health` | GET | Health check endpoint |
+
+ ## Running Locally
+
+ ### With Docker
+
+ ```bash
+ docker build -t pll-cyberattack-env .
+ docker run -p 7860:7860 pll-cyberattack-env
+ ```
+
+ ### Without Docker
+
+ ```bash
+ pip install -r requirements.txt
+ uvicorn src.api:app --host 0.0.0.0 --port 7860
+ ```
+
+ ### Running the Agent
+
+ ```bash
+ export API_BASE_URL="https://router.huggingface.co/v1"
+ export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
+ export HF_TOKEN="your-hf-token"
+ python inference.py
+ ```
+
+ Set `USE_LLM=1` to use the LLM agent instead of the default rule-based heuristic.
+
+ ## Environment Variables
+
+ | Variable | Required | Default | Description |
+ |----------|----------|---------|-------------|
+ | `API_BASE_URL` | No | `https://router.huggingface.co/v1` | LLM API endpoint |
+ | `MODEL_NAME` | No | `Qwen/Qwen2.5-72B-Instruct` | Model identifier |
+ | `HF_TOKEN` | Yes | — | HuggingFace API token |
+ | `ENV_URL` | No | HF Space URL | Environment server URL |
+ | `USE_LLM` | No | `0` | Set to `1` to use LLM agent |
+
+ ## Live Demo
+
+ 🚀 **HuggingFace Space**: [https://huggingface.co/spaces/krishuggingface/CyberAttack-PLL](https://huggingface.co/spaces/krishuggingface/CyberAttack-PLL)
+
+ ## License
+
+ MIT
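The action schema above maps directly onto a small validation helper. A minimal sketch (the `clamp_action` name is hypothetical, not part of this repo) that coerces an arbitrary agent output into the documented ranges:

```python
# Hypothetical helper: clamp an agent action to the ranges documented
# in the Action Space table above.

def clamp_action(raw: dict) -> dict:
    """Coerce a possibly out-of-range action dict into the documented schema."""
    return {
        "attack_detected": bool(raw.get("attack_detected", False)),
        # attack_type: 0=none, 1=sinusoidal, 2=ramp, 3=pulse, 4=stealthy
        "attack_type": max(0, min(4, int(raw.get("attack_type", 0)))),
        # confidence is a float clamped to [0.0, 1.0]
        "confidence": max(0.0, min(1.0, float(raw.get("confidence", 0.5)))),
        # protective_action: 0=none, 1=alert, 2=reduce power, 3=disconnect
        "protective_action": max(0, min(3, int(raw.get("protective_action", 0)))),
    }

# Out-of-range fields are pulled back into the documented bounds.
action = clamp_action({"attack_detected": True, "attack_type": 7, "confidence": 1.4})
```

Sending only clamped actions keeps `/step` requests valid even when an LLM-produced action drifts outside the schema.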
inference.py ADDED
@@ -0,0 +1,470 @@
+ """
+ Inference Script — PLL Cyberattack Detection OpenEnv
+ =====================================================
+ MANDATORY environment variables:
+     API_BASE_URL    The API endpoint for the LLM
+     MODEL_NAME      The model identifier to use
+     HF_TOKEN        Your Hugging Face / API key
+
+ Uses a HYBRID approach:
+ - A fast rule-based heuristic agent runs by default (no LLM needed)
+ - The heuristic analyzes vq/omega_deviation windows to detect attacks
+ - Set USE_LLM=1 env var to use the LLM instead (slower, may fail)
+
+ Must be named inference.py and placed at the project root.
+ Uses OpenAI client for LLM calls when enabled.
+ """
+
+ import os
+ import json
+ from typing import List, Optional
+ import time
+ import math
+ import requests
+ from openai import OpenAI
+
+ API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
+ MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")
+ HF_TOKEN = os.getenv("HF_TOKEN")
+ ENV_URL = os.getenv("ENV_URL", "https://krishuggingface-cyberattack-pll.hf.space")
+ USE_LLM = os.environ.get("USE_LLM", "0") == "1"
+
+ client = OpenAI(base_url=API_BASE_URL, api_key=HF_TOKEN)
+
+ SYSTEM_PROMPT = """You are an AI agent monitoring a power grid inverter's Phase-Locked Loop (PLL).
+ You receive time-windowed sensor readings each step and must detect cyberattacks.
+
+ vq_window: q-axis voltage error (should be ~0 when healthy)
+ vd_window: d-axis voltage
+ omega_window: estimated frequency (normalized, nominal=0)
+ omega_deviation_window: frequency deviation from nominal in rad/s (useful for detecting slow phase drift)
+ raw_voltages: [va, vb, vc] at current step
+ task_id: 0=detect only, 1=classify type, 2=detect stealthy attack
+
+ For task_id=0: Focus on detecting any attack (attack_detected=True/False).
+ For task_id=1: Also classify the attack type (1=sinusoidal, 2=ramp, 3=pulse).
+ For task_id=2: Detect very subtle attacks before the PLL loses lock. Look for slow drifts in omega_deviation and vq.
+
+ Analysis tips:
+ - In healthy state, vq values should be near 0 and stable.
+ - Sinusoidal attacks cause oscillating patterns in vq.
+ - Ramp attacks cause steadily increasing vq magnitude.
+ - Pulse attacks cause sudden step changes in vq.
+ - Stealthy attacks cause very slow, gradual drift in omega_deviation_window.
+ - Look at trends across the full window, not just the latest value.
+
+ Respond ONLY with valid JSON, no explanation:
+ {
+   "attack_detected": <bool>,
+   "attack_type": <int 0-4>,
+   "confidence": <float 0.0-1.0>,
+   "protective_action": <int 0-3>
+ }"""
+
+ TASK_NAMES = {
+     0: "Sinusoidal FDI Detection (Easy)",
+     1: "Multi-Attack Classification (Medium)",
+     2: "Stealthy Attack Detection (Hard)",
+ }
+
+ DEFAULT_ACTION = {
+     "attack_detected": False,
+     "attack_type": 0,
+     "confidence": 0.5,
+     "protective_action": 0,
+ }
+
+
+ # =====================================================================
+ # Logging Helpers (OpenEnv compliance)
+ # =====================================================================
+
+ def log_start(task: str, env: str, model: str) -> None:
+     print(f"[START] task={task} env={env} model={model}", flush=True)
+
+
+ def log_step(step: int, action: dict, reward: float, done: bool, error) -> None:
+     action_str = json.dumps(action, separators=(',', ':'))
+     error_val = error if error else "null"
+     print(f"[STEP] step={step} action={action_str} reward={reward:.2f} done={str(done).lower()} error={error_val}", flush=True)
+
+
+ def log_end(success: bool, steps: int, score: float, rewards: list) -> None:
+     rewards_str = ",".join(f"{r:.2f}" for r in rewards)
+     print(f"[END] success={str(success).lower()} steps={steps} score={score:.3f} rewards={rewards_str}", flush=True)
+
+
+ # =====================================================================
+ # Rule-Based Heuristic Agent
+ # =====================================================================
+
+ class HeuristicState:
+     """Tracks running state for the heuristic agent across steps."""
+     def __init__(self):
+         self.reset()
+
+     def reset(self):
+         self.vq_history = []          # all vq_mean(abs) values
+         self.omega_dev_history = []   # all omega_dev_mean(abs) values
+         self.attack_detected = False  # latched detection flag
+         self.predicted_type = 0       # latched classification
+         self.settled_baseline = None  # omega_dev baseline when PLL settles
+         self.peak_vq = 0.0            # highest vq_mean seen
+
+
+ _hstate = HeuristicState()
+
+
+ def heuristic_agent(obs: dict) -> dict:
+     """
+     Rule-based attack detector using cumulative state tracking.
+     No LLM needed — runs instantly.
+
+     The key insight is that the PLL's closed-loop response transforms
+     attack signals, so we track statistics over time rather than
+     trying to classify from a single 20-step vq window shape.
+     """
+     global _hstate
+     vq = obs["vq_window"]
+     omega_dev = obs["omega_deviation_window"]
+     task_id = obs["task_id"]
+     step = obs["step"]
+
+     if step == 0:
+         _hstate.reset()
+
+     # --- Compute per-step features ---
+     vq_abs = [abs(v) for v in vq]
+     vq_mean = sum(vq_abs) / len(vq_abs)
+     vq_max = max(vq_abs)
+     vq_latest = abs(vq[-1])
+
+     omega_dev_abs = [abs(v) for v in omega_dev]
+     omega_dev_mean = sum(omega_dev_abs) / len(omega_dev_abs)
+
+     # Track history
+     _hstate.vq_history.append(vq_mean)
+     _hstate.omega_dev_history.append(omega_dev_mean)
+     _hstate.peak_vq = max(_hstate.peak_vq, vq_mean)
+
+     # Record baseline around step 45-50 (PLL settled)
+     if step == 50:
+         _hstate.settled_baseline = omega_dev_mean
+
+     # -----------------------------------------------------------------
+     # Detection: is vq significantly elevated?
+     # After PLL warm-start settles (~step 20-30), healthy vq < 0.005
+     # -----------------------------------------------------------------
+     if step < 25:
+         # PLL still settling, don't detect
+         detected = False
+     else:
+         detected = vq_mean > 0.01 or vq_max > 0.025
+
+     # Latch detection on
+     if detected:
+         _hstate.attack_detected = True
+
+     # -----------------------------------------------------------------
+     # Task 0: Binary detection only
+     # -----------------------------------------------------------------
+     if task_id == 0:
+         return {
+             "attack_detected": _hstate.attack_detected,
+             "attack_type": 1 if _hstate.attack_detected else 0,
+             "confidence": min(1.0, vq_mean * 50) if _hstate.attack_detected else 0.8,
+             "protective_action": 1 if _hstate.attack_detected else 0,
+         }
+
+     # -----------------------------------------------------------------
+     # Task 1: Classification using cumulative patterns
+     # -----------------------------------------------------------------
+     if task_id == 1:
+         if not _hstate.attack_detected:
+             return {
+                 "attack_detected": False,
+                 "attack_type": 0,
+                 "confidence": 0.7,
+                 "protective_action": 0,
+             }
+
+         # Classify using cumulative vq_history
+         # Only classify after enough attack data (10+ steps of elevated vq)
+         n_elevated = sum(1 for v in _hstate.vq_history if v > 0.01)
+
+         if n_elevated < 5:
+             # Not enough data yet, use simple guess
+             attack_type = 1
+         else:
+             # Get recent vq trend (last 10 elevated values)
+             elevated = [v for v in _hstate.vq_history if v > 0.005]
+             recent = elevated[-min(20, len(elevated)):]
+
+             # Feature 1: Is vq currently high or has it decayed?
+             current_vs_peak = vq_mean / _hstate.peak_vq if _hstate.peak_vq > 0 else 0
+
+             # Feature 2: How many zero crossings in current window
+             zero_crossings = sum(1 for i in range(1, len(vq)) if vq[i] * vq[i-1] < 0)
+
+             # Feature 3: Is vq growing or shrinking over recent history
+             if len(recent) >= 6:
+                 first_third = sum(recent[:len(recent)//3]) / (len(recent)//3)
+                 last_third = sum(recent[-len(recent)//3:]) / (len(recent)//3)
+                 growth = last_third / first_third if first_third > 0.001 else 1.0
+             else:
+                 growth = 1.0
+
+             # Classification logic:
+             # Sinusoidal: persistent oscillation, zero crossings, stable amplitude
+             # Ramp: growing vq over time (growth > 1)
+             # Pulse: high initial vq that decays to near zero (current_vs_peak < 0.3)
+
+             if current_vs_peak < 0.15 and _hstate.peak_vq > 0.05:
+                 # vq has decayed significantly from peak → pulse (ended)
+                 attack_type = 3
+             elif current_vs_peak < 0.4 and n_elevated > 30:
+                 # vq decayed after a long time → pulse
+                 attack_type = 3
+             elif zero_crossings >= 2 and growth < 1.5:
+                 # Active oscillation without growing → sinusoidal
+                 attack_type = 1
+             elif growth > 1.3:
+                 # Growing signal → ramp
+                 attack_type = 2
+             elif zero_crossings >= 1:
+                 # Some oscillation → sinusoidal
+                 attack_type = 1
+             else:
+                 # Default: if mono-decrease, pulse; else sinusoidal
+                 vq_diffs = [vq[i] - vq[i-1] for i in range(1, len(vq))]
+                 neg = sum(1 for d in vq_diffs if d < 0)
+                 if neg > 14:  # more than 14 of 19 diffs (~79%) decreasing
+                     attack_type = 3
+                 else:
+                     attack_type = 1
+
+         _hstate.predicted_type = attack_type
+
+         return {
+             "attack_detected": True,
+             "attack_type": _hstate.predicted_type,
+             "confidence": 0.8,
+             "protective_action": 1,
+         }
+
+     # -----------------------------------------------------------------
+     # Task 2: Stealthy attack — detect omega_dev rising above baseline
+     # -----------------------------------------------------------------
+     if task_id == 2:
+         drift_detected = False
+         confidence = 0.3
+
+         if step > 50 and _hstate.settled_baseline is not None:
+             baseline = _hstate.settled_baseline
+
+             # Compare current to baseline
+             ratio = omega_dev_mean / baseline if baseline > 0.01 else omega_dev_mean * 100
+
+             # Check if omega_dev is rising relative to recent history
+             if len(_hstate.omega_dev_history) > 10:
+                 recent_10 = _hstate.omega_dev_history[-10:]
+                 old_10 = _hstate.omega_dev_history[-20:-10] if len(_hstate.omega_dev_history) > 20 else _hstate.omega_dev_history[:10]
+                 recent_avg = sum(recent_10) / len(recent_10)
+                 old_avg = sum(old_10) / len(old_10)
+                 rising = recent_avg > old_avg * 1.1
+             else:
+                 rising = False
+
+             if ratio > 2.0:
+                 drift_detected = True
+                 confidence = 0.9
+             elif ratio > 1.3 and rising:
+                 drift_detected = True
+                 confidence = 0.8
+             elif rising and vq_mean > 0.1:
+                 drift_detected = True
+                 confidence = 0.6
+             elif vq_mean > 0.2:
+                 drift_detected = True
+                 confidence = 0.5
+
+         if drift_detected:
+             _hstate.attack_detected = True
+
+         return {
+             "attack_detected": drift_detected,
+             "attack_type": 4 if drift_detected else 0,
+             "confidence": confidence,
+             "protective_action": 2 if drift_detected else 0,
+         }
+
+     return DEFAULT_ACTION.copy()
+
+
+ # =====================================================================
+ # LLM Agent (optional, set USE_LLM=1)
+ # =====================================================================
+
+ def parse_llm_response(response_text: str) -> dict:
+     """Parse LLM response JSON, returning default action on failure."""
+     try:
+         text = response_text.strip()
+         if text.startswith("```"):
+             lines = text.split("\n")
+             json_lines = []
+             in_block = False
+             for line in lines:
+                 if line.strip().startswith("```") and not in_block:
+                     in_block = True
+                     continue
+                 elif line.strip().startswith("```") and in_block:
+                     break
+                 elif in_block:
+                     json_lines.append(line)
+             text = "\n".join(json_lines)
+
+         parsed = json.loads(text)
+         action = {
+             "attack_detected": bool(parsed.get("attack_detected", False)),
+             "attack_type": max(0, min(4, int(parsed.get("attack_type", 0)))),
+             "confidence": max(0.0, min(1.0, float(parsed.get("confidence", 0.5)))),
+             "protective_action": max(0, min(3, int(parsed.get("protective_action", 0)))),
+         }
+         return action
+     except (json.JSONDecodeError, KeyError, TypeError, ValueError):
+         return DEFAULT_ACTION.copy()
+
+
+ def format_observation(obs: dict) -> str:
+     """Format observation dict into a concise string for the LLM."""
+     parts = [
+         f"Step: {obs['step']}",
+         f"Task: {obs['task_id']}",
+         f"vq_window (last 20): {[round(v, 6) for v in obs['vq_window']]}",
+         f"vd_window (last 20): {[round(v, 6) for v in obs['vd_window']]}",
+         f"omega_window (last 20): {[round(v, 6) for v in obs['omega_window']]}",
+         f"omega_deviation_window (last 20): {[round(v, 6) for v in obs['omega_deviation_window']]}",
+         f"raw_voltages: {[round(v, 6) for v in obs['raw_voltages']]}",
+     ]
+     return "\n".join(parts)
+
+
+ def llm_agent(obs: dict) -> dict:
+     """Call the LLM to decide an action. Falls back to heuristic on error."""
+     try:
+         obs_text = format_observation(obs)
+         completion = client.chat.completions.create(
+             model=MODEL_NAME,
+             messages=[
+                 {"role": "system", "content": SYSTEM_PROMPT},
+                 {"role": "user", "content": obs_text},
+             ],
+             temperature=0.1,
+             max_tokens=200,
+         )
+         llm_response = completion.choices[0].message.content
+         return parse_llm_response(llm_response)
+     except Exception as e:
+         print(f"  LLM error ({type(e).__name__}: {e}), falling back to heuristic")
+         return heuristic_agent(obs)
+
+
+ # =====================================================================
+ # Episode Runner
+ # =====================================================================
+
+ def run_episode(task_id: int) -> float:
+     log_start(task=TASK_NAMES[task_id], env="pll-cyberattack-detection", model=MODEL_NAME if USE_LLM else "rule-based-heuristic")
+
+     print(f"\n{'='*60}")
+     print(f"Task {task_id}: {TASK_NAMES[task_id]}")
+     print(f"Agent: {'LLM (' + MODEL_NAME + ')' if USE_LLM else 'Rule-Based Heuristic'}")
+     print(f"{'='*60}")
+
+     step_count = 0
+     grader_score = 0.0
+     rewards = []
+
+     try:
+         # Reset environment
+         reset_response = requests.post(
+             f"{ENV_URL}/reset",
+             json={"task_id": task_id},
+             timeout=30,
+         )
+         reset_response.raise_for_status()
+         obs = reset_response.json()
+
+         done = False
+         total_reward = 0.0
+
+         while not done:
+             # Choose agent
+             if USE_LLM:
+                 action = llm_agent(obs)
+             else:
+                 action = heuristic_agent(obs)
+
+             # Step environment
+             step_response = requests.post(
+                 f"{ENV_URL}/step",
+                 json=action,
+                 timeout=30,
+             )
+             step_response.raise_for_status()
+             result = step_response.json()
+
+             obs = result["observation"]
+             reward = result["reward"]
+             done = result["done"]
+             info = result["info"]
+             total_reward += reward["total"]
+             rewards.append(reward["total"])
+             log_step(step=step_count, action=action, reward=reward["total"], done=done, error=None)
+
+             step_count += 1
+
+             # Print progress every 50 steps
+             if step_count % 50 == 0:
+                 print(f"  Step {step_count:3d} | Reward: {reward['total']:+.4f} | "
+                       f"Cumulative: {total_reward:+.4f} | "
+                       f"Detected: {action['attack_detected']} | "
+                       f"Type: {action['attack_type']}")
+
+         # Extract grader score
+         grader_score = info.get("grader_score", 0.0)
+         print(f"\n  Episode complete: {step_count} steps")
+         print(f"  Total reward: {total_reward:+.4f}")
+         print(f"  Grader score: {grader_score:.4f}")
+     finally:
+         log_end(success=grader_score > 0.0, steps=step_count, score=grader_score, rewards=rewards)
+
+     return grader_score
+
+
+ if __name__ == "__main__":
+     agent_name = f"LLM ({MODEL_NAME})" if USE_LLM else "Rule-Based Heuristic"
+     print("PLL Cyberattack Detection — Agentic Inference")
+     print(f"Agent: {agent_name}")
+     print(f"Environment: {ENV_URL}")
+     if not USE_LLM:
+         print("(Set USE_LLM=1 to use LLM agent instead of heuristic)")
+
+     start_time = time.time()
+     scores = []
+
+     for task_id in range(3):
+         score = run_episode(task_id)
+         print(f"Task {task_id} score: {score:.4f}")
+         scores.append(score)
+
+     elapsed = time.time() - start_time
+
+     print(f"\n{'='*60}")
+     print("FINAL RESULTS")
+     print(f"{'='*60}")
+     for i, score in enumerate(scores):
+         print(f"  Task {i} ({TASK_NAMES[i]}): {score:.4f}")
+     print(f"\n  Average score: {sum(scores)/len(scores):.4f}")
+     print(f"  Total time: {elapsed:.1f}s ({elapsed/60:.1f} min)")
+     print(f"{'='*60}")
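The heuristic's Task 1 classification leans on two window features: zero crossings for sinusoidal attacks and first-third versus last-third growth for ramps. A self-contained sketch on synthetic windows (the data and helper names here are illustrative, not environment output):

```python
import math

def zero_crossings(window):
    """Count sign changes between consecutive samples."""
    return sum(1 for i in range(1, len(window)) if window[i] * window[i - 1] < 0)

def growth_ratio(window):
    """Mean of the last third divided by mean of the first third."""
    third = len(window) // 3
    first = sum(window[:third]) / third
    last = sum(window[-third:]) / third
    return last / first if first > 1e-3 else 1.0

# Synthetic 20-sample vq windows mimicking two attack shapes
sine_vq = [0.1 * math.sin(0.8 * i) for i in range(20)]  # oscillating window
ramp_vq = [0.005 * i for i in range(1, 21)]             # steadily growing window

# The sinusoidal window oscillates through zero; the ramp never does
# but its tail is several times larger than its head.
sine_features = (zero_crossings(sine_vq), growth_ratio(sine_vq))
ramp_features = (zero_crossings(ramp_vq), growth_ratio(ramp_vq))
```

With these features, the thresholds used in `heuristic_agent` (`zero_crossings >= 2`, `growth > 1.3`) separate the two shapes cleanly.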
openenv.yaml ADDED
@@ -0,0 +1,44 @@
+ name: pll-cyberattack-detection
+ version: 1.0.0
+ description: >
+   OpenEnv environment for AI-driven cyberattack detection on SRF-based
+   Phase-Locked Loops in grid-connected inverters. An agent monitors PLL
+   sensor streams and detects False Data Injection attacks before they
+   cause loss of grid synchronization. Real-world power systems cybersecurity.
+ author: Kris Keshav
+ tags:
+   - power-systems
+   - cybersecurity
+   - control-systems
+   - openenv
+   - false-data-injection
+ tasks:
+   - id: sinusoidal_fdi_detection
+     difficulty: easy
+     description: >
+       Detect presence of a sinusoidal FDI attack injected on the
+       grid voltage sensor. Binary detection task.
+     max_steps: 500
+   - id: multi_attack_classification
+     difficulty: medium
+     description: >
+       Classify the type of ongoing attack (sinusoidal, ramp, or pulse)
+       from the PLL observation window.
+     max_steps: 500
+   - id: stealthy_attack_detection
+     difficulty: hard
+     description: >
+       Detect a low-amplitude stealthy attack causing slow phase drift
+       before PLL loss-of-lock occurs.
+     max_steps: 500
+ action_space:
+   type: structured
+   fields:
+     attack_detected: bool
+     attack_type: int
+     confidence: float
+     protective_action: int
+ observation_space:
+   type: continuous
+   dim: 103
+ episode_length: 500
pyproject.toml ADDED
@@ -0,0 +1,19 @@
+ [build-system]
+ requires = ["setuptools>=61.0"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "pll-cyberattack-detection"
+ version = "1.0.0"
+ description = "OpenEnv for cyberattack detection on SRF-PLLs in grid-connected inverters"
+ requires-python = ">=3.10"
+ dependencies = [
+     "fastapi",
+     "uvicorn",
+     "pydantic",
+     "numpy",
+     "openenv-core>=0.2.0",
+ ]
+
+ [project.scripts]
+ server = "server.app:main"
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ fastapi
+ uvicorn
+ pydantic
+ numpy
+ openai
+ requests
server/app.py ADDED
@@ -0,0 +1,19 @@
+ """
+ server/app.py — Server entry point for openenv validate compatibility.
+ """
+ import uvicorn
+ from src.api import app  # noqa: F401
+
+
+ def main():
+     """Start the FastAPI server."""
+     uvicorn.run(
+         "src.api:app",
+         host="0.0.0.0",
+         port=7860,
+         reload=False,
+     )
+
+
+ if __name__ == "__main__":
+     main()
src/__init__.py ADDED
@@ -0,0 +1 @@
+ # PLL Cyberattack Detection OpenEnv
src/api.py ADDED
@@ -0,0 +1,71 @@
+ """
+ FastAPI application for the PLL Cyberattack Detection OpenEnv.
+
+ Exposes HTTP endpoints for environment interaction:
+     POST /reset  — Reset environment with task_id
+     POST /step   — Submit an action and advance one step
+     GET  /state  — Get current internal state
+     GET  /health — Health check (returns 200)
+ """
+
+ from fastapi import FastAPI
+ from pydantic import BaseModel
+ from typing import Any, Dict, Optional
+
+ from src.models import Observation, Action, Reward, State
+ from src.env import PLLAttackEnv
+
+
+ app = FastAPI(
+     title="PLL Cyberattack Detection OpenEnv",
+     description="OpenEnv for AI-driven cyberattack detection on SRF-PLLs",
+     version="1.0.0",
+ )
+
+ # Global environment instance
+ env = PLLAttackEnv()
+
+
+ class ResetRequest(BaseModel):
+     """Request body for /reset endpoint."""
+     task_id: int = 0
+     seed: Optional[int] = None
+
+
+ class StepResponse(BaseModel):
+     """Response body for /step endpoint."""
+     observation: Observation
+     reward: Reward
+     done: bool
+     info: Dict[str, Any]
+
+
+ @app.post("/reset", response_model=Observation)
+ async def reset(request: ResetRequest):
+     """Reset the environment and return the initial observation."""
+     obs = env.reset(task_id=request.task_id, seed=request.seed)
+     return obs
+
+
+ @app.post("/step", response_model=StepResponse)
+ async def step(action: Action):
+     """Submit an action and advance the environment one step."""
+     obs, reward, done, info = env.step(action)
+     return StepResponse(
+         observation=obs,
+         reward=reward,
+         done=done,
+         info=info,
+     )
+
+
+ @app.get("/state", response_model=State)
+ async def get_state():
+     """Return the current internal state."""
+     return env.get_state()
+
+
+ @app.get("/health")
+ async def health():
+     """Health check endpoint."""
+     return {"status": "ok"}
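A client consuming these endpoints unpacks the `StepResponse` shape the same way `inference.py` does. The payload below is a hand-written sample for illustration, not real environment output:

```python
import json

# Hand-written sample of a /step response body (illustrative values only)
sample = json.loads("""
{
  "observation": {"step": 42, "task_id": 0},
  "reward": {"total": 0.75},
  "done": false,
  "info": {"grader_score": 0.0}
}
""")

# Unpack the four top-level fields defined by StepResponse
obs = sample["observation"]
total_reward = sample["reward"]["total"]
done = sample["done"]
grader_score = sample["info"].get("grader_score", 0.0)
```

Note that `reward` is a nested object, so clients must read `reward["total"]` rather than treating `reward` as a scalar.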
src/attacks.py ADDED
@@ -0,0 +1,136 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Attack injection logic for the PLL Cyberattack Detection OpenEnv.
3
+
4
+ Implements four attack types:
5
+ 1. Sinusoidal FDI (Easy)
6
+ 2. Ramp injection (Medium)
7
+ 3. Pulse/step bias (Medium)
8
+ 4. Stealthy low-and-slow phase drift (Hard)
9
+ """
10
+
11
+ import math
12
+ import numpy as np
13
+ from typing import Dict, Any
14
+
15
+
16
+ def sample_sinusoidal_params(rng: np.random.Generator) -> Dict[str, Any]:
17
+ """Sample parameters for a sinusoidal FDI attack."""
18
+ return {
19
+ "type": "sinusoidal",
20
+ "amplitude": float(rng.uniform(0.05, 0.20)),
21
+ "freq": float(rng.uniform(5.0, 20.0)),
22
+ "phase": float(rng.uniform(0.0, 2.0 * math.pi)),
23
+ }
24
+
25
+
26
+ def sample_ramp_params(rng: np.random.Generator) -> Dict[str, Any]:
27
+ """Sample parameters for a ramp injection attack."""
28
+ return {
29
+ "type": "ramp",
30
+ "rate": float(rng.uniform(0.0002, 0.001)),
31
+ }
32
+
33
+
34
def sample_pulse_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a pulse/step bias attack."""
    return {
        "type": "pulse",
        "magnitude": float(rng.uniform(0.1, 0.3)),
        "duration": int(rng.integers(20, 81)),  # 20 to 80 steps inclusive
    }


def sample_stealthy_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a stealthy low-and-slow attack."""
    return {
        "type": "stealthy",
        "amplitude": 0.03,
        "drift_rate": float(rng.uniform(0.05, 0.2)),
    }


def sample_attack_start(rng: np.random.Generator) -> int:
    """Sample a random attack start step between 30 and 80 inclusive."""
    return int(rng.integers(30, 81))


class AttackGenerator:
    """Generates attack signals given parameters and current simulation state."""

    def __init__(self, attack_params: Dict[str, Any], attack_start_step: int):
        self.params = attack_params
        self.attack_start_step = attack_start_step
        self.attack_type_str = attack_params.get("type", "none")

        # For stealthy attack: track cumulative phase drift
        self.delta = 0.0

    def get_signal(self, current_step: int, sim_time: float) -> float:
        """
        Compute the attack signal value at the given step.

        Args:
            current_step: Current environment step (0-indexed).
            sim_time: Current simulation time in seconds.

        Returns:
            Attack signal value (pu). Returns 0.0 if the attack has not yet started.
        """
        if current_step < self.attack_start_step:
            return 0.0

        steps_since_start = current_step - self.attack_start_step
        dt = 1e-3  # time step (s)

        if self.attack_type_str == "sinusoidal":
            A = self.params["amplitude"]
            fa = self.params["freq"]
            phi = self.params["phase"]
            return A * math.sin(2.0 * math.pi * fa * sim_time + phi)

        elif self.attack_type_str == "ramp":
            rate = self.params["rate"]
            return rate * steps_since_start

        elif self.attack_type_str == "pulse":
            mag = self.params["magnitude"]
            dur = self.params["duration"]
            if steps_since_start < dur:
                return mag
            else:
                return 0.0

        elif self.attack_type_str == "stealthy":
            A_s = self.params["amplitude"]
            drift_rate = self.params["drift_rate"]
            # δ(t) = δ(t-1) + drift_rate * Δt — accumulated once per call
            self.delta += drift_rate * dt
            f0 = 50.0
            return A_s * math.sin(2.0 * math.pi * f0 * sim_time + self.delta)

        return 0.0

    def is_active(self, current_step: int) -> bool:
        """Check whether the attack is currently active at this step."""
        if current_step < self.attack_start_step:
            return False

        # Pulse attacks end after their duration
        if self.attack_type_str == "pulse":
            steps_since_start = current_step - self.attack_start_step
            dur = self.params["duration"]
            return steps_since_start < dur

        return True


def get_attack_type_id(attack_type_str: str) -> int:
    """Map attack type string to integer ID."""
    mapping = {
        "none": 0,
        "sinusoidal": 1,
        "ramp": 2,
        "pulse": 3,
        "stealthy": 4,
    }
    return mapping.get(attack_type_str, 0)
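The samplers and `AttackGenerator` together define a simple contract: zero signal before `attack_start_step`, a parameterized waveform afterwards. A minimal standalone sketch of the pulse branch (re-implemented here for illustration, not imported from `src.attacks`):

```python
def pulse_signal(step: int, start: int, magnitude: float, duration: int) -> float:
    """Pulse/step bias: `magnitude` for `duration` steps after `start`, else 0."""
    if step < start:
        return 0.0
    return magnitude if step - start < duration else 0.0

# Steps 5, 6, 7 carry the bias; every other step is clean.
trace = [pulse_signal(s, start=5, magnitude=0.2, duration=3) for s in range(10)]
```

The same window (`start <= step < start + duration`) is what `is_active()` reports for pulse attacks, which keeps the ground-truth label aligned with the injected signal.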
src/env.py ADDED
@@ -0,0 +1,380 @@
"""
Main environment class for the PLL Cyberattack Detection OpenEnv.

Implements step(), reset(), get_state(), and compute_reward().
Manages the PLL simulation, attack injection, observation windowing,
episode history, and grading.

Fixes applied vs previous version:
1. grade_task_easy() now receives attack_start_step (was missing, causing
   a TypeError at episode end for task_id=0).
2. attack_active comes from AttackGenerator.is_active(), whose step window
   mirrors get_signal() (including pulse duration) — single source of
   truth, so label and injected signal cannot diverge.
3. Lock-loss check guarded by step_count > attack_start_step and a
   convergence margin — prevents spurious lock-loss from the PLL startup
   transient.
4. Task 3 early termination added: done=True when lock_lost, not just at
   step 500. Avoids 200+ meaningless steps after failure.
5. _get_observation() updated to remove theta_err_window (ground-truth
   leak) and add omega_deviation_window (raw omega deviation in rad/s),
   matching the corrected Observation model.
6. theta_err_window deque removed from instance state.
7. Initial raw_voltages fixed: the PLL is warm-started with WINDOW_SIZE
   silent steps so va_m/vb_m/vc_m are non-zero at reset() return.
8. omega_deviation_window deque added for the new Observation field.
"""

import uuid
import numpy as np
from typing import Tuple, Dict, Any, List, Optional
from collections import deque

from src.models import Observation, Action, Reward, State
from src.pll_sim import SRFPLLSimulator, OMEGA0
from src.attacks import (
    AttackGenerator,
    sample_sinusoidal_params,
    sample_ramp_params,
    sample_pulse_params,
    sample_stealthy_params,
    sample_attack_start,
    get_attack_type_id,
)
from src.graders import grade_task_easy, grade_task_medium, grade_task_hard


WINDOW_SIZE = 20
MAX_STEPS = 500
LOCK_LOSS_THRESHOLD = 0.0873  # 5 degrees in radians


class PLLAttackEnv:
    """OpenEnv-compliant PLL cyberattack detection environment."""

    def __init__(self):
        self.pll = SRFPLLSimulator()
        self.rng: Optional[np.random.Generator] = None
        self.task_id = 0
        self.step_count = 0
        self.episode_id = ""
        self.done = False

        # Attack state
        self.attack_generator: Optional[AttackGenerator] = None
        self.attack_active = False
        self.attack_type = 0
        self.attack_params: Dict[str, Any] = {}
        self.attack_start_step = 0
        self.true_attack_type = 0

        # Detection tracking
        self.first_detection_recorded = False
        self.first_detection_step = 0

        # Lock loss tracking (Task 2 / hard)
        self.lock_lost = False
        self.lock_loss_step: Optional[int] = None
        self.lock_loss_penalized = False

        # Observation windows (Fix 6: theta_err_window removed)
        self.vq_window: deque = deque(maxlen=WINDOW_SIZE)
        self.vd_window: deque = deque(maxlen=WINDOW_SIZE)
        self.omega_window: deque = deque(maxlen=WINDOW_SIZE)
        self.omega_deviation_window: deque = deque(maxlen=WINDOW_SIZE)  # Fix 8

        # Episode history for grading
        self.history: List[Dict[str, Any]] = []

    # ------------------------------------------------------------------
    # Public API
    # ------------------------------------------------------------------

    def reset(self, task_id: int = 0, seed: Optional[int] = None) -> Observation:
        """
        Reset the environment for a new episode.

        Args:
            task_id: 0=easy (sinusoidal), 1=medium (multi-type),
                2=hard (stealthy).
            seed: Optional RNG seed for reproducibility.

        Returns:
            Initial Observation with non-zero raw_voltages.
        """
        self.rng = np.random.default_rng(seed)  # seed=None → random

        self.task_id = task_id
        self.step_count = 0
        self.episode_id = str(uuid.uuid4())
        self.done = False

        # Reset PLL simulator
        self.pll.reset()

        # Reset detection tracking
        self.first_detection_recorded = False
        self.first_detection_step = 0

        # Reset lock-loss tracking
        self.lock_lost = False
        self.lock_loss_step = None
        self.lock_loss_penalized = False

        # Reset history
        self.history = []

        # Reset observation windows (Fix 6: no theta_err_window)
        self.vq_window = deque(maxlen=WINDOW_SIZE)
        self.vd_window = deque(maxlen=WINDOW_SIZE)
        self.omega_window = deque(maxlen=WINDOW_SIZE)
        self.omega_deviation_window = deque(maxlen=WINDOW_SIZE)

        # Sample attack for this episode
        self._setup_attack()

        # Fix 7: warm-start the PLL with WINDOW_SIZE silent steps so the
        # windows contain realistic (non-zero) PLL-settled values and
        # raw_voltages are non-zero on the first observation.
        for _ in range(WINDOW_SIZE):
            pll_out = self.pll.step(0.0)  # no attack during warm-up
            omega_norm = (pll_out["omega_hat"] - OMEGA0) / OMEGA0
            omega_dev = pll_out["omega_hat"] - OMEGA0
            self.vq_window.append(pll_out["vq"])
            self.vd_window.append(pll_out["vd"])
            self.omega_window.append(omega_norm)
            self.omega_deviation_window.append(omega_dev)
        # step_count stays at 0 — warm-up steps are invisible to the agent

        return self._get_observation()

    def step(self, action: Action) -> Tuple[Observation, Reward, bool, Dict[str, Any]]:
        """
        Advance the environment by one step.

        Args:
            action: Agent's Action for this step.

        Returns:
            (observation, reward, done, info)
        """
        if self.done:
            return (
                self._get_observation(),
                Reward(
                    total=0.0, detection_reward=0.0, classification_bonus=0.0,
                    early_detection_bonus=0.0, false_alarm_penalty=0.0,
                    lock_loss_penalty=0.0,
                ),
                True,
                {"message": "Episode already done. Call /reset to start a new episode."},
            )

        # --- Attack signal ------------------------------------------------
        # Fix 2: attack_active comes from is_active(), whose step window
        # mirrors get_signal() (including pulse duration). Single source of
        # truth — the label cannot diverge from the injected physics.
        attack_signal = self.attack_generator.get_signal(self.step_count, self.pll.t)
        self.attack_active = self.attack_generator.is_active(self.step_count)

        # --- Advance PLL --------------------------------------------------
        pll_out = self.pll.step(attack_signal)

        # --- Update observation windows -----------------------------------
        omega_norm = (pll_out["omega_hat"] - OMEGA0) / OMEGA0
        omega_dev = pll_out["omega_hat"] - OMEGA0  # raw deviation (rad/s)
        self.vq_window.append(pll_out["vq"])
        self.vd_window.append(pll_out["vd"])
        self.omega_window.append(omega_norm)
        self.omega_deviation_window.append(omega_dev)

        # --- Lock-loss check (Task 2 / hard only) -------------------------
        PLL_CONVERGENCE_STEPS = 60  # transient settles by ~step 50; 60 adds margin
        if (
            self.task_id == 2
            and not self.lock_lost
            and self.step_count > self.attack_start_step
            and self.step_count > PLL_CONVERGENCE_STEPS  # guard against startup transient
        ):
            if abs(pll_out["theta_err"]) > LOCK_LOSS_THRESHOLD:
                self.lock_lost = True
                self.lock_loss_step = self.step_count

        # --- Reward -------------------------------------------------------
        reward = self.compute_reward(action)

        # --- Record history entry for graders -----------------------------
        self.history.append({
            "step": self.step_count,
            "attack_active": self.attack_active,
            "attack_detected": action.attack_detected,
            "true_attack_type": self.true_attack_type,
            "agent_attack_type": action.attack_type,
            "theta_err": pll_out["theta_err"],
        })

        # --- Advance step counter -----------------------------------------
        self.step_count += 1

        # --- Episode termination ------------------------------------------
        # Fix 4: Task 2 terminates early on lock-loss, not just at MAX_STEPS
        if self.step_count >= MAX_STEPS:
            self.done = True
        elif self.task_id == 2 and self.lock_lost:
            self.done = True  # early termination — no point continuing

        # --- Build info ---------------------------------------------------
        info: Dict[str, Any] = {}
        if self.done:
            info["grader_score"] = self._compute_grader_score()
            info["episode_id"] = self.episode_id
            info["total_steps"] = self.step_count
            info["lock_lost"] = self.lock_lost

        return self._get_observation(), reward, self.done, info

    def compute_reward(self, action: Action) -> Reward:
        """
        Compute the dense reward signal for the current step.

        Reward components:
            detection_reward:       +0.10 true positive (per step)
                                    +0.05 true negative (per step)
                                    -0.05 missed detection (per step)
            false_alarm_penalty:    -0.20 per false-positive step
            classification_bonus:   +0.05 per step correct type (task 1 only)
            early_detection_bonus:  one-time sparse, scaled by detection speed
            lock_loss_penalty:      -2.00 one-time on lock loss (task 2 only)
        """
        detection_reward = 0.0
        false_alarm_penalty = 0.0
        classification_bonus = 0.0
        early_detection_bonus = 0.0
        lock_loss_penalty = 0.0

        if self.attack_active:
            if action.attack_detected:
                detection_reward = 0.1
                # One-time early detection bonus on first correct detection
                if not self.first_detection_recorded:
                    self.first_detection_step = self.step_count
                    self.first_detection_recorded = True
                    # Relative steps since attack started
                    t = self.first_detection_step - self.attack_start_step
                    early_detection_bonus = max(0.0, 1.0 - t / 100.0)
            else:
                detection_reward = -0.05  # missed detection
        else:
            if action.attack_detected:
                false_alarm_penalty = -0.2  # false alarm
            else:
                detection_reward = 0.05  # correct true negative

        # Task 1 (medium): per-step classification bonus
        if self.task_id == 1 and self.attack_active:
            if action.attack_type == self.true_attack_type:
                classification_bonus = 0.05

        # Task 2 (hard): one-time lock-loss penalty
        if self.task_id == 2 and self.lock_lost and not self.lock_loss_penalized:
            lock_loss_penalty = -2.0
            self.lock_loss_penalized = True

        total = (
            detection_reward
            + false_alarm_penalty
            + classification_bonus
            + early_detection_bonus
            + lock_loss_penalty
        )

        return Reward(
            total=total,
            detection_reward=detection_reward,
            classification_bonus=classification_bonus,
            early_detection_bonus=early_detection_bonus,
            false_alarm_penalty=false_alarm_penalty,
            lock_loss_penalty=lock_loss_penalty,
        )

    def get_state(self) -> State:
        """Return full internal state for debugging / GET /state endpoint."""
        return State(
            theta_true=self.pll.theta_true,
            theta_hat=self.pll.theta_hat,
            omega_hat=self.pll.omega_hat,
            vq_integral=self.pll.vq_integral,
            attack_active=self.attack_active,
            attack_type=self.attack_type,
            attack_params=self.attack_params,
            attack_start_step=self.attack_start_step,
            lock_lost=self.lock_lost,
            step=self.step_count,
            episode_id=self.episode_id,
            task_id=self.task_id,
        )

    # ------------------------------------------------------------------
    # Private helpers
    # ------------------------------------------------------------------

    def _setup_attack(self) -> None:
        """Sample attack type and parameters based on current task_id."""
        self.attack_start_step = sample_attack_start(self.rng)

        if self.task_id == 0:
            # Easy: sinusoidal FDI only
            self.attack_params = sample_sinusoidal_params(self.rng)
            self.true_attack_type = 1

        elif self.task_id == 1:
            # Medium: random choice of sinusoidal / ramp / pulse
            choice = int(self.rng.integers(0, 3))
            if choice == 0:
                self.attack_params = sample_sinusoidal_params(self.rng)
                self.true_attack_type = 1
            elif choice == 1:
                self.attack_params = sample_ramp_params(self.rng)
                self.true_attack_type = 2
            else:
                self.attack_params = sample_pulse_params(self.rng)
                self.true_attack_type = 3

        elif self.task_id == 2:
            # Hard: stealthy low-and-slow
            self.attack_params = sample_stealthy_params(self.rng)
            self.true_attack_type = 4

        self.attack_type = get_attack_type_id(self.attack_params.get("type", "none"))
        self.attack_generator = AttackGenerator(self.attack_params, self.attack_start_step)

    def _get_observation(self) -> Observation:
        """
        Build the current Observation from internal windows.

        Fix 5: theta_err_window replaced with omega_deviation_window.
        theta_err requires knowing theta_true (not observable in a real
        inverter) and leaked ground truth directly to the agent.
        omega_deviation (omega_hat - OMEGA0 in rad/s) is a realistic proxy
        that correlates with phase drift under stealthy attacks.
        """
        return Observation(
            vq_window=list(self.vq_window),
            vd_window=list(self.vd_window),
            omega_window=list(self.omega_window),
            omega_deviation_window=list(self.omega_deviation_window),  # Fix 5
            raw_voltages=[self.pll.va_m, self.pll.vb_m, self.pll.vc_m],
            task_id=self.task_id,
            step=self.step_count,
        )

    def _compute_grader_score(self) -> float:
        """Run the appropriate grader at episode end."""
        if self.task_id == 0:
            return grade_task_easy(self.history, self.attack_start_step)
        elif self.task_id == 1:
            return grade_task_medium(self.history, self.attack_start_step)
        elif self.task_id == 2:
            return grade_task_hard(
                self.history,
                self.lock_loss_step,
                self.attack_start_step,
            )
        return 0.0
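The per-step part of `compute_reward()` reduces to a four-cell table over (attack_active, attack_detected). A standalone restatement of just those terms (bonus and penalty terms omitted), handy for eyeballing the incentive structure:

```python
def detection_terms(attack_active: bool, attack_detected: bool) -> float:
    """Per-step detection reward / false-alarm penalty, mirroring the table above."""
    if attack_active:
        return 0.1 if attack_detected else -0.05   # true positive / miss
    return -0.2 if attack_detected else 0.05       # false alarm / true negative
```

Note the asymmetry: a false alarm (-0.2) costs four times a true negative's gain (+0.05), which discourages an always-alarm policy.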
src/graders.py ADDED
@@ -0,0 +1,138 @@
"""
Per-task deterministic graders for the PLL Cyberattack Detection OpenEnv.

Each grader takes an episode history and returns a score in [0.0, 1.0].
Graders are deterministic given the same episode data.
"""

from typing import List, Dict, Any, Optional


def grade_task_easy(history: List[Dict[str, Any]], attack_start_step: int) -> float:
    """
    Task 1 — Sinusoidal FDI Detection (Easy).

    Grader logic (relative to attack onset):
        delay = first_correct_detection_step - attack_start_step
        if delay <= 20:    score = 1.0
        elif delay <= 100: score decays linearly from 1.0 to 0.5
        elif delay <= 420: score = 0.2
        else:              score = 0.0
    An episode with no correct detection at all also scores 0.0.
    """
    first_correct_detection_step = None

    for entry in history:
        step = entry["step"]
        attack_active = entry["attack_active"]
        attack_detected = entry["attack_detected"]

        if attack_active and attack_detected:
            first_correct_detection_step = step
            break

    if first_correct_detection_step is None:
        return 0.0

    delay = first_correct_detection_step - attack_start_step

    if delay <= 20:
        return 1.0
    elif delay <= 100:
        # Linear decay from 1.0 at delay=20 to 0.5 at delay=100
        return 1.0 - 0.5 * (delay - 20) / 80.0
    elif delay <= 420:
        return 0.2
    else:
        return 0.0


def grade_task_medium(history: List[Dict[str, Any]], attack_start_step: int) -> float:
    """
    Task 2 — Multi-Attack Classification (Medium).

    Grader logic:
        base_score  = fraction of steps (after attack_start) where attack_type
                      is correctly classified
        early_bonus = 0.4 * max(0, 1 - first_correct_classification_step / 100)
        score       = min(1.0, base_score * 0.6 + early_bonus)
    """
    steps_after_attack = 0
    correct_classifications = 0
    first_correct_classification_step = None

    for entry in history:
        step = entry["step"]
        if step < attack_start_step:
            continue

        steps_after_attack += 1
        true_type = entry["true_attack_type"]
        agent_type = entry["agent_attack_type"]

        if agent_type == true_type:
            correct_classifications += 1
            if first_correct_classification_step is None:
                first_correct_classification_step = step

    if steps_after_attack == 0:
        return 0.0

    base_score = correct_classifications / steps_after_attack

    if first_correct_classification_step is not None:
        early_bonus = 0.4 * max(0.0, 1.0 - first_correct_classification_step / 100.0)
    else:
        early_bonus = 0.0

    score = min(1.0, base_score * 0.6 + early_bonus)
    return max(0.0, score)


def grade_task_hard(
    history: List[Dict[str, Any]],
    loss_of_lock_step: Optional[int],
    attack_start_step: int,
) -> float:
    """
    Task 3 — Stealthy Low-and-Slow Attack (Hard).

    Grader logic:
        if detected before loss_of_lock_step:
            score = 1.0 * (1 - first_detection_step / loss_of_lock_step)
        elif detected after loss of lock but before episode end:
            score = 0.3
        else (never detected):
            score = 0.0
        false_alarm_penalty = 0.2 per false alarm before the attack starts
        (score is floored at 0.0)
    """
    first_detection_step = None
    false_alarm_count = 0

    for entry in history:
        step = entry["step"]
        attack_active = entry["attack_active"]
        attack_detected = entry["attack_detected"]

        # Only count false alarms before the attack starts
        if attack_detected and not attack_active and step < attack_start_step:
            false_alarm_count += 1

        if attack_detected and attack_active and first_detection_step is None:
            first_detection_step = step

    # Compute base score
    if first_detection_step is None:
        score = 0.0
    elif loss_of_lock_step is not None and first_detection_step < loss_of_lock_step:
        score = 1.0 * (1.0 - first_detection_step / loss_of_lock_step)
    elif loss_of_lock_step is not None and first_detection_step >= loss_of_lock_step:
        score = 0.3
    else:
        # No loss of lock occurred but the attack was detected
        score = 0.3

    # Apply false-alarm penalty
    penalty = 0.2 * false_alarm_count
    score = max(0.0, score - penalty)

    return min(1.0, score)
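The easy grader's delay-to-score curve can be sanity-checked in isolation. This restates only the piecewise mapping from `grade_task_easy` (the history scan is omitted):

```python
def easy_score(delay: int) -> float:
    """Score for a correct detection `delay` steps after attack onset."""
    if delay <= 20:
        return 1.0
    elif delay <= 100:
        return 1.0 - 0.5 * (delay - 20) / 80.0  # 1.0 → 0.5 linearly
    elif delay <= 420:
        return 0.2
    return 0.0
```

The curve is continuous at delay=100 (both branches give 0.5) but deliberately drops from 0.5 to 0.2 past that point, so there is a sharp incentive to detect within 100 steps.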
src/models.py ADDED
@@ -0,0 +1,74 @@
"""
Pydantic models for the PLL Cyberattack Detection OpenEnv.
Defines Observation, Action, Reward, and State schemas.
"""
import numpy as np
from typing import Annotated, Any, Dict, List
from pydantic import BaseModel, Field, model_validator

# Exactly 20 floats — enforced at validation time, not just documented.
WindowList = Annotated[List[float], Field(min_length=20, max_length=20)]

# Exactly 3 floats for [va, vb, vc].
VoltageList = Annotated[List[float], Field(min_length=3, max_length=3)]


class Observation(BaseModel):
    vq_window: WindowList
    vd_window: WindowList
    omega_window: WindowList
    omega_deviation_window: WindowList
    raw_voltages: VoltageList
    task_id: int = Field(ge=0, le=2)
    step: int = Field(ge=0)


class Action(BaseModel):
    attack_detected: bool
    attack_type: int = Field(ge=0, le=4)
    confidence: float = Field(ge=0.0, le=1.0)
    protective_action: int = Field(ge=0, le=3)


class Reward(BaseModel):
    total: float
    detection_reward: float
    classification_bonus: float
    early_detection_bonus: float
    false_alarm_penalty: float
    lock_loss_penalty: float


class State(BaseModel):
    theta_true: float
    theta_hat: float
    omega_hat: float
    vq_integral: float
    attack_active: bool
    attack_type: int  # Integer ID: 0=none, 1=sinusoidal, 2=ramp, 3=pulse, 4=stealthy
    attack_params: Dict[str, Any]
    attack_start_step: int
    lock_lost: bool  # Whether the PLL has lost lock (|theta_err| > 5°). Task 2 only.
    step: int = Field(ge=0)
    episode_id: str
    task_id: int = Field(ge=0, le=2)

    @model_validator(mode="before")
    @classmethod
    def coerce_attack_params(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        """
        Coerce numpy scalar types inside attack_params to native Python types.
        sample_*_params() casts with float()/int(), but a future contributor
        may forget. This validator ensures JSON serialization never fails due
        to np.float32 / np.int64 / np.bool_ leaking into the params dict.
        """
        params = values.get("attack_params", {})
        if isinstance(params, dict):
            coerced = {}
            for k, v in params.items():
                if isinstance(v, np.floating):
                    coerced[k] = float(v)
                elif isinstance(v, np.integer):
                    coerced[k] = int(v)
                elif isinstance(v, np.bool_):
                    coerced[k] = bool(v)
                else:
                    coerced[k] = v
            values["attack_params"] = coerced
        return values
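The `coerce_attack_params` validator exists because `json.dumps` rejects some numpy scalars (notably `np.int64`). A standalone illustration of the same coercion rule, independent of pydantic (numpy assumed available):

```python
import json
import numpy as np

def to_native(params: dict) -> dict:
    """Coerce numpy scalars to native Python types, as the State validator does."""
    out = {}
    for k, v in params.items():
        if isinstance(v, np.floating):
            out[k] = float(v)
        elif isinstance(v, np.integer):
            out[k] = int(v)
        elif isinstance(v, np.bool_):
            out[k] = bool(v)
        else:
            out[k] = v
    return out

raw = {"amplitude": np.float64(0.03), "duration": np.int64(40), "type": "pulse"}
clean = to_native(raw)
encoded = json.dumps(clean)  # json.dumps(raw) would raise TypeError on the np.int64
```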
src/pll_sim.py ADDED
@@ -0,0 +1,119 @@
"""
SRF-PLL Discrete-Time Simulation.

Implements the Synchronous Reference Frame Phase-Locked Loop used in
grid-connected inverters. Discrete time step Δt = 1 ms.

Steps:
    1. Generate true 3-phase grid voltages (50 Hz, 1.0 pu)
    2. Apply attack injection on va
    3. Clarke transform (αβ)
    4. Park transform (dq) using estimated angle θ̂
    5. PI controller to update ω̂ and θ̂
    6. Compute phase error
"""

import math


# Constants
V_NOM = 1.0                  # Nominal voltage (pu)
F0 = 50.0                    # Grid frequency (Hz)
OMEGA0 = 2.0 * math.pi * F0  # Nominal angular freq (rad/s)
DT = 1e-3                    # Time step (1 ms)
KP = 50.0                    # PI proportional gain
KI = 1500.0                  # PI integral gain


def wrap_angle(angle: float) -> float:
    """Wrap angle to [-π, π]."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


class SRFPLLSimulator:
    """Discrete-time SRF-PLL simulator."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Reset PLL state to initial conditions."""
        self.t = 0.0             # Simulation time (s)
        self.theta_true = 0.0    # True grid angle (rad)
        self.theta_hat = 0.0     # Estimated angle (rad)
        self.omega_hat = OMEGA0  # Estimated angular freq (rad/s)
        self.vq_integral = 0.0   # Integral of vq for PI controller

        # Current signal values
        self.vd = 0.0
        self.vq = 0.0
        self.va_m = 0.0
        self.vb_m = 0.0
        self.vc_m = 0.0
        self.theta_err = 0.0

    def step(self, attack_signal: float = 0.0):
        """
        Advance the PLL by one time step.

        Args:
            attack_signal: Attack injection added to va (pu).

        Returns:
            dict with vd, vq, omega_hat, theta_err, va_m, vb_m, vc_m,
            theta_true, theta_hat.
        """
        # Step 1 — True three-phase grid voltages. Cosine reference, so the
        # Park transform below yields vq = sin(θ − θ̂): zero at lock, which
        # is the quantity the PI loop regulates. (With a sine reference the
        # same Park equations give vq = −cos(θ − θ̂), and the loop would
        # settle 90° off the true angle.)
        va = V_NOM * math.cos(self.theta_true)
        vb = V_NOM * math.cos(self.theta_true - 2.0 * math.pi / 3.0)
        vc = V_NOM * math.cos(self.theta_true + 2.0 * math.pi / 3.0)

        # Step 2 — Apply attack injection on va
        va_m = va + attack_signal
        vb_m = vb
        vc_m = vc

        # Step 3 — Clarke transform (αβ), simplified using va + vb + vc = 0
        # for the unattacked balanced set; the injection on va therefore
        # couples into both components.
        v_alpha = va_m
        v_beta = (va_m + 2.0 * vb_m) / math.sqrt(3.0)

        # Step 4 — Park transform (dq) using estimated angle θ̂
        cos_th = math.cos(self.theta_hat)
        sin_th = math.sin(self.theta_hat)
        vd = v_alpha * cos_th + v_beta * sin_th
        vq = -v_alpha * sin_th + v_beta * cos_th

        # Step 5 — PI controller
        self.vq_integral += vq * DT
        omega_hat = OMEGA0 + KP * vq + KI * self.vq_integral
        self.theta_hat += omega_hat * DT

        # Advance true angle
        self.theta_true += OMEGA0 * DT

        # Step 6 — Phase error wrapped to [-π, π]
        theta_err = wrap_angle(self.theta_hat - self.theta_true)

        # Update time
        self.t += DT

        # Store current values
        self.vd = vd
        self.vq = vq
        self.omega_hat = omega_hat
        self.va_m = va_m
        self.vb_m = vb_m
        self.vc_m = vc_m
        self.theta_err = theta_err

        return {
            "vd": vd,
            "vq": vq,
            "omega_hat": omega_hat,
            "theta_err": theta_err,
            "va_m": va_m,
            "vb_m": vb_m,
            "vc_m": vc_m,
            "theta_true": self.theta_true,
            "theta_hat": self.theta_hat,
        }
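The Clarke/Park chain has a property worth verifying in isolation: a balanced, cosine-referenced three-phase set maps to a unit vector rotating at the grid angle, so vd = cos(θ − θ̂) and vq = sin(θ − θ̂). A standalone sketch using the full amplitude-invariant Clarke transform (re-implemented here, not imported from `src.pll_sim`):

```python
import math

def clarke_park(va: float, vb: float, vc: float, theta_hat: float):
    """Amplitude-invariant Clarke transform followed by a Park rotation."""
    v_alpha = (2.0 * va - vb - vc) / 3.0
    v_beta = (vb - vc) / math.sqrt(3.0)
    vd = v_alpha * math.cos(theta_hat) + v_beta * math.sin(theta_hat)
    vq = -v_alpha * math.sin(theta_hat) + v_beta * math.cos(theta_hat)
    return vd, vq

theta, theta_hat = 1.3, 1.1
va = math.cos(theta)
vb = math.cos(theta - 2.0 * math.pi / 3.0)
vc = math.cos(theta + 2.0 * math.pi / 3.0)
vd, vq = clarke_park(va, vb, vc, theta_hat)
# vd = cos(θ − θ̂), vq = sin(θ − θ̂): the PI loop drives vq toward zero
```

Because vq ≈ θ − θ̂ for small errors, the PI controller acting on vq is effectively a phase-error regulator, which is exactly the lever an FDI attack on va tries to pull.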
uv.lock ADDED
The diff for this file is too large to render. See raw diff