Commit 81e328b
Initial commit: PLL Cyberattack Detection OpenEnv

Files changed:

- .gitignore (+24, -0)
- Dockerfile (+10, -0)
- README.md (+105, -0)
- inference.py (+470, -0)
- openenv.yaml (+44, -0)
- pyproject.toml (+19, -0)
- requirements.txt (+6, -0)
- server/app.py (+19, -0)
- src/__init__.py (+1, -0)
- src/api.py (+71, -0)
- src/attacks.py (+136, -0)
- src/env.py (+380, -0)
- src/graders.py (+138, -0)
- src/models.py (+74, -0)
- src/pll_sim.py (+119, -0)
- uv.lock (+0, -0)
.gitignore (ADDED, @@ -0,0 +1,24 @@)

```text
# Python
__pycache__/
*.pyc
*.pyo
*.egg-info/
dist/
build/

# Environment
.env
.venv/
venv/

# Testing
.pytest_cache/

# Data
sample_data/

# IDE
.vscode/
.idea/
*.swp
*.swo
```
Dockerfile (ADDED, @@ -0,0 +1,10 @@)

```dockerfile
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 7860
CMD ["uvicorn", "src.api:app", "--host", "0.0.0.0", "--port", "7860"]
```
README.md (ADDED, @@ -0,0 +1,105 @@)

# PLL Cyberattack Detection — OpenEnv

> AI-driven cyberattack detection on SRF Phase-Locked Loops (PLLs) in grid-connected inverters.

## Overview

Phase-Locked Loops (PLLs) are critical components in grid-connected power converters that synchronize the inverter's output with the utility grid. A Synchronous Reference Frame PLL (SRF-PLL) estimates grid frequency and phase angle — making it a high-value target for **False Data Injection (FDI)** cyberattacks.

This OpenEnv environment simulates an SRF-PLL under various cyberattack scenarios and challenges AI agents to detect, classify, and respond to attacks in real time using only time-windowed sensor observations.

## Tasks

| Task | Difficulty | Description |
|------|-----------|-------------|
| **Task 0** | Easy | Detect whether a sinusoidal FDI attack is present (binary detection) |
| **Task 1** | Medium | Detect and classify the attack type — sinusoidal, ramp, or pulse |
| **Task 2** | Hard | Detect stealthy, low-amplitude attacks before the PLL loses lock |

## Observation Space

Each step provides a JSON observation:

| Field | Shape | Description |
|-------|-------|-------------|
| `vq_window` | `[20]` | q-axis voltage error (last 20 steps) |
| `vd_window` | `[20]` | d-axis voltage (last 20 steps) |
| `omega_window` | `[20]` | Estimated frequency, normalized (last 20 steps) |
| `omega_deviation_window` | `[20]` | Frequency deviation from nominal in rad/s |
| `raw_voltages` | `[3]` | Three-phase voltages `[va, vb, vc]` at current step |
| `step` | `int` | Current simulation step |
| `task_id` | `int` | Task identifier (0, 1, or 2) |
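The fields above can be sketched as a concrete payload. This is an illustrative sketch only: every value below is a placeholder, not output from the real simulator.

```python
import json

# Illustrative observation payload; all values are made-up placeholders.
obs = {
    "vq_window": [0.001] * 20,             # ~0 while the PLL is healthy
    "vd_window": [1.0] * 20,               # d-axis carries the voltage magnitude
    "omega_window": [0.0] * 20,            # normalized frequency, nominal = 0
    "omega_deviation_window": [0.0] * 20,  # deviation from nominal, rad/s
    "raw_voltages": [0.98, -0.51, -0.47],  # [va, vb, vc] at the current step
    "step": 42,
    "task_id": 0,
}

# The JSON the agent receives each step has this shape.
payload = json.dumps(obs)
```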
## Action Space

Agents return a JSON action each step:

```json
{
  "attack_detected": true,
  "attack_type": 1,
  "confidence": 0.85,
  "protective_action": 1
}
```

| Field | Type | Range | Description |
|-------|------|-------|-------------|
| `attack_detected` | `bool` | — | Whether an attack is detected |
| `attack_type` | `int` | 0–4 | 0=none, 1=sinusoidal, 2=ramp, 3=pulse, 4=stealthy |
| `confidence` | `float` | 0.0–1.0 | Agent's confidence in its classification |
| `protective_action` | `int` | 0–3 | 0=none, 1=alert, 2=reduce power, 3=disconnect |

## API Endpoints

| Endpoint | Name | Description |
|----------|------|-------------|
| `POST /reset` | Reset | Start a new episode. Body: `{"task_id": 0}` |
| `POST /step` | Step | Submit an action and receive the next observation |
| `GET /state` | State | Get the current environment state |
| `GET /health` | Health | Health check endpoint |
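The reset/step cycle can be driven with a few lines of `requests`. A minimal sketch, assuming a server running locally on port 7860 (adjust `ENV_URL` to your deployment); the response fields follow the tables above:

```python
import requests

ENV_URL = "http://localhost:7860"  # assumption: local server; or the HF Space URL

# A benign "no attack" action, matching the action-space table.
NOOP_ACTION = {
    "attack_detected": False,
    "attack_type": 0,
    "confidence": 0.5,
    "protective_action": 0,
}


def run_one_step(task_id: int = 0) -> dict:
    """Reset the environment, submit one benign action, return the step result."""
    reset = requests.post(f"{ENV_URL}/reset", json={"task_id": task_id}, timeout=30)
    reset.raise_for_status()
    step = requests.post(f"{ENV_URL}/step", json=NOOP_ACTION, timeout=30)
    step.raise_for_status()
    # The result carries "observation", "reward", "done", and "info".
    return step.json()
```

Call `run_one_step(0)` once the server is up; loop on it until `result["done"]` is true to play a full episode.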
## Running Locally

### With Docker

```bash
docker build -t pll-cyberattack-env .
docker run -p 7860:7860 pll-cyberattack-env
```

### Without Docker

```bash
pip install -r requirements.txt
uvicorn src.api:app --host 0.0.0.0 --port 7860
```

### Running the Agent

```bash
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="Qwen/Qwen2.5-72B-Instruct"
export HF_TOKEN="your-hf-token"
python inference.py
```

Set `USE_LLM=1` to use the LLM agent instead of the default rule-based heuristic.

## Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `API_BASE_URL` | No | `https://router.huggingface.co/v1` | LLM API endpoint |
| `MODEL_NAME` | No | `Qwen/Qwen2.5-72B-Instruct` | Model identifier |
| `HF_TOKEN` | Yes | — | HuggingFace API token |
| `ENV_URL` | No | HF Space URL | Environment server URL |
| `USE_LLM` | No | `0` | Set to `1` to use LLM agent |

## Live Demo

🚀 **HuggingFace Space**: [https://huggingface.co/spaces/krishuggingface/CyberAttack-PLL](https://huggingface.co/spaces/krishuggingface/CyberAttack-PLL)

## License

MIT
inference.py (ADDED, @@ -0,0 +1,470 @@)

```python
"""
Inference Script — PLL Cyberattack Detection OpenEnv
=====================================================
MANDATORY environment variables:
    API_BASE_URL   The API endpoint for the LLM
    MODEL_NAME     The model identifier to use
    HF_TOKEN       Your Hugging Face / API key

Uses a HYBRID approach:
- A fast rule-based heuristic agent runs by default (no LLM needed)
- The heuristic analyzes vq/omega_deviation windows to detect attacks
- Set the USE_LLM=1 env var to use the LLM instead (slower, may fail)

Must be named inference.py and placed at the project root.
Uses the OpenAI client for LLM calls when enabled.
"""

import os
import json
import time

import requests
from openai import OpenAI

API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
MODEL_NAME = os.getenv("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct")
HF_TOKEN = os.getenv("HF_TOKEN")
ENV_URL = os.getenv("ENV_URL", "https://krishuggingface-cyberattack-pll.hf.space")
USE_LLM = os.environ.get("USE_LLM", "0") == "1"

client = OpenAI(base_url=API_BASE_URL, api_key=HF_TOKEN)

SYSTEM_PROMPT = """You are an AI agent monitoring a power grid inverter's Phase-Locked Loop (PLL).
You receive time-windowed sensor readings each step and must detect cyberattacks.

vq_window: q-axis voltage error (should be ~0 when healthy)
vd_window: d-axis voltage
omega_window: estimated frequency (normalized, nominal=0)
omega_deviation_window: frequency deviation from nominal in rad/s (useful for detecting slow phase drift)
raw_voltages: [va, vb, vc] at current step
task_id: 0=detect only, 1=classify type, 2=detect stealthy attack

For task_id=0: Focus on detecting any attack (attack_detected=True/False).
For task_id=1: Also classify the attack type (1=sinusoidal, 2=ramp, 3=pulse).
For task_id=2: Detect very subtle attacks before the PLL loses lock. Look for slow drifts in omega_deviation and vq.

Analysis tips:
- In healthy state, vq values should be near 0 and stable.
- Sinusoidal attacks cause oscillating patterns in vq.
- Ramp attacks cause steadily increasing vq magnitude.
- Pulse attacks cause sudden step changes in vq.
- Stealthy attacks cause very slow, gradual drift in omega_deviation_window.
- Look at trends across the full window, not just the latest value.

Respond ONLY with valid JSON, no explanation:
{
  "attack_detected": <bool>,
  "attack_type": <int 0-4>,
  "confidence": <float 0.0-1.0>,
  "protective_action": <int 0-3>
}"""

TASK_NAMES = {
    0: "Sinusoidal FDI Detection (Easy)",
    1: "Multi-Attack Classification (Medium)",
    2: "Stealthy Attack Detection (Hard)",
}

DEFAULT_ACTION = {
    "attack_detected": False,
    "attack_type": 0,
    "confidence": 0.5,
    "protective_action": 0,
}


# =====================================================================
# Logging Helpers (OpenEnv compliance)
# =====================================================================

def log_start(task: str, env: str, model: str) -> None:
    print(f"[START] task={task} env={env} model={model}", flush=True)


def log_step(step: int, action: dict, reward: float, done: bool, error) -> None:
    action_str = json.dumps(action, separators=(",", ":"))
    error_val = error if error else "null"
    print(f"[STEP] step={step} action={action_str} reward={reward:.2f} done={str(done).lower()} error={error_val}", flush=True)


def log_end(success: bool, steps: int, score: float, rewards: list) -> None:
    rewards_str = ",".join(f"{r:.2f}" for r in rewards)
    print(f"[END] success={str(success).lower()} steps={steps} score={score:.3f} rewards={rewards_str}", flush=True)


# =====================================================================
# Rule-Based Heuristic Agent
# =====================================================================

class HeuristicState:
    """Tracks running state for the heuristic agent across steps."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.vq_history = []           # all vq mean(abs) values
        self.omega_dev_history = []    # all omega_dev mean(abs) values
        self.attack_detected = False   # latched detection flag
        self.predicted_type = 0        # latched classification
        self.settled_baseline = None   # omega_dev baseline when PLL settles
        self.peak_vq = 0.0             # highest vq_mean seen


_hstate = HeuristicState()


def heuristic_agent(obs: dict) -> dict:
    """
    Rule-based attack detector using cumulative state tracking.
    No LLM needed — runs instantly.

    The key insight is that the PLL's closed-loop response transforms
    attack signals, so we track statistics over time rather than
    trying to classify from a single 20-step vq window shape.
    """
    global _hstate
    vq = obs["vq_window"]
    omega_dev = obs["omega_deviation_window"]
    task_id = obs["task_id"]
    step = obs["step"]

    if step == 0:
        _hstate.reset()

    # --- Compute per-step features ---
    vq_abs = [abs(v) for v in vq]
    vq_mean = sum(vq_abs) / len(vq_abs)
    vq_max = max(vq_abs)

    omega_dev_abs = [abs(v) for v in omega_dev]
    omega_dev_mean = sum(omega_dev_abs) / len(omega_dev_abs)

    # Track history
    _hstate.vq_history.append(vq_mean)
    _hstate.omega_dev_history.append(omega_dev_mean)
    _hstate.peak_vq = max(_hstate.peak_vq, vq_mean)

    # Record baseline around step 45-50 (PLL settled)
    if step == 50:
        _hstate.settled_baseline = omega_dev_mean

    # -----------------------------------------------------------------
    # Detection: is vq significantly elevated?
    # After PLL warm-start settles (~step 20-30), healthy vq < 0.005
    # -----------------------------------------------------------------
    if step < 25:
        # PLL still settling, don't detect
        detected = False
    else:
        detected = vq_mean > 0.01 or vq_max > 0.025

    # Latch detection on
    if detected:
        _hstate.attack_detected = True

    # -----------------------------------------------------------------
    # Task 0: Binary detection only
    # -----------------------------------------------------------------
    if task_id == 0:
        return {
            "attack_detected": _hstate.attack_detected,
            "attack_type": 1 if _hstate.attack_detected else 0,
            "confidence": min(1.0, vq_mean * 50) if _hstate.attack_detected else 0.8,
            "protective_action": 1 if _hstate.attack_detected else 0,
        }

    # -----------------------------------------------------------------
    # Task 1: Classification using cumulative patterns
    # -----------------------------------------------------------------
    if task_id == 1:
        if not _hstate.attack_detected:
            return {
                "attack_detected": False,
                "attack_type": 0,
                "confidence": 0.7,
                "protective_action": 0,
            }

        # Classify using cumulative vq_history
        # Only classify after enough attack data (5+ steps of elevated vq)
        n_elevated = sum(1 for v in _hstate.vq_history if v > 0.01)

        if n_elevated < 5:
            # Not enough data yet, use simple guess
            attack_type = 1
        else:
            # Get recent vq trend (up to last 20 elevated values)
            elevated = [v for v in _hstate.vq_history if v > 0.005]
            recent = elevated[-min(20, len(elevated)):]

            # Feature 1: Is vq currently high or has it decayed?
            current_vs_peak = vq_mean / _hstate.peak_vq if _hstate.peak_vq > 0 else 0

            # Feature 2: How many zero crossings in current window
            zero_crossings = sum(1 for i in range(1, len(vq)) if vq[i] * vq[i - 1] < 0)

            # Feature 3: Is vq growing or shrinking over recent history
            if len(recent) >= 6:
                third = len(recent) // 3
                first_third = sum(recent[:third]) / third
                last_third = sum(recent[-third:]) / third
                growth = last_third / first_third if first_third > 0.001 else 1.0
            else:
                growth = 1.0

            # Classification logic:
            # Sinusoidal: persistent oscillation, zero crossings, stable amplitude
            # Ramp: growing vq over time (growth > 1)
            # Pulse: high initial vq that decays to near zero (current_vs_peak < 0.3)

            if current_vs_peak < 0.15 and _hstate.peak_vq > 0.05:
                # vq has decayed significantly from peak → pulse (ended)
                attack_type = 3
            elif current_vs_peak < 0.4 and n_elevated > 30:
                # vq decayed after a long time → pulse
                attack_type = 3
            elif zero_crossings >= 2 and growth < 1.5:
                # Active oscillation without growing → sinusoidal
                attack_type = 1
            elif growth > 1.3:
                # Growing signal → ramp
                attack_type = 2
            elif zero_crossings >= 1:
                # Some oscillation → sinusoidal
                attack_type = 1
            else:
                # Default: if mono-decrease, pulse; else sinusoidal
                vq_diffs = [vq[i] - vq[i - 1] for i in range(1, len(vq))]
                neg = sum(1 for d in vq_diffs if d < 0)
                if neg > 14:  # 14/19 = 73% decreasing
                    attack_type = 3
                else:
                    attack_type = 1

        _hstate.predicted_type = attack_type

        return {
            "attack_detected": True,
            "attack_type": _hstate.predicted_type,
            "confidence": 0.8,
            "protective_action": 1,
        }

    # -----------------------------------------------------------------
    # Task 2: Stealthy attack — detect omega_dev rising above baseline
    # -----------------------------------------------------------------
    if task_id == 2:
        drift_detected = False
        confidence = 0.3

        if step > 50 and _hstate.settled_baseline is not None:
            baseline = _hstate.settled_baseline

            # Compare current to baseline
            ratio = omega_dev_mean / baseline if baseline > 0.01 else omega_dev_mean * 100

            # Check if omega_dev is rising relative to recent history
            if len(_hstate.omega_dev_history) > 10:
                recent_10 = _hstate.omega_dev_history[-10:]
                old_10 = (_hstate.omega_dev_history[-20:-10]
                          if len(_hstate.omega_dev_history) > 20
                          else _hstate.omega_dev_history[:10])
                recent_avg = sum(recent_10) / len(recent_10)
                old_avg = sum(old_10) / len(old_10)
                rising = recent_avg > old_avg * 1.1
            else:
                rising = False

            if ratio > 2.0:
                drift_detected = True
                confidence = 0.9
            elif ratio > 1.3 and rising:
                drift_detected = True
                confidence = 0.8
            elif rising and vq_mean > 0.1:
                drift_detected = True
                confidence = 0.6
            elif vq_mean > 0.2:
                drift_detected = True
                confidence = 0.5

        if drift_detected:
            _hstate.attack_detected = True

        return {
            "attack_detected": drift_detected,
            "attack_type": 4 if drift_detected else 0,
            "confidence": confidence,
            "protective_action": 2 if drift_detected else 0,
        }

    return DEFAULT_ACTION.copy()


# =====================================================================
# LLM Agent (optional, set USE_LLM=1)
# =====================================================================

def parse_llm_response(response_text: str) -> dict:
    """Parse LLM response JSON, returning the default action on failure."""
    try:
        text = response_text.strip()
        if text.startswith("```"):
            # Strip a markdown code fence around the JSON, if present.
            lines = text.split("\n")
            json_lines = []
            in_block = False
            for line in lines:
                if line.strip().startswith("```") and not in_block:
                    in_block = True
                    continue
                elif line.strip().startswith("```") and in_block:
                    break
                elif in_block:
                    json_lines.append(line)
            text = "\n".join(json_lines)

        parsed = json.loads(text)
        # Clamp every field into its valid range.
        action = {
            "attack_detected": bool(parsed.get("attack_detected", False)),
            "attack_type": max(0, min(4, int(parsed.get("attack_type", 0)))),
            "confidence": max(0.0, min(1.0, float(parsed.get("confidence", 0.5)))),
            "protective_action": max(0, min(3, int(parsed.get("protective_action", 0)))),
        }
        return action
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return DEFAULT_ACTION.copy()


def format_observation(obs: dict) -> str:
    """Format an observation dict into a concise string for the LLM."""
    parts = [
        f"Step: {obs['step']}",
        f"Task: {obs['task_id']}",
        f"vq_window (last 20): {[round(v, 6) for v in obs['vq_window']]}",
        f"vd_window (last 20): {[round(v, 6) for v in obs['vd_window']]}",
        f"omega_window (last 20): {[round(v, 6) for v in obs['omega_window']]}",
        f"omega_deviation_window (last 20): {[round(v, 6) for v in obs['omega_deviation_window']]}",
        f"raw_voltages: {[round(v, 6) for v in obs['raw_voltages']]}",
    ]
    return "\n".join(parts)


def llm_agent(obs: dict) -> dict:
    """Call the LLM to decide an action. Falls back to the heuristic on error."""
    try:
        obs_text = format_observation(obs)
        completion = client.chat.completions.create(
            model=MODEL_NAME,
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": obs_text},
            ],
            temperature=0.1,
            max_tokens=200,
        )
        llm_response = completion.choices[0].message.content
        return parse_llm_response(llm_response)
    except Exception as e:
        print(f"  LLM error ({type(e).__name__}: {e}), falling back to heuristic")
        return heuristic_agent(obs)


# =====================================================================
# Episode Runner
# =====================================================================

def run_episode(task_id: int) -> float:
    log_start(task=TASK_NAMES[task_id], env="pll-cyberattack-detection",
              model=MODEL_NAME if USE_LLM else "rule-based-heuristic")

    print(f"\n{'='*60}")
    print(f"Task {task_id}: {TASK_NAMES[task_id]}")
    print(f"Agent: {'LLM (' + MODEL_NAME + ')' if USE_LLM else 'Rule-Based Heuristic'}")
    print(f"{'='*60}")

    step_count = 0
    grader_score = 0.0
    rewards = []

    try:
        # Reset environment
        reset_response = requests.post(
            f"{ENV_URL}/reset",
            json={"task_id": task_id},
            timeout=30,
        )
        reset_response.raise_for_status()
        obs = reset_response.json()

        done = False
        total_reward = 0.0

        while not done:
            # Choose agent
            if USE_LLM:
                action = llm_agent(obs)
            else:
                action = heuristic_agent(obs)

            # Step environment
            step_response = requests.post(
                f"{ENV_URL}/step",
                json=action,
                timeout=30,
            )
            step_response.raise_for_status()
            result = step_response.json()

            obs = result["observation"]
            reward = result["reward"]
            done = result["done"]
            info = result["info"]
            total_reward += reward["total"]
            rewards.append(reward["total"])
            log_step(step=step_count, action=action, reward=reward["total"], done=done, error=None)

            step_count += 1

            # Print progress every 50 steps
            if step_count % 50 == 0:
                print(f"  Step {step_count:3d} | Reward: {reward['total']:+.4f} | "
                      f"Cumulative: {total_reward:+.4f} | "
                      f"Detected: {action['attack_detected']} | "
                      f"Type: {action['attack_type']}")

        # Extract grader score
        grader_score = info.get("grader_score", 0.0)
        print(f"\n  Episode complete: {step_count} steps")
        print(f"  Total reward: {total_reward:+.4f}")
        print(f"  Grader score: {grader_score:.4f}")
    finally:
        log_end(success=grader_score > 0.0, steps=step_count, score=grader_score, rewards=rewards)

    return grader_score


if __name__ == "__main__":
    agent_name = f"LLM ({MODEL_NAME})" if USE_LLM else "Rule-Based Heuristic"
    print("PLL Cyberattack Detection — Agentic Inference")
    print(f"Agent: {agent_name}")
    print(f"Environment: {ENV_URL}")
    if not USE_LLM:
        print("(Set USE_LLM=1 to use LLM agent instead of heuristic)")

    start_time = time.time()
    scores = []

    for task_id in range(3):
        score = run_episode(task_id)
        print(f"Task {task_id} score: {score:.4f}")
        scores.append(score)

    elapsed = time.time() - start_time

    print(f"\n{'='*60}")
    print("FINAL RESULTS")
    print(f"{'='*60}")
    for i, score in enumerate(scores):
        print(f"  Task {i} ({TASK_NAMES[i]}): {score:.4f}")
    print(f"\n  Average score: {sum(scores)/len(scores):.4f}")
    print(f"  Total time: {elapsed:.1f}s ({elapsed/60:.1f} min)")
    print(f"{'='*60}")
```
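The window features driving `heuristic_agent`'s Task 1 classification (zero crossings for oscillation, their absence for monotone signals) can be exercised in isolation. This is a standalone sketch of that logic with synthetic windows, not part of the committed script:

```python
import math


def zero_crossings(window):
    """Count sign changes between consecutive samples, as heuristic_agent does."""
    return sum(1 for i in range(1, len(window)) if window[i] * window[i - 1] < 0)


# A sinusoidal-style vq window oscillates around zero and crosses it repeatedly...
sine_window = [0.02 * math.sin(0.8 * i) for i in range(20)]
# ...while a ramp-style window grows monotonically and never crosses zero.
ramp_window = [0.002 * i for i in range(20)]

print(zero_crossings(sine_window), zero_crossings(ramp_window))
```

In the committed heuristic, `zero_crossings >= 2` together with bounded amplitude growth maps the window to the sinusoidal class, while zero crossings near 0 with growth above 1.3 indicates a ramp.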
openenv.yaml (ADDED, @@ -0,0 +1,44 @@)

```yaml
name: pll-cyberattack-detection
version: 1.0.0
description: >
  OpenEnv environment for AI-driven cyberattack detection on SRF-based
  Phase-Locked Loops in grid-connected inverters. An agent monitors PLL
  sensor streams and detects False Data Injection attacks before they
  cause loss of grid synchronization. Real-world power systems cybersecurity.
author: Kris Keshav
tags:
  - power-systems
  - cybersecurity
  - control-systems
  - openenv
  - false-data-injection
tasks:
  - id: sinusoidal_fdi_detection
    difficulty: easy
    description: >
      Detect presence of a sinusoidal FDI attack injected on the
      grid voltage sensor. Binary detection task.
    max_steps: 500
  - id: multi_attack_classification
    difficulty: medium
    description: >
      Classify the type of ongoing attack (sinusoidal, ramp, or pulse)
      from the PLL observation window.
    max_steps: 500
  - id: stealthy_attack_detection
    difficulty: hard
    description: >
      Detect a low-amplitude stealthy attack causing slow phase drift
      before PLL loss-of-lock occurs.
    max_steps: 500
action_space:
  type: structured
  fields:
    attack_detected: bool
    attack_type: int
    confidence: float
    protective_action: int
observation_space:
  type: continuous
  dim: 103
episode_length: 500
```
pyproject.toml (ADDED, @@ -0,0 +1,19 @@)

```toml
[build-system]
requires = ["setuptools>=61.0"]
# "setuptools.backends.legacy:build" is not a valid backend; the correct
# PEP 517 backend for setuptools is build_meta.
build-backend = "setuptools.build_meta"

[project]
name = "pll-cyberattack-detection"
version = "1.0.0"
description = "OpenEnv for cyberattack detection on SRF-PLLs in grid-connected inverters"
requires-python = ">=3.10"
dependencies = [
    "fastapi",
    "uvicorn",
    "pydantic",
    "numpy",
    "openenv-core>=0.2.0",
]

[project.scripts]
server = "server.app:main"
```
requirements.txt
ADDED
@@ -0,0 +1,6 @@
fastapi
uvicorn
pydantic
numpy
openai
requests
server/app.py
ADDED
@@ -0,0 +1,19 @@
"""
server/app.py — Server entry point for openenv validate compatibility.
"""
import uvicorn
from src.api import app  # noqa: F401


def main():
    """Start the FastAPI server."""
    uvicorn.run(
        "src.api:app",
        host="0.0.0.0",
        port=7860,
        reload=False,
    )


if __name__ == "__main__":
    main()
src/__init__.py
ADDED
@@ -0,0 +1 @@
# PLL Cyberattack Detection OpenEnv
src/api.py
ADDED
@@ -0,0 +1,71 @@
"""
FastAPI application for the PLL Cyberattack Detection OpenEnv.

Exposes HTTP endpoints for environment interaction:
    POST /reset  — Reset environment with task_id
    POST /step   — Submit an action and advance one step
    GET  /state  — Get current internal state
    GET  /health — Health check (returns 200)
"""

from fastapi import FastAPI
from pydantic import BaseModel
from typing import Any, Dict, Optional

from src.models import Observation, Action, Reward, State
from src.env import PLLAttackEnv


app = FastAPI(
    title="PLL Cyberattack Detection OpenEnv",
    description="OpenEnv for AI-driven cyberattack detection on SRF-PLLs",
    version="1.0.0",
)

# Global environment instance
env = PLLAttackEnv()


class ResetRequest(BaseModel):
    """Request body for /reset endpoint."""
    task_id: int = 0
    seed: Optional[int] = None  # `int = None` would fail pydantic validation


class StepResponse(BaseModel):
    """Response body for /step endpoint."""
    observation: Observation
    reward: Reward
    done: bool
    info: Dict[str, Any]


@app.post("/reset", response_model=Observation)
async def reset(request: ResetRequest):
    """Reset the environment and return initial observation."""
    obs = env.reset(task_id=request.task_id, seed=request.seed)
    return obs


@app.post("/step", response_model=StepResponse)
async def step(action: Action):
    """Submit an action and advance the environment one step."""
    obs, reward, done, info = env.step(action)
    return StepResponse(
        observation=obs,
        reward=reward,
        done=done,
        info=info,
    )


@app.get("/state", response_model=State)
async def get_state():
    """Return the current internal state."""
    return env.get_state()


@app.get("/health")
async def health():
    """Health check endpoint."""
    return {"status": "ok"}
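A minimal client sketch for the endpoints above. The base URL assumes a local uvicorn deployment on port 7860 (as in the Dockerfile); the payload helpers only restate the `ResetRequest` and `Action` field names, so they can be checked without a running server:

```python
import json
from typing import Optional

BASE_URL = "http://localhost:7860"  # assumption: local server on port 7860


def build_reset_payload(task_id: int = 0, seed: Optional[int] = None) -> dict:
    """Body for POST /reset (ResetRequest fields)."""
    return {"task_id": task_id, "seed": seed}


def build_step_payload(attack_detected: bool, attack_type: int,
                       confidence: float, protective_action: int) -> dict:
    """Body for POST /step (Action fields from the action_space spec)."""
    return {
        "attack_detected": attack_detected,
        "attack_type": attack_type,
        "confidence": confidence,
        "protective_action": protective_action,
    }


if __name__ == "__main__":
    # Requires the server to be running: uvicorn src.api:app --port 7860
    import requests

    obs = requests.post(f"{BASE_URL}/reset",
                        json=build_reset_payload(task_id=0, seed=42)).json()
    step = requests.post(f"{BASE_URL}/step",
                         json=build_step_payload(False, 0, 0.5, 0)).json()
    print(json.dumps(step["reward"], indent=2))
```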
src/attacks.py
ADDED
@@ -0,0 +1,136 @@
"""
Attack injection logic for the PLL Cyberattack Detection OpenEnv.

Implements four attack types:
1. Sinusoidal FDI (Easy)
2. Ramp injection (Medium)
3. Pulse/step bias (Medium)
4. Stealthy low-and-slow phase drift (Hard)
"""

import math
import numpy as np
from typing import Dict, Any


def sample_sinusoidal_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a sinusoidal FDI attack."""
    return {
        "type": "sinusoidal",
        "amplitude": float(rng.uniform(0.05, 0.20)),
        "freq": float(rng.uniform(5.0, 20.0)),
        "phase": float(rng.uniform(0.0, 2.0 * math.pi)),
    }


def sample_ramp_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a ramp injection attack."""
    return {
        "type": "ramp",
        "rate": float(rng.uniform(0.0002, 0.001)),
    }


def sample_pulse_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a pulse/step bias attack."""
    return {
        "type": "pulse",
        "magnitude": float(rng.uniform(0.1, 0.3)),
        "duration": int(rng.integers(20, 81)),  # 20 to 80 steps inclusive
    }


def sample_stealthy_params(rng: np.random.Generator) -> Dict[str, Any]:
    """Sample parameters for a stealthy low-and-slow attack."""
    return {
        "type": "stealthy",
        "amplitude": 0.03,
        "drift_rate": float(rng.uniform(0.05, 0.2)),
    }


def sample_attack_start(rng: np.random.Generator) -> int:
    """Sample a random attack start step between 30 and 80 inclusive."""
    return int(rng.integers(30, 81))


class AttackGenerator:
    """Generates attack signals given parameters and current simulation state."""

    def __init__(self, attack_params: Dict[str, Any], attack_start_step: int):
        self.params = attack_params
        self.attack_start_step = attack_start_step
        self.attack_type_str = attack_params.get("type", "none")

        # For stealthy attack: track cumulative phase drift
        self.delta = 0.0

    def get_signal(self, current_step: int, sim_time: float) -> float:
        """
        Compute the attack signal value at the given step.

        Args:
            current_step: Current environment step (0-indexed).
            sim_time: Current simulation time in seconds.

        Returns:
            Attack signal value (pu). Returns 0.0 if attack not yet started.
        """
        if current_step < self.attack_start_step:
            return 0.0

        steps_since_start = current_step - self.attack_start_step
        dt = 1e-3  # time step

        if self.attack_type_str == "sinusoidal":
            A = self.params["amplitude"]
            fa = self.params["freq"]
            phi = self.params["phase"]
            return A * math.sin(2.0 * math.pi * fa * sim_time + phi)

        elif self.attack_type_str == "ramp":
            rate = self.params["rate"]
            return rate * steps_since_start

        elif self.attack_type_str == "pulse":
            mag = self.params["magnitude"]
            dur = self.params["duration"]
            if steps_since_start < dur:
                return mag
            else:
                return 0.0

        elif self.attack_type_str == "stealthy":
            A_s = self.params["amplitude"]
            drift_rate = self.params["drift_rate"]
            # δ(t) = δ(t-1) + drift_rate * Δt — accumulated each call
            self.delta += drift_rate * dt
            f0 = 50.0
            return A_s * math.sin(2.0 * math.pi * f0 * sim_time + self.delta)

        return 0.0

    def is_active(self, current_step: int) -> bool:
        """Check if the attack is currently active at this step."""
        if current_step < self.attack_start_step:
            return False

        # Pulse attacks end after duration
        if self.attack_type_str == "pulse":
            steps_since_start = current_step - self.attack_start_step
            dur = self.params["duration"]
            return steps_since_start < dur

        return True


def get_attack_type_id(attack_type_str: str) -> int:
    """Map attack type string to integer ID."""
    mapping = {
        "none": 0,
        "sinusoidal": 1,
        "ramp": 2,
        "pulse": 3,
        "stealthy": 4,
    }
    return mapping.get(attack_type_str, 0)
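The signal arithmetic in `AttackGenerator.get_signal()` can be spot-checked in isolation. This sketch re-derives the sinusoidal and pulse branches as pure functions (the parameter values are arbitrary examples, not sampled from the environment's distributions):

```python
import math


def sinusoidal_signal(t: float, amplitude: float, freq: float, phase: float) -> float:
    """Same formula as the 'sinusoidal' branch: A * sin(2*pi*f*t + phi)."""
    return amplitude * math.sin(2.0 * math.pi * freq * t + phase)


def pulse_signal(steps_since_start: int, magnitude: float, duration: int) -> float:
    """Same logic as the 'pulse' branch: constant bias that ends after `duration` steps."""
    return magnitude if steps_since_start < duration else 0.0


# A 0.1 pu sine at 10 Hz with zero phase peaks at t = 25 ms (quarter period).
peak = sinusoidal_signal(0.025, amplitude=0.1, freq=10.0, phase=0.0)  # -> 0.1

# A pulse is active for the first `duration` steps, then drops to zero.
on = pulse_signal(19, magnitude=0.2, duration=20)   # -> 0.2
off = pulse_signal(20, magnitude=0.2, duration=20)  # -> 0.0
```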
src/env.py
ADDED
@@ -0,0 +1,380 @@
"""
Main environment class for the PLL Cyberattack Detection OpenEnv.

Implements step(), reset(), get_state(), and compute_reward().
Manages the PLL simulation, attack injection, observation windowing,
episode history, and grading.

Fixes applied vs previous version:
1. grade_task_easy() now receives attack_start_step (was missing, causing
   TypeError at episode end for task_id=0).
2. attack_active is set from AttackGenerator.is_active() at the same point
   the attack signal is computed — a single source of truth that prevents
   signal/label divergence (pulse attacks deactivate after their duration).
3. Lock-loss check guarded by step_count > attack_start_step — prevents
   spurious lock-loss from PLL transient on step 0.
4. Task 3 early termination added: done=True when lock_lost, not just at
   step 500. Avoids 200+ meaningless steps after failure.
5. _get_observation() updated to remove theta_err_window (ground-truth
   leak) and add omega_deviation_window (raw omega deviation in rad/s),
   matching the corrected Observation model.
6. theta_err_window deque removed from instance state.
7. Initial raw_voltages fixed: the PLL is warm-started with WINDOW_SIZE
   silent steps so the observation windows are full and va_m/vb_m/vc_m
   are non-zero at reset() return.
8. omega_deviation_window deque added for the new Observation field.
"""

import uuid
import numpy as np
from typing import Tuple, Dict, Any, List, Optional
from collections import deque

from src.models import Observation, Action, Reward, State
from src.pll_sim import SRFPLLSimulator, OMEGA0
from src.attacks import (
    AttackGenerator,
    sample_sinusoidal_params,
    sample_ramp_params,
    sample_pulse_params,
    sample_stealthy_params,
    sample_attack_start,
    get_attack_type_id,
)
from src.graders import grade_task_easy, grade_task_medium, grade_task_hard


WINDOW_SIZE = 20
MAX_STEPS = 500
LOCK_LOSS_THRESHOLD = 0.0873  # 5 degrees in radians


class PLLAttackEnv:
    """OpenEnv-compliant PLL cyberattack detection environment."""

    def __init__(self):
        self.pll = SRFPLLSimulator()
        self.rng: Optional[np.random.Generator] = None
        self.task_id = 0
        self.step_count = 0
        self.episode_id = ""
        self.done = False

        # Attack state
        self.attack_generator: Optional[AttackGenerator] = None
        self.attack_active = False
        self.attack_type = 0
        self.attack_params: Dict[str, Any] = {}
        self.attack_start_step = 0
        self.true_attack_type = 0

        # Detection tracking
        self.first_detection_recorded = False
        self.first_detection_step = 0

        # Lock loss tracking (Task 2 / hard)
        self.lock_lost = False
        self.lock_loss_step: Optional[int] = None
        self.lock_loss_penalized = False

        # Observation windows (Fix 6: theta_err_window removed)
        self.vq_window: deque = deque(maxlen=WINDOW_SIZE)
        self.vd_window: deque = deque(maxlen=WINDOW_SIZE)
        self.omega_window: deque = deque(maxlen=WINDOW_SIZE)
        self.omega_deviation_window: deque = deque(maxlen=WINDOW_SIZE)  # Fix 8

        # Episode history for grading
        self.history: List[Dict[str, Any]] = []

    # ------------------------------------------------------------------
    # Public API
    # ------------------------------------------------------------------

    def reset(self, task_id: int = 0, seed: Optional[int] = None) -> Observation:
        """
        Reset the environment for a new episode.

        Args:
            task_id: 0=easy (sinusoidal), 1=medium (multi-type),
                     2=hard (stealthy).
            seed: Optional RNG seed for reproducibility.

        Returns:
            Initial Observation with non-zero raw_voltages.
        """
        self.rng = np.random.default_rng(seed)  # seed=None → random

        self.task_id = task_id
        self.step_count = 0
        self.episode_id = str(uuid.uuid4())
        self.done = False

        # Reset PLL simulator
        self.pll.reset()

        # Reset detection tracking
        self.first_detection_recorded = False
        self.first_detection_step = 0

        # Reset lock-loss tracking
        self.lock_lost = False
        self.lock_loss_step = None
        self.lock_loss_penalized = False

        # Reset history
        self.history = []

        # Reset observation windows (Fix 6: no theta_err_window)
        self.vq_window = deque(maxlen=WINDOW_SIZE)
        self.vd_window = deque(maxlen=WINDOW_SIZE)
        self.omega_window = deque(maxlen=WINDOW_SIZE)
        self.omega_deviation_window = deque(maxlen=WINDOW_SIZE)

        # Sample attack for this episode
        self._setup_attack()

        # Fix 7: warm-start PLL with WINDOW_SIZE silent steps so that
        # windows contain realistic (non-zero) PLL-settled values and
        # raw_voltages are non-zero on the first observation.
        for _ in range(WINDOW_SIZE):
            pll_out = self.pll.step(0.0)  # no attack during warm-up
            omega_norm = (pll_out["omega_hat"] - OMEGA0) / OMEGA0
            omega_dev = pll_out["omega_hat"] - OMEGA0
            self.vq_window.append(pll_out["vq"])
            self.vd_window.append(pll_out["vd"])
            self.omega_window.append(omega_norm)
            self.omega_deviation_window.append(omega_dev)
        # step_count stays at 0 — warm-up steps are invisible to the agent

        return self._get_observation()

    def step(self, action: Action) -> Tuple[Observation, Reward, bool, Dict[str, Any]]:
        """
        Advance the environment by one step.

        Args:
            action: Agent's Action for this step.

        Returns:
            (observation, reward, done, info)
        """
        if self.done:
            return (
                self._get_observation(),
                Reward(
                    total=0.0, detection_reward=0.0, classification_bonus=0.0,
                    early_detection_bonus=0.0, false_alarm_penalty=0.0,
                    lock_loss_penalty=0.0,
                ),
                True,
                {"message": "Episode already done. Call /reset to start a new episode."},
            )

        # --- Attack signal ------------------------------------------------
        # Fix 2: attack_active is set from is_active() right where the
        # attack signal is computed — one source of truth, so the label
        # tracks the attack schedule (pulses deactivate after their duration).
        attack_signal = self.attack_generator.get_signal(self.step_count, self.pll.t)
        self.attack_active = self.attack_generator.is_active(self.step_count)

        # --- Advance PLL --------------------------------------------------
        pll_out = self.pll.step(attack_signal)

        # --- Update observation windows -----------------------------------
        omega_norm = (pll_out["omega_hat"] - OMEGA0) / OMEGA0
        omega_dev = pll_out["omega_hat"] - OMEGA0  # raw deviation (rad/s)
        self.vq_window.append(pll_out["vq"])
        self.vd_window.append(pll_out["vd"])
        self.omega_window.append(omega_norm)
        self.omega_deviation_window.append(omega_dev)

        # --- Lock-loss check (Task 2 / hard only) -------------------------
        PLL_CONVERGENCE_STEPS = 60  # PLL transient settles by ~step 50, use 60 for margin
        if (
            self.task_id == 2
            and not self.lock_lost
            and self.step_count > self.attack_start_step
            and self.step_count > PLL_CONVERGENCE_STEPS  # ← guard against startup transient
        ):
            if abs(pll_out["theta_err"]) > LOCK_LOSS_THRESHOLD:
                self.lock_lost = True
                self.lock_loss_step = self.step_count

        # --- Reward -------------------------------------------------------
        reward = self.compute_reward(action)

        # --- Record history entry for graders ----------------------------
        self.history.append({
            "step": self.step_count,
            "attack_active": self.attack_active,
            "attack_detected": action.attack_detected,
            "true_attack_type": self.true_attack_type,
            "agent_attack_type": action.attack_type,
            "theta_err": pll_out["theta_err"],
        })

        # --- Advance step counter ----------------------------------------
        self.step_count += 1

        # --- Episode termination -----------------------------------------
        # Fix 4: Task 2 terminates early on lock-loss, not just at MAX_STEPS
        if self.step_count >= MAX_STEPS:
            self.done = True
        elif self.task_id == 2 and self.lock_lost:
            self.done = True  # early termination — no point continuing

        # --- Build info --------------------------------------------------
        info: Dict[str, Any] = {}
        if self.done:
            info["grader_score"] = self._compute_grader_score()
            info["episode_id"] = self.episode_id
            info["total_steps"] = self.step_count
            info["lock_lost"] = self.lock_lost

        return self._get_observation(), reward, self.done, info

    def compute_reward(self, action: Action) -> Reward:
        """
        Compute the dense reward signal for the current step.

        Reward components:
            detection_reward:      +0.10 true positive (per step)
                                   +0.05 true negative (per step)
                                   -0.05 missed detection (per step)
            false_alarm_penalty:   -0.20 per false-positive step
            classification_bonus:  +0.05 per step correct type (task 1 only)
            early_detection_bonus: one-time sparse, scaled by detection speed
            lock_loss_penalty:     -2.00 one-time on lock loss (task 2 only)
        """
        detection_reward = 0.0
        false_alarm_penalty = 0.0
        classification_bonus = 0.0
        early_detection_bonus = 0.0
        lock_loss_penalty = 0.0

        if self.attack_active:
            if action.attack_detected:
                detection_reward = 0.1
                # One-time early detection bonus on first correct detection
                if not self.first_detection_recorded:
                    self.first_detection_step = self.step_count
                    self.first_detection_recorded = True
                    # Relative steps since attack started
                    t = self.first_detection_step - self.attack_start_step
                    early_detection_bonus = max(0.0, 1.0 - t / 100.0)
            else:
                detection_reward = -0.05  # missed detection
        else:
            if action.attack_detected:
                false_alarm_penalty = -0.2  # false alarm
            else:
                detection_reward = 0.05  # correct true negative

        # Task 1 (medium): per-step classification bonus
        if self.task_id == 1 and self.attack_active:
            if action.attack_type == self.true_attack_type:
                classification_bonus = 0.05

        # Task 2 (hard): one-time lock-loss penalty
        if self.task_id == 2 and self.lock_lost and not self.lock_loss_penalized:
            lock_loss_penalty = -2.0
            self.lock_loss_penalized = True

        total = (
            detection_reward
            + false_alarm_penalty
            + classification_bonus
            + early_detection_bonus
            + lock_loss_penalty
        )

        return Reward(
            total=total,
            detection_reward=detection_reward,
            classification_bonus=classification_bonus,
            early_detection_bonus=early_detection_bonus,
            false_alarm_penalty=false_alarm_penalty,
            lock_loss_penalty=lock_loss_penalty,
        )

    def get_state(self) -> State:
        """Return full internal state for debugging / GET /state endpoint."""
        return State(
            theta_true=self.pll.theta_true,
            theta_hat=self.pll.theta_hat,
            omega_hat=self.pll.omega_hat,
            vq_integral=self.pll.vq_integral,
            attack_active=self.attack_active,
            attack_type=self.attack_type,
            attack_params=self.attack_params,
            attack_start_step=self.attack_start_step,
            lock_lost=self.lock_lost,
            step=self.step_count,
            episode_id=self.episode_id,
            task_id=self.task_id,
        )

    # ------------------------------------------------------------------
    # Private helpers
    # ------------------------------------------------------------------

    def _setup_attack(self) -> None:
        """Sample attack type and parameters based on current task_id."""
        self.attack_start_step = sample_attack_start(self.rng)

        if self.task_id == 0:
            # Easy: sinusoidal FDI only
            self.attack_params = sample_sinusoidal_params(self.rng)
            self.true_attack_type = 1

        elif self.task_id == 1:
            # Medium: random choice of sinusoidal / ramp / pulse
            choice = int(self.rng.integers(0, 3))
            if choice == 0:
                self.attack_params = sample_sinusoidal_params(self.rng)
                self.true_attack_type = 1
            elif choice == 1:
                self.attack_params = sample_ramp_params(self.rng)
                self.true_attack_type = 2
            else:
                self.attack_params = sample_pulse_params(self.rng)
                self.true_attack_type = 3

        elif self.task_id == 2:
            # Hard: stealthy low-and-slow
            self.attack_params = sample_stealthy_params(self.rng)
            self.true_attack_type = 4

        self.attack_type = get_attack_type_id(self.attack_params.get("type", "none"))
        self.attack_generator = AttackGenerator(self.attack_params, self.attack_start_step)

    def _get_observation(self) -> Observation:
        """
        Build the current Observation from internal windows.

        Fix 5: theta_err_window replaced with omega_deviation_window.
        theta_err requires knowing theta_true (not observable in a real
        inverter) and leaked ground truth directly to the agent.
        omega_deviation (omega_hat - OMEGA0 in rad/s) is a realistic proxy
        that correlates with phase drift under stealthy attacks.
        """
        return Observation(
            vq_window=list(self.vq_window),
            vd_window=list(self.vd_window),
            omega_window=list(self.omega_window),
            omega_deviation_window=list(self.omega_deviation_window),  # Fix 5
            raw_voltages=[self.pll.va_m, self.pll.vb_m, self.pll.vc_m],
            task_id=self.task_id,
            step=self.step_count,
        )

    def _compute_grader_score(self) -> float:
        """Run the appropriate grader at episode end."""
        if self.task_id == 0:
            return grade_task_easy(self.history, self.attack_start_step)
        elif self.task_id == 1:
            return grade_task_medium(self.history, self.attack_start_step)
        elif self.task_id == 2:
            return grade_task_hard(
                self.history,
                self.lock_loss_step,
                self.attack_start_step,
            )
        return 0.0
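The one-time `early_detection_bonus` in `compute_reward()` is `max(0, 1 - t/100)`, where `t` is the number of steps between attack onset and the first correct detection. A quick numeric check of that formula in isolation:

```python
def early_detection_bonus(first_detection_step: int, attack_start_step: int) -> float:
    """Mirrors the bonus formula in compute_reward(): max(0, 1 - t/100)."""
    t = first_detection_step - attack_start_step
    return max(0.0, 1.0 - t / 100.0)


# Detect immediately -> full bonus; 50 steps late -> half; >=100 steps late -> none.
immediate = early_detection_bonus(40, 40)   # -> 1.0
halfway = early_detection_bonus(90, 40)     # -> 0.5
too_late = early_detection_bonus(150, 40)   # -> 0.0
```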
src/graders.py
ADDED
@@ -0,0 +1,138 @@
"""
Per-task deterministic graders for the PLL Cyberattack Detection OpenEnv.

Each grader takes an episode history and returns a score in [0.0, 1.0].
Graders are deterministic given the same episode data.
"""

from typing import List, Dict, Any, Optional


def grade_task_easy(history: List[Dict[str, Any]], attack_start_step: int) -> float:
    """
    Task 1 — Sinusoidal FDI Detection (Easy).

    Grader logic (relative to attack onset):
        delay = first_correct_detection_step - attack_start_step
        if delay <= 20:    score = 1.0
        elif delay <= 100: score = linear decay from 1.0 to 0.5
        elif delay <= 420: score = 0.2
        else (never detected): score = 0.0
    """
    first_correct_detection_step = None

    for entry in history:
        step = entry["step"]
        attack_active = entry["attack_active"]
        attack_detected = entry["attack_detected"]

        if attack_active and attack_detected:
            first_correct_detection_step = step
            break

    if first_correct_detection_step is None:
        return 0.0

    delay = first_correct_detection_step - attack_start_step

    if delay <= 20:
        return 1.0
    elif delay <= 100:
        # Linear decay from 1.0 at delay=20 to 0.5 at delay=100
        return 1.0 - 0.5 * (delay - 20) / 80.0
    elif delay <= 420:
        return 0.2
    else:
        return 0.0
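The easy grader's score is piecewise in the detection delay. A standalone sketch of the same curve, useful for sanity-checking the breakpoints at delays 20, 100, and 420:

```python
def easy_score(delay: float) -> float:
    """Piecewise score from grade_task_easy as a function of detection delay."""
    if delay <= 20:
        return 1.0
    elif delay <= 100:
        return 1.0 - 0.5 * (delay - 20) / 80.0  # linear: 1.0 at 20 -> 0.5 at 100
    elif delay <= 420:
        return 0.2
    return 0.0


scores = [easy_score(d) for d in (0, 20, 60, 100, 200, 500)]
# -> [1.0, 1.0, 0.75, 0.5, 0.2, 0.0]
```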
| 47 |
+
|
| 48 |
+
|
| 49 |
+
def grade_task_medium(history: List[Dict[str, Any]], attack_start_step: int) -> float:
|
| 50 |
+
"""
|
| 51 |
+
Task 2 — Multi-Attack Classification (Medium).
|
| 52 |
+
|
| 53 |
+
Grader logic:
|
| 54 |
+
base_score = fraction of steps (after attack_start) where attack_type is correctly classified
|
| 55 |
+
early_bonus = 0.4 * max(0, 1 - first_correct_classification_step / 100)
|
| 56 |
+
score = min(1.0, base_score * 0.6 + early_bonus)
|
| 57 |
+
"""
|
| 58 |
+
steps_after_attack = 0
|
| 59 |
+
correct_classifications = 0
|
| 60 |
+
first_correct_classification_step = None
|
| 61 |
+
|
| 62 |
+
for entry in history:
|
| 63 |
+
step = entry["step"]
|
| 64 |
+
if step < attack_start_step:
|
| 65 |
+
continue
|
| 66 |
+
|
| 67 |
+
steps_after_attack += 1
|
| 68 |
+
true_type = entry["true_attack_type"]
|
| 69 |
+
agent_type = entry["agent_attack_type"]
|
| 70 |
+
|
| 71 |
+
if agent_type == true_type:
|
| 72 |
+
correct_classifications += 1
|
| 73 |
+
if first_correct_classification_step is None:
|
| 74 |
+
first_correct_classification_step = step
|
| 75 |
+
|
| 76 |
+
if steps_after_attack == 0:
|
| 77 |
+
return 0.0
|
| 78 |
+
|
| 79 |
+
base_score = correct_classifications / steps_after_attack
|
| 80 |
+
|
| 81 |
+
if first_correct_classification_step is not None:
|
| 82 |
+
early_bonus = 0.4 * max(0.0, 1.0 - first_correct_classification_step / 100.0)
|
| 83 |
+
else:
|
| 84 |
+
early_bonus = 0.0
|
| 85 |
+
|
| 86 |
+
score = min(1.0, base_score * 0.6 + early_bonus)
|
| 87 |
+
return max(0.0, score)
|
| 88 |
+
|
| 89 |
+
|
| 90 |
+
def grade_task_hard(
|
| 91 |
+
history: List[Dict[str, Any]],
|
| 92 |
+
loss_of_lock_step: Optional[int],
|
| 93 |
+
attack_start_step: int,
|
| 94 |
+
) -> float:
|
| 95 |
+
"""
|
| 96 |
+
Task 3 — Stealthy Low-and-Slow Attack (Hard).
|
| 97 |
+
|
| 98 |
+
Grader logic:
|
| 99 |
+
if detected before loss_of_lock_step:
|
| 100 |
+
score = 1.0 * (1 - first_detection_step / loss_of_lock_step)
|
| 101 |
+
elif detected after loss_of_lock but before episode end:
|
| 102 |
+
score = 0.3
|
| 103 |
+
else (never detected):
|
| 104 |
+
score = 0.0
|
| 105 |
+
false_alarm_penalty = 0.2 per false alarm before attack starts
|
| 106 |
+
(capped at reducing score to 0.0 minimum)
|
| 107 |
+
"""
|
| 108 |
+
first_detection_step = None
|
| 109 |
+
false_alarm_count = 0
|
| 110 |
+
|
| 111 |
+
for entry in history:
|
| 112 |
+
step = entry["step"]
|
| 113 |
+
attack_active = entry["attack_active"]
|
| 114 |
+
attack_detected = entry["attack_detected"]
|
| 115 |
+
|
| 116 |
+
# Only count false alarms before the attack starts
|
| 117 |
+
if attack_detected and not attack_active and step < attack_start_step:
|
| 118 |
+
false_alarm_count += 1
|
| 119 |
+
|
| 120 |
+
if attack_detected and attack_active and first_detection_step is None:
|
| 121 |
+
first_detection_step = step
|
| 122 |
+
|
| 123 |
+
# Compute base score
|
| 124 |
+
if first_detection_step is None:
|
| 125 |
+
score = 0.0
|
| 126 |
+
elif loss_of_lock_step is not None and first_detection_step < loss_of_lock_step:
|
| 127 |
+
score = 1.0 * (1.0 - first_detection_step / loss_of_lock_step)
|
| 128 |
+
elif loss_of_lock_step is not None and first_detection_step >= loss_of_lock_step:
|
| 129 |
+
score = 0.3
|
| 130 |
+
else:
|
| 131 |
+
# No loss of lock occurred but attack was detected
|
| 132 |
+
score = 0.3
|
| 133 |
+
|
| 134 |
+
# Apply false alarm penalty
|
| 135 |
+
penalty = 0.2 * false_alarm_count
|
| 136 |
+
score = max(0.0, score - penalty)
|
| 137 |
+
|
| 138 |
+
return min(1.0, score)
|
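As a quick sanity check, the Task 1 tiers can be exercised against a synthetic episode history. The sketch below re-implements the same piecewise formula inline (it does not import src.graders); the entry keys match what the graders read, and the step values are illustrative.

```python
# A minimal sketch exercising the Task 1 scoring tiers against a
# synthetic history. score_from_delay mirrors grade_task_easy's tiers.

def score_from_delay(delay):
    if delay <= 20:
        return 1.0
    elif delay <= 100:
        return 1.0 - 0.5 * (delay - 20) / 80.0  # linear decay 1.0 -> 0.5
    elif delay <= 420:
        return 0.2
    return 0.0

# Attack starts at step 50; the agent first flags it at step 90.
attack_start = 50
history = [
    {"step": s, "attack_active": s >= attack_start, "attack_detected": s >= 90}
    for s in range(200)
]

# First step where the attack is active AND flagged (same scan the grader does)
first = next(e["step"] for e in history
             if e["attack_active"] and e["attack_detected"])
print(score_from_delay(first - attack_start))  # delay 40 -> 0.875
```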
src/models.py ADDED
@@ -0,0 +1,74 @@
"""
Pydantic models for the PLL Cyberattack Detection OpenEnv.
Defines Observation, Action, Reward, and State schemas.
"""
import numpy as np
from typing import Annotated, Any, Dict, List
from pydantic import BaseModel, Field, model_validator

# Exactly 20 floats — enforced at validation time, not just documented.
WindowList = Annotated[List[float], Field(min_length=20, max_length=20)]

# Exactly 3 floats for [va, vb, vc].
VoltageList = Annotated[List[float], Field(min_length=3, max_length=3)]


class Observation(BaseModel):
    vq_window: WindowList
    vd_window: WindowList
    omega_window: WindowList
    omega_deviation_window: WindowList
    raw_voltages: VoltageList
    task_id: int = Field(ge=0, le=2)
    step: int = Field(ge=0)


class Action(BaseModel):
    attack_detected: bool
    attack_type: int = Field(ge=0, le=4)
    confidence: float = Field(ge=0.0, le=1.0)
    protective_action: int = Field(ge=0, le=3)


class Reward(BaseModel):
    total: float
    detection_reward: float
    classification_bonus: float
    early_detection_bonus: float
    false_alarm_penalty: float
    lock_loss_penalty: float


class State(BaseModel):
    theta_true: float
    theta_hat: float
    omega_hat: float
    vq_integral: float
    attack_active: bool
    attack_type: int  # Integer ID of the current attack: 0=none, 1=sinusoidal, 2=ramp, 3=pulse, 4=stealthy.
    attack_params: Dict[str, Any]
    attack_start_step: int
    lock_lost: bool  # Whether the PLL has lost lock (|theta_err| > 5°). Task 2 only.
    step: int = Field(ge=0)
    episode_id: str
    task_id: int = Field(ge=0, le=2)

    @model_validator(mode="before")
    @classmethod
    def coerce_attack_params(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        """
        Coerce numpy scalar types inside attack_params to native Python types.
        sample_*_params() casts with float()/int(), but a future contributor
        may forget. This validator ensures JSON serialization never fails due
        to np.float32 / np.int64 / np.bool_ leaking into the params dict.
        """
        params = values.get("attack_params", {})
        if isinstance(params, dict):
            coerced = {}
            for k, v in params.items():
                if isinstance(v, np.floating):
                    coerced[k] = float(v)
                elif isinstance(v, np.integer):
                    coerced[k] = int(v)
                elif isinstance(v, np.bool_):
                    coerced[k] = bool(v)
                else:
                    coerced[k] = v
            values["attack_params"] = coerced
        return values
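The coercion that the State validator performs can be sketched standalone, without pydantic, to show why it matters: np.float32 and np.int64 scalars are not JSON-serializable by the stdlib, while their coerced native counterparts are. The function and parameter names below are illustrative, not part of the repo's API.

```python
import json
import numpy as np

# Standalone sketch of the narrowing done by State.coerce_attack_params:
# numpy scalars become native Python types so json.dumps never fails.

def coerce_params(params: dict) -> dict:
    coerced = {}
    for k, v in params.items():
        if isinstance(v, np.floating):
            coerced[k] = float(v)
        elif isinstance(v, np.integer):
            coerced[k] = int(v)
        elif isinstance(v, np.bool_):
            coerced[k] = bool(v)
        else:
            coerced[k] = v
    return coerced

# Hypothetical params dict with numpy scalars, as a simulator might produce.
raw = {"amplitude": np.float32(0.25), "period": np.int64(40), "on": np.bool_(True)}
print(json.dumps(coerce_params(raw)))  # serializes cleanly
```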
src/pll_sim.py ADDED
@@ -0,0 +1,119 @@
"""
SRF-PLL Discrete-Time Simulation.

Implements the Synchronous Reference Frame Phase-Locked Loop used in
grid-connected inverters. Discrete time step Δt = 1 ms.

Steps:
    1. Generate true 3-phase grid voltages (50 Hz, 1.0 pu)
    2. Apply attack injection on va
    3. Clarke transform (αβ)
    4. Park transform (dq) using estimated angle θ̂
    5. PI controller to update ω̂ and θ̂
    6. Compute phase error
"""

import math


# Constants
V_NOM = 1.0                  # Nominal voltage (pu)
F0 = 50.0                    # Grid frequency (Hz)
OMEGA0 = 2.0 * math.pi * F0  # Nominal angular freq (rad/s)
DT = 1e-3                    # Time step (1 ms)
KP = 50.0                    # PI proportional gain
KI = 1500.0                  # PI integral gain


def wrap_angle(angle: float) -> float:
    """Wrap angle to [-π, π]."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


class SRFPLLSimulator:
    """Discrete-time SRF-PLL simulator."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Reset PLL state to initial conditions."""
        self.t = 0.0             # Simulation time (s)
        self.theta_true = 0.0    # True grid angle (rad)
        self.theta_hat = 0.0     # Estimated angle (rad)
        self.omega_hat = OMEGA0  # Estimated angular freq (rad/s)
        self.vq_integral = 0.0   # Integral of vq for PI controller

        # Current signal values
        self.vd = 0.0
        self.vq = 0.0
        self.va_m = 0.0
        self.vb_m = 0.0
        self.vc_m = 0.0
        self.theta_err = 0.0

    def step(self, attack_signal: float = 0.0):
        """
        Advance the PLL by one time step.

        Args:
            attack_signal: Attack injection added to va (pu).

        Returns:
            dict with vd, vq, omega_hat, theta_err, va_m, vb_m, vc_m, theta_true, theta_hat
        """
        # Step 1 — True three-phase grid voltages. Cosine-referenced so the
        # cos-based Park transform below yields vd ≈ 1 pu and vq ≈ 0 at lock
        # (sin-referenced phases would lock the loop 90° off).
        va = V_NOM * math.cos(self.theta_true)
        vb = V_NOM * math.cos(self.theta_true - 2.0 * math.pi / 3.0)
        vc = V_NOM * math.cos(self.theta_true + 2.0 * math.pi / 3.0)

        # Step 2 — Apply attack injection on va
        va_m = va + attack_signal
        vb_m = vb
        vc_m = vc

        # Step 3 — Clarke transform (αβ), two-measurement form:
        # exact when va + vb + vc == 0, so the attack term leaks into v_beta
        v_alpha = va_m
        v_beta = (va_m + 2.0 * vb_m) / math.sqrt(3.0)

        # Step 4 — Park transform (dq) using estimated angle θ̂
        cos_th = math.cos(self.theta_hat)
        sin_th = math.sin(self.theta_hat)
        vd = v_alpha * cos_th + v_beta * sin_th
        vq = -v_alpha * sin_th + v_beta * cos_th

        # Step 5 — PI controller; vq is the phase-error signal
        self.vq_integral += vq * DT
        omega_hat = OMEGA0 + KP * vq + KI * self.vq_integral
        self.theta_hat += omega_hat * DT

        # Advance true angle
        self.theta_true += OMEGA0 * DT

        # Step 6 — Phase error wrapped to [-π, π]
        theta_err = wrap_angle(self.theta_hat - self.theta_true)

        # Update time
        self.t += DT

        # Store current values
        self.vd = vd
        self.vq = vq
        self.omega_hat = omega_hat
        self.va_m = va_m
        self.vb_m = vb_m
        self.vc_m = vc_m
        self.theta_err = theta_err

        return {
            "vd": vd,
            "vq": vq,
            "omega_hat": omega_hat,
            "theta_err": theta_err,
            "va_m": va_m,
            "vb_m": vb_m,
            "vc_m": vc_m,
            "theta_true": self.theta_true,
            "theta_hat": self.theta_hat,
        }
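The Clarke/Park pipeline can be checked in isolation: assuming a balanced, cosine-referenced three-phase set and a perfect angle estimate, the dq projection gives vd ≈ 1 pu and vq ≈ 0 (vq being the PI error signal), and a lagging estimate gives a positive vq that pushes ω̂ up. The helper below inlines the same two transforms; the names are illustrative.

```python
import math

# Inline check of the Clarke/Park math: at lock (theta_hat == theta_true)
# the dq frame shows vd ~= 1.0 pu and vq ~= 0.

def abc_to_dq(va, vb, theta_hat):
    # Two-measurement Clarke (exact for a balanced set, va + vb + vc == 0)
    v_alpha = va
    v_beta = (va + 2.0 * vb) / math.sqrt(3.0)
    # Park with the estimated angle
    vd = v_alpha * math.cos(theta_hat) + v_beta * math.sin(theta_hat)
    vq = -v_alpha * math.sin(theta_hat) + v_beta * math.cos(theta_hat)
    return vd, vq

theta = 0.7  # arbitrary true grid angle (rad)
va = math.cos(theta)
vb = math.cos(theta - 2.0 * math.pi / 3.0)

vd, vq = abc_to_dq(va, vb, theta)  # perfect estimate
print(vd, vq)  # close to 1.0 and 0.0

# A lagging estimate produces vq > 0, which drives omega_hat upward.
_, vq_lag = abc_to_dq(va, vb, theta - 0.1)
print(vq_lag > 0)
```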
uv.lock ADDED
The diff for this file is too large to render. See raw diff.