statestrike-env / README.md
sh4shv4t's picture
docs: add HF Space YAML frontmatter with openenv tag
5383444
metadata
title: RyFlow
emoji: 🌊
colorFrom: purple
colorTo: red
sdk: docker
python_version: '3.11'
app_file: app.py
pinned: false
tags:
  - openenv

/ | | | | / __ | | () | | ( | |_ __ | | | | | | | _ | | _ | __/ ` | / _ \ | | | | | | | | | _) | || (| | || / || | | |_| | | | |__/ __,|__|____|_,||_|

StateStrike Security Audit Environment An OpenEnv-ready stateful API security environment for real-world vulnerability triage.

Python 3.11 OpenEnv Compliant MIT License HF Spaces

Environment Description and Motivation

StateStrike models a practical security engineering workflow: systematic API auditing to discover, classify, and chain exploitable behaviors in a production-like service.

Unlike toy game environments, the agent performs genuine tasks security teams run in real engagements:

  • Endpoint reachability mapping
  • Vulnerability probing and classification
  • Stateful exploit-chain execution

This design creates measurable operational value: better API hardening and earlier detection of latency-amplifying attack paths.

Action Space

Field Type Description Values
endpoint EndpointChoice Target API operation POST /users, GET /users/{id}, POST /orders, GET /orders, GET /health
payload_strategy PayloadStrategy Payload mutation strategy valid, redos, oversized, malformed
target_user_id Optional[int] User context for stateful calls null or integer user id

Observation Space

Field Type Description
step int Current episode step
endpoint_called str Executed endpoint
http_status int HTTP response code
latency_ms float Request latency in milliseconds
response_body dict[str, Any] Parsed response payload
session_order_count int Number of orders created in session
endpoints_discovered list[str] Reachable endpoints found so far
vulnerabilities_found list[str] Confirmed vulnerability labels
task_progress float Normalized task completion in [0.0, 1.0]

Task Descriptions

Task Difficulty Max Steps Success Threshold Description
endpoint_discovery easy 20 0.60 Find all reachable API endpoints
vulnerability_probe medium 30 0.50 Find and classify vulnerabilities (redos, db_degradation)
exploit_chain hard 60 0.75 Execute full stateful exploit chain with evidence

Reward Function

Step reward is normalized to [0.0, 1.0] and shaped by true task progress:

R_step = clamp(Delta task_score + bonuses - penalties)

Components:

  • Delta task score: max(0, score_t - score_t-1), capped to 0.30
  • +0.05 for a newly discovered endpoint
  • +0.10 for a newly confirmed vulnerability
  • -0.02 for repeated identical no-op action
  • +0.20 terminal completion bonus when task is solved

Anti-hacking properties:

  • One-time vulnerability flags prevent bounty farming
  • Chain cooldown and order-growth guards prevent POST/GET cycling exploits
  • Baseline latency updated via EMA only on successful steps
  • Connection failures produce neutral reward and never corrupt baseline

Setup Instructions

Docker (single command)

docker build -t statestrike .
docker run -p 7860:7860 statestrike

Local Python

python -m pip install -r requirements.txt
cp .env.example .env
uvicorn honeypot.app:app --host 0.0.0.0 --port 8000
HONEYPOT_URL=http://localhost:8000 uvicorn statestrike_env.environment:app --host 0.0.0.0 --port 7860
python inference.py

HF Space URL

Set this to your deployed environment Space URL:

Baseline Scores

Task Baseline Score Model
endpoint_discovery 0.600 Qwen/Qwen2.5-72B-Instruct
vulnerability_probe 0.400 Qwen/Qwen2.5-72B-Instruct
exploit_chain 0.000 Qwen/Qwen2.5-72B-Instruct

OpenEnv Compliance Checklist

  • Real-world task framing (security audit)
  • Typed Pydantic action/observation/state models
  • reset(), step(), state(), close() implemented
  • Three graded tasks (easy, medium, hard)
  • Graders produce normalized scores in [0.0, 1.0]
  • Partial-progress reward shaping
  • Root inference.py with [START]/[STEP]/[END] format
  • Root openenv.yaml manifest
  • Single-container Docker runtime with /health and /reset

Architecture Diagram

+-------------------------------+
| HF Space Container            |
|  +-------------------------+  |
|  | Honeypot API :8000      |  |
|  +-------------------------+  |
|  | OpenEnv Server :7860    |  |
|  | /reset /step /state     |  |
|  +-------------------------+  |
+---------------+---------------+
                |
                v
       inference.py (LLM agent)

License

MIT