
title: RyFlow
emoji: 🌊
colorFrom: purple
colorTo: red
sdk: docker
python_version: '3.11'
app_file: app.py
pinned: false
tags:
  - openenv


StateStrike Security Audit Environment

An OpenEnv-ready stateful API security environment for real-world vulnerability triage.

Environment Description and Motivation

StateStrike models a practical security engineering workflow: systematic API auditing to discover, classify, and chain exploitable behaviors in a production-like service.

Unlike in toy game environments, the agent here performs tasks that security teams carry out in real engagements:

Endpoint reachability mapping
Vulnerability probing and classification
Stateful exploit-chain execution

This design creates measurable operational value: better API hardening and earlier detection of latency-amplifying attack paths.

Action Space

Field	Type	Description	Values
endpoint	EndpointChoice	Target API operation	POST /users, GET /users/{id}, POST /orders, GET /orders, GET /health
payload_strategy	PayloadStrategy	Payload mutation strategy	valid, redos, oversized, malformed
target_user_id	Optional[int]	User context for stateful calls	null or integer user id
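As a rough illustration of the table above, the action can be modeled as a small typed record. The checklist further down mentions Pydantic models; the sketch below uses only the standard library (`dataclasses`, `enum`) as a stand-in. The enum member names and the `to_dict` helper are assumptions made for illustration, not the project's actual definitions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EndpointChoice(str, Enum):
    # Values are copied from the action-space table; member names are illustrative.
    CREATE_USER = "POST /users"
    GET_USER = "GET /users/{id}"
    CREATE_ORDER = "POST /orders"
    LIST_ORDERS = "GET /orders"
    HEALTH = "GET /health"


class PayloadStrategy(str, Enum):
    VALID = "valid"
    REDOS = "redos"
    OVERSIZED = "oversized"
    MALFORMED = "malformed"


@dataclass(frozen=True)
class Action:
    endpoint: EndpointChoice
    # Assumption: "valid" is a sensible default; the source does not state one.
    payload_strategy: PayloadStrategy = PayloadStrategy.VALID
    target_user_id: Optional[int] = None

    def to_dict(self) -> dict:
        """Serialize to JSON-compatible values (hypothetical helper)."""
        return {
            "endpoint": self.endpoint.value,
            "payload_strategy": self.payload_strategy.value,
            "target_user_id": self.target_user_id,
        }


action = Action(EndpointChoice.GET_USER, PayloadStrategy.MALFORMED, target_user_id=1)
```

Using enums for `endpoint` and `payload_strategy` means an out-of-range value fails at construction time rather than reaching the target API.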

Observation Space

Field	Type	Description
step	int	Current episode step
endpoint_called	str	Executed endpoint
http_status	int	HTTP response code
latency_ms	float	Request latency in milliseconds
response_body	dict[str, Any]	Parsed response payload
session_order_count	int	Number of orders created in session
endpoints_discovered	list[str]	Reachable endpoints found so far
vulnerabilities_found	list[str]	Confirmed vulnerability labels
task_progress	float	Normalized task completion in [0.0, 1.0]
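The observation fields above could be captured in a record like the following. This is a minimal standard-library sketch rather than the project's Pydantic model; the default values and the range check in `__post_init__` are assumptions, with only the field names, types, and the [0.0, 1.0] range taken from the table.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Observation:
    step: int
    endpoint_called: str
    http_status: int
    latency_ms: float
    # Defaults below are assumptions for convenience, not documented behavior.
    response_body: dict[str, Any] = field(default_factory=dict)
    session_order_count: int = 0
    endpoints_discovered: list[str] = field(default_factory=list)
    vulnerabilities_found: list[str] = field(default_factory=list)
    task_progress: float = 0.0

    def __post_init__(self) -> None:
        # The table states that task_progress is normalized to [0.0, 1.0].
        if not 0.0 <= self.task_progress <= 1.0:
            raise ValueError("task_progress must be within [0.0, 1.0]")


obs = Observation(
    step=1,
    endpoint_called="GET /health",
    http_status=200,
    latency_ms=12.5,
    endpoints_discovered=["GET /health"],
    task_progress=0.2,
)
```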

Task Descriptions

Task	Difficulty	Max Steps	Success Threshold	Description
endpoint_discovery	easy	20	0.60	Find all reachable API endpoints
vulnerability_probe	medium	30	0.50	Find and classify vulnerabilities (redos, db_degradation)
exploit_chain	hard	60	0.75	Execute full stateful exploit chain with evidence
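The task table can be read as a small configuration. The sketch below encodes it and adds two helpers; both helpers rest on assumptions the table does not spell out, namely that a task counts as solved when the score is greater than or equal to the threshold, and that an episode ends on success or when the step budget runs out. `TaskSpec`, `is_solved`, and `episode_over` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    difficulty: str
    max_steps: int
    success_threshold: float


# Values copied from the task table.
TASKS = {
    "endpoint_discovery": TaskSpec("easy", 20, 0.60),
    "vulnerability_probe": TaskSpec("medium", 30, 0.50),
    "exploit_chain": TaskSpec("hard", 60, 0.75),
}


def is_solved(task_name: str, score: float) -> bool:
    """Assumption: solved means score >= success threshold."""
    return score >= TASKS[task_name].success_threshold


def episode_over(task_name: str, step: int, score: float) -> bool:
    """Assumption: stop on success or once max_steps is reached."""
    return is_solved(task_name, score) or step >= TASKS[task_name].max_steps
```

Under these assumptions, the reported baseline score of 0.600 on endpoint_discovery just meets its 0.60 threshold, while 0.400 on vulnerability_probe falls short of 0.50.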

Reward Function

Step reward is normalized to [0.0, 1.0] and shaped by true task progress:

R_step = clamp(Delta task_score + bonuses - penalties, 0.0, 1.0)

Components:

Delta task_score: max(0, score_t - score_{t-1}), capped at 0.30
+0.05 for a newly discovered endpoint
+0.10 for a newly confirmed vulnerability
-0.02 for a repeated identical no-op action
+0.20 terminal completion bonus when task is solved
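The components above can be combined into one function. The following is a sketch under stated assumptions, not the environment's actual grader: it assumes the endpoint and vulnerability bonuses apply once per new item found in a step, and that the clamp bounds are the [0.0, 1.0] range given for the step reward. The function and parameter names are hypothetical.

```python
def clamp(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def step_reward(
    prev_score: float,
    score: float,
    new_endpoints: int = 0,
    new_vulns: int = 0,
    repeated_noop: bool = False,
    solved: bool = False,
) -> float:
    # Only positive progress counts, and it is capped at 0.30 per step.
    delta = min(max(0.0, score - prev_score), 0.30)

    # Assumption: bonuses scale with the number of new items in this step.
    bonuses = 0.05 * new_endpoints + 0.10 * new_vulns
    if solved:
        bonuses += 0.20  # terminal completion bonus

    penalties = 0.02 if repeated_noop else 0.0
    return clamp(delta + bonuses - penalties)
```

With these bounds, a repeated no-op that makes no progress yields 0.0 rather than a negative reward, because the -0.02 penalty is clamped at the lower bound.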

Anti-hacking properties:

One-time vulnerability flags prevent bounty farming
Chain cooldown and order-growth guards prevent POST/GET cycling exploits
Baseline latency is updated via an exponential moving average (EMA) only on successful steps
Connection failures produce a neutral reward and never corrupt the baseline
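Two of these guards are easy to illustrate in isolation: the one-time vulnerability flag and the EMA latency baseline. The sketch below is an assumption-laden illustration, not the project's code. The smoothing factor `alpha`, the meaning of "successful" (passed in as a flag here), and the names `LatencyBaseline` and `vulnerability_bonus` are all hypothetical.

```python
from typing import Optional


class LatencyBaseline:
    """Exponential moving average of latency, updated only on successful steps."""

    def __init__(self, alpha: float = 0.2) -> None:
        self.alpha = alpha  # hypothetical smoothing factor
        self.value: Optional[float] = None

    def update(self, latency_ms: Optional[float], success: bool) -> Optional[float]:
        # Failed steps and connection errors (no latency sample) leave the
        # baseline untouched, so they cannot skew later comparisons.
        if not success or latency_ms is None:
            return self.value
        if self.value is None:
            self.value = latency_ms  # first sample seeds the baseline
        else:
            self.value = self.alpha * latency_ms + (1 - self.alpha) * self.value
        return self.value


def vulnerability_bonus(already_found: set, label: str) -> float:
    """Pay the +0.10 bonus only the first time a label is confirmed."""
    if label in already_found:
        return 0.0
    already_found.add(label)
    return 0.10
```

Keeping failed requests out of the average matters because a timed-out call would otherwise inflate the baseline and hide later latency amplification.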

Setup Instructions

Docker (build and run)

docker build -t statestrike .
docker run -p 7860:7860 statestrike

Local Python

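python -m pip install -r requirements.txt
cp .env.example .env
# Terminal 1: start the honeypot API
uvicorn honeypot.app:app --host 0.0.0.0 --port 8000
# Terminal 2: start the OpenEnv server, pointed at the honeypot
HONEYPOT_URL=http://localhost:8000 uvicorn statestrike_env.environment:app --host 0.0.0.0 --port 7860
# Terminal 3: run the LLM agent
python inference.py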
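Because both servers start asynchronously, it can help to wait for them before launching the agent. The helper below is a hypothetical convenience script, not part of the repository. It assumes both services answer `GET /health` with HTTP 200 on the ports used above, and it uses only the standard library.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 30.0,
    interval_s: float = 0.5,
) -> bool:
    """Call probe repeatedly until it succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example usage (assumed URLs, matching the ports in the commands above):
#   wait_until_healthy(lambda: http_probe("http://localhost:8000/health"))
#   wait_until_healthy(lambda: http_probe("http://localhost:7860/health"))
```

Passing the probe in as a callable keeps the retry loop independent of the HTTP details, which also makes it straightforward to exercise without a running server.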

HF Space URL

Set this to your deployed environment Space URL:

https://sh4shv4t-statestrike-env.hf.space

Baseline Scores

Task	Baseline Score	Model
endpoint_discovery	0.600	Qwen/Qwen2.5-72B-Instruct
vulnerability_probe	0.400	Qwen/Qwen2.5-72B-Instruct
exploit_chain	0.000	Qwen/Qwen2.5-72B-Instruct

OpenEnv Compliance Checklist

Real-world task framing (security audit)
Typed Pydantic action/observation/state models
reset(), step(), state(), close() implemented
Three graded tasks (easy, medium, hard)
Graders produce normalized scores in [0.0, 1.0]
Partial-progress reward shaping
Root inference.py with [START]/[STEP]/[END] format
Root openenv.yaml manifest
Single-container Docker runtime with /health and /reset
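To make the `reset()`, `step()`, `state()`, `close()` item concrete, here is a toy episode loop. Everything in it is illustrative: `ToyEnv` is a stand-in and not the StateStrike environment, and the `(observation, reward, done)` return shape of `step()` is an assumption rather than the documented OpenEnv signature.

```python
from typing import Any, Callable


class ToyEnv:
    """Stand-in environment: each distinct endpoint probed counts as progress."""

    ENDPOINTS = [
        "POST /users",
        "GET /users/{id}",
        "POST /orders",
        "GET /orders",
        "GET /health",
    ]

    def __init__(self, max_steps: int = 20) -> None:
        self.max_steps = max_steps
        self.closed = False
        self._step = 0
        self._found: list[str] = []

    def reset(self) -> dict[str, Any]:
        self._step = 0
        self._found = []
        return self.state()

    def step(self, action: dict[str, Any]) -> tuple[dict[str, Any], float, bool]:
        self._step += 1
        endpoint = action["endpoint"]
        reward = 0.0
        if endpoint in self.ENDPOINTS and endpoint not in self._found:
            self._found.append(endpoint)
            reward = 0.05  # new-endpoint bonus from the reward section
        done = len(self._found) == len(self.ENDPOINTS) or self._step >= self.max_steps
        return self.state(), reward, done

    def state(self) -> dict[str, Any]:
        return {
            "step": self._step,
            "endpoints_discovered": list(self._found),
            "task_progress": len(self._found) / len(self.ENDPOINTS),
        }

    def close(self) -> None:
        self.closed = True


def sweep_policy(obs: dict[str, Any]) -> dict[str, Any]:
    """Pick the first endpoint that has not been discovered yet."""
    remaining = [e for e in ToyEnv.ENDPOINTS if e not in obs["endpoints_discovered"]]
    return {"endpoint": remaining[0] if remaining else "GET /health"}


def run_episode(env: ToyEnv, policy: Callable[[dict[str, Any]], dict[str, Any]]) -> float:
    """Run one episode and return the summed reward; always closes the env."""
    total = 0.0
    obs = env.reset()
    try:
        done = False
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
    finally:
        env.close()
    return total
```

Wrapping the loop in `try`/`finally` ensures `close()` runs even if a step raises, which mirrors how a real agent should release the environment.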

Architecture Diagram

+-------------------------------+
| HF Space Container            |
|  +-------------------------+  |
|  | Honeypot API :8000      |  |
|  +-------------------------+  |
|  | OpenEnv Server :7860    |  |
|  | /reset /step /state     |  |
|  +-------------------------+  |
+---------------+---------------+
                |
                v
       inference.py (LLM agent)

License

MIT