Spaces: iitian/open_env

iitian committed 13 days ago

Commit 547b872 (1 parent: c725cbc)

Standardize API environment variables, update port to 7860, and bump version to 0.2.0

Files changed (6)

DOCUMENTATION.md +7 -7
README.md +10 -5
inference.py +7 -7
openenv.yaml +1 -1
pyproject.toml +1 -1
scripts/baseline_inference.py +1 -1

DOCUMENTATION.md CHANGED

@@ -208,7 +208,7 @@ Every `step()` and `reset()` returns a `CloudObservation`:
 ## 9. API Reference
-Base URL: `http://localhost:8000`
 ### `POST /reset`
 Reset the environment to a specific task.
@@ -286,7 +286,7 @@ Dashboard UI (the web interface).
 ## 10. Dashboard UI
-The application includes a premium dark-mode cybersecurity dashboard accessible at `http://localhost:8000`.
 ### Features
 - **Sidebar Task Selector** — Switch between Easy, Medium, and Hard challenges with one click.
@@ -314,7 +314,7 @@ pip install -r requirements.txt
 python -m server.app
 # Open in browser
-open http://localhost:8000
 ```
 ### Running the Baseline Agent
@@ -329,7 +329,7 @@ python scripts/baseline_inference.py
 docker build -t cloud-security-auditor .
 # Run the container
-docker run -p 8000:8000 cloud-security-auditor
 ```
 ### Hugging Face Spaces Deployment
@@ -357,14 +357,14 @@ docker run -p 8000:8000 cloud-security-auditor
 ```yaml
 name: cloud-security-auditor
-version: "0.1.0"
 description: "A real-world cloud security audit environment for AI agents."
 hardware:
   tier: "cpu-small"
   vCPU: 2
   RAM: 4Gi
-port: 8000
-entrypoint: "uvicorn server.app:app --host 0.0.0.0 --port 8000"
 tags:
   - security
   - cloud

 ## 9. API Reference
+Base URL: `http://localhost:7860`
 ### `POST /reset`
 Reset the environment to a specific task.
 ## 10. Dashboard UI
+The application includes a premium dark-mode cybersecurity dashboard accessible at `http://localhost:7860`.
 ### Features
 - **Sidebar Task Selector** — Switch between Easy, Medium, and Hard challenges with one click.
 python -m server.app
 # Open in browser
+open http://localhost:7860
 ```
 ### Running the Baseline Agent
 docker build -t cloud-security-auditor .
 # Run the container
+docker run -p 7860:7860 cloud-security-auditor
 ```
 ### Hugging Face Spaces Deployment
 ```yaml
 name: cloud-security-auditor
+version: "0.2.0"
 description: "A real-world cloud security audit environment for AI agents."
 hardware:
   tier: "cpu-small"
   vCPU: 2
   RAM: 4Gi
+port: 7860
+entrypoint: "uvicorn server.app:app --host 0.0.0.0 --port 7860"
 tags:
   - security
   - cloud
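
Since this hunk changes the port in two places (`port:` and the `--port` flag inside `entrypoint:`), a small consistency check can catch the case where only one of them is updated. The following is a minimal sketch, not part of the commit: the helper name `ports_agree` is hypothetical, and it assumes both keys sit at the top level of the config text exactly as shown in the hunk above.

```python
import re


def ports_agree(config_text: str) -> bool:
    """Return True if the top-level `port:` equals the `--port` flag in `entrypoint:`."""
    # Hypothetical helper: plain regex matching, no YAML parser required.
    port = re.search(r"^port:\s*(\d+)\s*$", config_text, re.MULTILINE)
    flag = re.search(r"^entrypoint:.*--port[ =](\d+)", config_text, re.MULTILINE)
    if port is None or flag is None:
        raise ValueError("config must define both `port:` and an entrypoint `--port` flag")
    return port.group(1) == flag.group(1)


# The two lines as they appear on the new side of the diff.
NEW_CONFIG = (
    "port: 7860\n"
    'entrypoint: "uvicorn server.app:app --host 0.0.0.0 --port 7860"\n'
)

# A deliberately inconsistent variant, for illustration only.
STALE_CONFIG = (
    "port: 8000\n"
    'entrypoint: "uvicorn server.app:app --host 0.0.0.0 --port 7860"\n'
)

new_ok = ports_agree(NEW_CONFIG)
stale_ok = ports_agree(STALE_CONFIG)
```

Here `new_ok` is true and `stale_ok` is false. A regex keeps the sketch dependency-free; a real check would more likely parse the YAML properly.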

README.md CHANGED

@@ -9,7 +9,7 @@ pinned: false
 license: apache-2.0
 ---
-# 🛡️ CloudSecurityAuditor OpenEnv
 **CloudSecurityAuditor** is a high-fidelity, standardized AI agent environment designed to simulate real-world cloud security audit scenarios. Built upon the **OpenEnv** specification, it provides a safe, reproducible sandbox where autonomous agents can practice identifying, analyzing, and remediating critical security vulnerabilities in a mock cloud infrastructure.
@@ -54,8 +54,8 @@ This environment is specifically engineered for benchmarking LLM-based security
 If you are running this in a **Hugging Face Space**:
 1.  **Examine the API**: The environment is hosted as a FastAPI server. Use the `/ui` endpoint for a visual dashboard.
-2.  **Inference**: Run the `inference.py` script locally, pointing the `ENV_URL` to your Space's URL.
-3.  **Evaluate**: The system will emit standardized logs for automated leaderboard tracking.
 ## 🐳 Local Deployment
@@ -63,10 +63,15 @@ If you are running this in a **Hugging Face Space**:
 # Clone and Install
 pip install -r requirements.txt
-# Run Server
 python -m server.app
-# Run Baseline
 python inference.py
 ```

 license: apache-2.0
 ---
+# 🛡️ CloudSecurityAuditor OpenEnv (v0.2.0)
 **CloudSecurityAuditor** is a high-fidelity, standardized AI agent environment designed to simulate real-world cloud security audit scenarios. Built upon the **OpenEnv** specification, it provides a safe, reproducible sandbox where autonomous agents can practice identifying, analyzing, and remediating critical security vulnerabilities in a mock cloud infrastructure.
 If you are running this in a **Hugging Face Space**:
 1.  **Examine the API**: The environment is hosted as a FastAPI server. Use the `/ui` endpoint for a visual dashboard.
+2.  **Inference (LLM Agent)**: Set `API_BASE_URL` and `API_KEY` (e.g., from LiteLLM proxy) then run `python inference.py`.
+3.  **Evaluate**: The AI agent creates standardized logs for automated evaluation.
 ## 🐳 Local Deployment
 # Clone and Install
 pip install -r requirements.txt
+# Run Server (Default port 7860)
 python -m server.app
+# Run Baseline (Rule-based)
+python scripts/baseline_inference.py
+# Run LLM Agent (Using API_BASE_URL and API_KEY)
+export API_BASE_URL="https://api.openai.com/v1"
+export API_KEY="your-key"
 python inference.py
 ```

inference.py CHANGED

@@ -5,9 +5,9 @@ Uses an LLM (via OpenAI-compatible client) to autonomously solve all 3 security
 Emits structured [START], [STEP], [END] logs for automated evaluation.
 Required environment variables:
-  API_BASE_URL  — The API endpoint for the LLM (e.g., https://openrouter.ai/api/v1)
-  MODEL_NAME    — The model identifier (e.g., openai/gpt-4o-mini)
-  HF_TOKEN      — Your Hugging Face / API key
 """
 import os
@@ -21,9 +21,9 @@ from openai import OpenAI
 # ──────────────────────────────────────────────
 # Configuration from environment variables
 # ──────────────────────────────────────────────
-API_BASE_URL = os.getenv("API_BASE_URL", "https://openrouter.ai/api/v1")
-MODEL_NAME = os.getenv("MODEL_NAME", "openai/gpt-4o-mini")
-HF_TOKEN = os.getenv("HF_TOKEN", "")
 LOCAL_IMAGE_NAME = os.getenv("LOCAL_IMAGE_NAME", "")
 ENV_URL = os.getenv("ENV_URL", "http://localhost:8000")
@@ -32,7 +32,7 @@ BENCHMARK_NAME = "cloud-security-auditor"
 # Initialize OpenAI-compatible client
 client = OpenAI(
     base_url=API_BASE_URL,
-    api_key=HF_TOKEN,
 )
 # ──────────────────────────────────────────────

 Emits structured [START], [STEP], [END] logs for automated evaluation.
 Required environment variables:
+  API_BASE_URL  — The API endpoint for the LLM (e.g., https://api.openai.com/v1)
+  MODEL_NAME    — The model identifier (e.g., gpt-4o-mini)
+  API_KEY       — Your API key for the LLM proxy
 """
 import os
 # ──────────────────────────────────────────────
 # Configuration from environment variables
 # ──────────────────────────────────────────────
+API_BASE_URL = os.environ.get("API_BASE_URL", "https://api.openai.com/v1")
+API_KEY = os.environ.get("API_KEY", "")
+MODEL_NAME = os.environ.get("MODEL_NAME", "gpt-4o-mini")
 LOCAL_IMAGE_NAME = os.getenv("LOCAL_IMAGE_NAME", "")
 ENV_URL = os.getenv("ENV_URL", "http://localhost:8000")
 # Initialize OpenAI-compatible client
 client = OpenAI(
     base_url=API_BASE_URL,
+    api_key=API_KEY,
 )
 # ──────────────────────────────────────────────
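
The new configuration block defaults `API_KEY` to an empty string, so a missing key would probably only surface once the client makes a request. One possible way to fail earlier is sketched below, reusing the variable names and defaults from the hunk above. The `LLMConfig` type and the `load_llm_config` helper are hypothetical and do not appear in the commit.

```python
import os
from typing import Mapping, NamedTuple


class LLMConfig(NamedTuple):
    """Hypothetical container for the three LLM settings read by inference.py."""
    api_base_url: str
    api_key: str
    model_name: str


def load_llm_config(env: Mapping[str, str] = os.environ) -> LLMConfig:
    """Read the LLM settings from `env`, raising early if API_KEY is missing."""
    api_key = env.get("API_KEY", "")
    if not api_key:
        raise RuntimeError("API_KEY is not set; export it before running inference.py")
    return LLMConfig(
        # Defaults mirror the ones on the new side of the diff.
        api_base_url=env.get("API_BASE_URL", "https://api.openai.com/v1"),
        api_key=api_key,
        model_name=env.get("MODEL_NAME", "gpt-4o-mini"),
    )


# Usage with an explicit mapping, so the example does not depend on the real environment.
config = load_llm_config({"API_KEY": "your-key"})
```

Passing the environment as a parameter keeps the helper easy to test. Separately, the unchanged `ENV_URL` default in this hunk still points at `http://localhost:8000`, whereas the rest of the commit moves the server to port 7860, so a local run may need `ENV_URL` set explicitly.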

openenv.yaml CHANGED

@@ -1,5 +1,5 @@
 name: cloud-security-auditor
-version: "0.1.0"
 description: "A real-world cloud security audit environment for AI agents."
 hardware:
   tier: "cpu-small"

 name: cloud-security-auditor
+version: "0.2.0"
 description: "A real-world cloud security audit environment for AI agents."
 hardware:
   tier: "cpu-small"

pyproject.toml CHANGED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "cloud-security-auditor"
-version = "0.1.0"
 description = "A real-world cloud security audit environment for AI agents."
 readme = "README.md"
 requires-python = ">=3.10"

 [project]
 name = "cloud-security-auditor"
+version = "0.2.0"
 description = "A real-world cloud security audit environment for AI agents."
 readme = "README.md"
 requires-python = ">=3.10"

scripts/baseline_inference.py CHANGED

@@ -1,7 +1,7 @@
 import requests
 import json
-BASE_URL = "http://localhost:8000"
 def run_baseline_audit(task_id="easy"):
     print(f"--- Running Baseline for Task: {task_id} ---")

 import requests
 import json
+BASE_URL = "http://localhost:7860"
 def run_baseline_audit(task_id="easy"):
     print(f"--- Running Baseline for Task: {task_id} ---")
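
To illustrate how the updated `BASE_URL` combines with the `POST /reset` endpoint documented in DOCUMENTATION.md, the sketch below builds such a request without sending it. It is not code from the commit: the `build_reset_request` helper is hypothetical, the JSON body shape `{"task_id": ...}` is an assumption (the diff does not show the payload), and it uses the standard library's `urllib` rather than the `requests` import seen in the script.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"  # value from the new side of the diff


def build_reset_request(task_id: str = "easy", base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /reset request for the given task."""
    # Assumed payload shape; the actual field name is not visible in this diff.
    body = json.dumps({"task_id": task_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/reset",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_reset_request("easy")
# To send it against a running server: urllib.request.urlopen(req)
```

Keeping the base URL as a parameter means a later port change only has to be made in one place.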