# β˜οΈπŸ›‘οΈ CloudSecurityAuditor β€” OpenEnv Environment ## Complete Application Documentation --- ## 1. What Is This Application? **CloudSecurityAuditor** is a standardized AI agent environment that simulates real-world cloud security auditing scenarios. It is built using the [OpenEnv](https://github.com/openenv) specification β€” an open standard for creating reproducible, programmable environments where AI agents can be trained, tested, and benchmarked. Think of it as a **virtual cybersecurity lab**: instead of risking real cloud infrastructure, an AI agent (or a human) can interact with a mock cloud environment that contains intentional security vulnerabilities. The agent must discover, analyze, and remediate those vulnerabilities to earn a reward. ### Who Is This For? | Audience | Use Case | |---|---| | **AI Researchers** | Benchmark LLM-based security agents on structured tasks | | **Security Engineers** | Practice cloud audit workflows in a safe sandbox | | **Students** | Learn about S3 public buckets, EC2 security groups, and IAM log analysis | | **Hackathon Participants** | Demonstrate agent-environment interaction for Meta/OpenEnv challenges | --- ## 2. Architecture Overview ``` β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ BROWSER (UI) β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ Sidebar β”‚ β”‚ Resource Gridβ”‚ β”‚ Execution Logβ”‚ β”‚ β”‚ β”‚ (Tasks) β”‚ β”‚ (S3 / EC2) β”‚ β”‚ (Terminal) β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ HTTP (REST) β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ FastAPI Server (app.py) β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ /reset β”‚ β”‚ /step β”‚ β”‚ /state / /docs β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β”‚ β”‚ β”‚ β”‚ β–Ό β–Ό β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ CloudAuditEnv (environment.py) β”‚ β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚ β”‚ β”‚ β”‚ S3 Data β”‚ β”‚EC2 Dataβ”‚ β”‚ Auth Logs β”‚ β”‚ β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ ``` --- ## 3. File Structure ``` scaler/ β”œβ”€β”€ server/ β”‚ β”œβ”€β”€ app.py # FastAPI entry point, static file serving β”‚ β”œβ”€β”€ environment.py # Core environment logic (reset, step, state) β”‚ β”œβ”€β”€ models.py # Pydantic/dataclass models (Action, Observation, State) β”‚ β”œβ”€β”€ tasks.py # Task definitions (Easy, Medium, Hard) β”‚ └── static/ β”‚ β”œβ”€β”€ index.html # Dashboard UI layout β”‚ β”œβ”€β”€ index.css # Dark-mode cybersecurity theme β”‚ └── app.js # Frontend logic & API interaction β”œβ”€β”€ scripts/ β”‚ └── baseline_inference.py # Example agent that solves the Easy task β”œβ”€β”€ openenv.yaml # OpenEnv specification file β”œβ”€β”€ requirements.txt # Python dependencies β”œβ”€β”€ Dockerfile # Docker deployment configuration └── README.md # Quick-start guide ``` --- ## 4. The Environment Engine (`environment.py`) The heart of the application is the `CloudAuditEnv` class. It implements three methods required by the OpenEnv spec: ### `reset(task_id) β†’ Observation` - Reinitializes the mock infrastructure (S3 buckets, EC2 instances, auth logs). - Sets the active task (easy, medium, or hard). - Returns an initial observation with status info. ### `step(action) β†’ Observation` - Accepts a `CloudAction` and executes it against the mock infrastructure. - Returns an updated `CloudObservation` containing discovered resources, details, logs, and a reward signal. - Automatically terminates the episode after 20 steps (truncation). ### `state() β†’ CloudState` - Returns internal metadata: episode ID, step count, task ID, completion status, and cumulative score. --- ## 5. Mock Infrastructure The environment simulates the following cloud resources: ### S3 Buckets (3 total) | ID | Region | Public? | Environment | |---|---|---|---| | `prod-data-001` | us-east-1 | βœ… Yes | prod | | `prod-logs-002` | us-east-1 | ❌ No | prod | | `dev-test-01` | us-west-2 | βœ… Yes | dev | ### EC2 Instances (2 total) | ID | Type | State | Environment | Open Ports | |---|---|---|---|---| | `i-0abcdef1234567890` | t2.micro | running | dev | 22 (SSH), **3389 (RDP)** ⚠️ | | `i-0987654321fedcba0` | m5.large | running | prod | 443 (HTTPS) | ### Auth Logs (`auth-logs`) | Timestamp | User | Action | IP | |---|---|---|---| | 2026-04-05T10:00:00Z | admin | Login | 1.1.1.1 | | 2026-04-05T10:15:00Z | iam-role-01 | **DeleteStorage** ⚠️ | **192.168.1.50** | | 2026-04-05T10:30:00Z | user-02 | ListBuckets | 2.2.2.2 | --- ## 6. Action Space The agent interacts with the environment using a `CloudAction` object. Available action types: | Action | Parameters | Description | |---|---|---| | `list` | `resource_type` (s3, ec2) | Lists all resources of a given type | | `describe` | `resource_id` | Returns full details for a specific resource | | `modify` | `resource_id`, `patch` | Updates resource configuration (e.g., security group rules) | | `logs` | `resource_id` (e.g., auth-logs) | Fetches log entries for a service | | `submit` | `answer` | Submits the final answer for grading | ### Example Actions (via Dashboard or API) ```bash # List all S3 buckets list s3 # Describe an EC2 instance describe i-0abcdef1234567890 # Fetch authentication logs logs auth-logs # Submit an answer for Easy task submit prod-data-001 # Submit an answer for Hard task submit 192.168.1.50 ``` --- ## 7. Observation Space Every `step()` and `reset()` returns a `CloudObservation`: | Field | Type | Description | |---|---|---| | `resources` | `List[Dict]` | List of discovered resource records | | `details` | `Dict` | Full metadata for a single described resource | | `logs` | `List[Dict]` | Log entries (timestamp, user, action, IP) | | `status` | `str` | Human-readable status message | | `info` | `str` | Additional context (e.g., grading feedback) | | `reward` | `float` | Scalar reward (0.0 to 1.0) | | `done` | `bool` | Whether the episode has ended | --- ## 8. Tasks & Grading ### Task 1: Easy β€” S3 Public Audit **Goal:** Identify all S3 buckets that are both `public: true` AND tagged `env: prod`. | Step | Action | Expected Result | |---|---|---| | 1 | `list s3` | Returns 3 buckets | | 2 | Filter for public + prod | `prod-data-001` | | 3 | `submit prod-data-001` | Reward: **1.0** βœ… | --- ### Task 2: Medium β€” EC2 Security Patch **Goal:** Find EC2 instance `i-0abcdef1234567890` which has port 3389 (RDP) open to `0.0.0.0/0`, and close it by modifying the security group to only allow port 22. | Step | Action | Expected Result | |---|---|---| | 1 | `list ec2` | Returns 2 instances | | 2 | `describe i-0abcdef1234567890` | Shows RDP port open | | 3 | `modify i-0abcdef1234567890` with patch `{"rules": [{"port": 22, "cidr": "0.0.0.0/0"}]}` | Reward: **1.0** βœ… | --- ### Task 3: Hard β€” IAM Log Forensic **Goal:** A rogue IAM role (`iam-role-01`) has performed unauthorized actions. Analyze the `auth-logs` to identify the IP address that performed `DeleteStorage`. | Step | Action | Expected Result | |---|---|---| | 1 | `logs auth-logs` | Returns 3 log entries | | 2 | Find `DeleteStorage` action | IP: `192.168.1.50` | | 3 | `submit 192.168.1.50` | Reward: **1.0** βœ… | --- ## 9. API Reference Base URL: `http://localhost:7860` ### `POST /reset` Reset the environment to a specific task. **Request:** ```json { "task_id": "easy" } ``` **Response:** ```json { "observation": { "resources": null, "details": null, "status": null, "logs": null, "info": "Environment reset. Task: easy" }, "reward": 0.0, "done": false } ``` ### `POST /step` Execute an action in the environment. **Request:** ```json { "action": { "action": "list", "resource_type": "s3" } } ``` **Response:** ```json { "observation": { "resources": [ { "id": "prod-data-001", "region": "us-east-1", "public": true, "tags": { "env": "prod" } }, { "id": "prod-logs-002", "region": "us-east-1", "public": false, "tags": { "env": "prod" } }, { "id": "dev-test-01", "region": "us-west-2", "public": true, "tags": { "env": "dev" } } ], "status": "Listed 3 s3 resources." }, "reward": 0.0, "done": false } ``` ### `GET /state` Get internal environment state. **Response:** ```json { "episode_id": "a1b2c3d4-...", "step_count": 3, "task_id": "easy", "is_completed": false, "score": 0.0 } ``` ### `GET /docs` Interactive Swagger UI for API exploration. ### `GET /` Dashboard UI (the web interface). --- ## 10. Dashboard UI The application includes a premium dark-mode cybersecurity dashboard accessible at `http://localhost:7860`. ### Features - **Sidebar Task Selector** β€” Switch between Easy, Medium, and Hard challenges with one click. - **Infrastructure Overview** β€” Visual resource cards for S3 buckets and EC2 instances. Vulnerable resources are highlighted with red borders and blinking status dots. - **Execution Log** β€” Terminal-style console showing timestamped action logs with color-coded entries (blue for actions, green for system, yellow for rewards, red for errors). - **Manual Command Input** β€” Type commands like `list s3`, `describe i-0abcdef1234567890`, `logs auth-logs`, or `submit prod-data-001` directly in the dashboard. - **Live Stats HUD** β€” Displays current task name, cumulative reward, and environment status (Active/Completed). ### Design - **Theme:** Cyber-noir dark mode with deep navy background (#0a0e14) - **Accents:** Neon cyan (#00f5ff) for primary elements - **Typography:** Inter (body), Outfit (headings), JetBrains Mono (code/logs) - **Effects:** Glassmorphism panels, fade-in card animations, pulsing vulnerability indicators --- ## 11. Running the Application ### Local Development ```bash # Install dependencies pip install -r requirements.txt # Start the server python -m server.app # Open in browser open http://localhost:7860 ``` ### Running the Baseline Agent ```bash # Solves the Easy task automatically python scripts/baseline_inference.py ``` ### Docker Deployment ```bash # Build the image docker build -t cloud-security-auditor . # Run the container docker run -p 7860:7860 cloud-security-auditor ``` ### Hugging Face Spaces Deployment 1. Create a new Space on Hugging Face. 2. Select **Docker** as the SDK. 3. Upload the repository contents (including `openenv.yaml` and `Dockerfile`). 4. The entrypoint is automatically set via `openenv.yaml`. --- ## 12. Technology Stack | Component | Technology | |---|---| | Backend | Python 3.10, FastAPI, Uvicorn | | Environment | openenv-core β‰₯ 0.1.1 | | Data Models | Python dataclasses | | Frontend | Vanilla HTML/CSS/JS | | Fonts | Google Fonts (Inter, Outfit, JetBrains Mono) | | Deployment | Docker, Hugging Face Spaces | --- ## 13. OpenEnv Specification (`openenv.yaml`) ```yaml name: cloud-security-auditor version: "0.2.0" description: "A real-world cloud security audit environment for AI agents." hardware: tier: "cpu-small" vCPU: 2 RAM: 4Gi port: 7860 entrypoint: "uvicorn server.app:app --host 0.0.0.0 --port 7860" tags: - security - cloud - task-based evaluation: tasks: - id: "easy" name: "S3 Public Audit" difficulty: "easy" - id: "medium" name: "EC2 Security Patch" difficulty: "medium" - id: "hard" name: "IAM Log Forensic" difficulty: "hard" ``` --- ## 14. Extending the Environment ### Adding a New Task 1. Add the task definition to `server/tasks.py`. 2. Add the corresponding mock data to `_initialize_state()` in `environment.py`. 3. Add the grading logic to the `step()` method under `CloudActionType.SUBMIT`. 4. Add a new task button to `index.html` in the sidebar. ### Adding a New Resource Type 1. Add the resource data to `self.resources` in `environment.py`. 2. Add a handler for `CloudActionType.LIST` and `CloudActionType.DESCRIBE` for the new type. 3. Update `detectResourceType()` in `app.js` to render the correct card icon/label. --- *Built for the Meta Hackathon / OpenEnv Challenge β€’ April 2026*