title: Cloud Security Auditor
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0

🛡️ CloudSecurityAuditor OpenEnv (v0.2.0)

CloudSecurityAuditor is a high-fidelity, standardized AI agent environment designed to simulate real-world cloud security audit scenarios. Built upon the OpenEnv specification, it provides a safe, reproducible sandbox where autonomous agents can practice identifying, analyzing, and remediating critical security vulnerabilities in a mock cloud infrastructure.

This environment is specifically engineered for benchmarking LLM-based security agents, offering a structured API and deterministic evaluation metrics.

🌟 Key Features

  • Standardized API: Fully compliant with the openenv-core specification, featuring Gymnasium-style step(), reset(), and state() methods.
  • Realistic Cloud Mocking: Simulates S3 bucket configurations, EC2 security groups, and IAM audit logs with high precision.
  • Multi-Tiered Evaluation:
    • Easy (Audit): Focuses on information gathering and resource tagging.
    • Medium (Remediation): Requires active patching and configuration changes.
    • Hard (Forensics): Demands log analysis and pattern matching to identify rogue actors.
  • Typed Observations: Robust Pydantic-based action and observation models ensure reliable agent-environment interactions.
  • Automated Grading: Scalar reward functions (0.0 to 1.0) provide immediate, granular feedback on agent performance.
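
To make the step(), reset(), and state() contract concrete, here is a minimal sketch of an episode loop. `ToyAuditEnv`, its dictionary-based actions, the `(observation, reward, done)` return shape, and the overlap-based reward are all hypothetical stand-ins for illustration; they are not the project's actual classes or grading logic.

```python
from typing import Any, Dict, Tuple


class ToyAuditEnv:
    """Hypothetical stand-in for the real environment (illustration only)."""

    def __init__(self) -> None:
        # Mock inventory: bucket name -> whether it is publicly readable.
        self._buckets: Dict[str, bool] = {
            "prod-logs": True,
            "prod-assets": False,
            "dev-scratch": True,
        }
        self._steps = 0
        self._done = False

    def reset(self) -> Dict[str, Any]:
        """Start a fresh episode and return the initial observation."""
        self._steps = 0
        self._done = False
        return {"status": "ready"}

    def state(self) -> Dict[str, Any]:
        """Expose episode bookkeeping without advancing the episode."""
        return {"steps": self._steps, "done": self._done}

    def step(self, action: Dict[str, Any]) -> Tuple[Dict[str, Any], float, bool]:
        """Apply one action and return (observation, reward, done)."""
        if self._done:
            raise RuntimeError("Episode finished; call reset() first.")
        self._steps += 1
        kind = action.get("action")
        if kind == "list":
            return {"resources": sorted(self._buckets)}, 0.0, False
        if kind == "submit":
            expected = {
                name
                for name, public in self._buckets.items()
                if public and name.startswith("prod")
            }
            answer = set(action.get("answer", []))
            # Overlap ratio (intersection over union) stays inside [0.0, 1.0].
            union = expected | answer
            reward = len(expected & answer) / len(union) if union else 1.0
            self._done = True
            return {"status": "submitted"}, reward, True
        return {"status": f"unsupported action: {kind!r}"}, 0.0, False


env = ToyAuditEnv()
env.reset()
obs, reward, done = env.step({"action": "list"})
obs, reward, done = env.step({"action": "submit", "answer": ["prod-logs"]})
```

An agent would normally call step() in a loop until `done` is true, then read the final reward as its score for the task.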

🛠 Action & Observation Space

Actions

  • list: Inventory resources (s3, ec2).
  • describe: Deep-dive into resource metadata.
  • modify: Apply security patches and rule updates.
  • logs: Extract forensic evidence from authentication logs.
  • submit: Finalize the task with a structured answer.

Observations

  • resources: Comprehensive resource records.
  • details: Metadata for specific entities.
  • logs: Event-based log entries.
  • status: Execution status and helper messages.
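
The action and observation fields above could be modeled roughly as follows. This sketch uses standard-library dataclasses as a dependency-free stand-in for the Pydantic models the project describes; the field names `resource_type`, `resource_id`, and `payload`, and the container types, are assumptions rather than the real schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

# The five action names listed in this README.
ALLOWED_ACTIONS = ("list", "describe", "modify", "logs", "submit")


@dataclass(frozen=True)
class Action:
    """Hypothetical action model; the real one is Pydantic-based."""

    action: str
    resource_type: Optional[str] = None  # e.g. "s3" or "ec2" (assumed field)
    resource_id: Optional[str] = None    # assumed field
    payload: Dict[str, Any] = field(default_factory=dict)  # assumed field

    def __post_init__(self) -> None:
        # Reject unknown action names early, as a typed model would.
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action {self.action!r}")


@dataclass
class Observation:
    """Hypothetical observation model mirroring the four fields above."""

    resources: List[Dict[str, Any]] = field(default_factory=list)
    details: Dict[str, Any] = field(default_factory=dict)
    logs: List[Dict[str, Any]] = field(default_factory=list)
    status: str = ""


list_buckets = Action(action="list", resource_type="s3")
empty_obs = Observation(status="ok")
```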

📊 Available Tasks

| ID | Name | Objective | Difficulty |
|---|---|---|---|
| easy | S3 Public Audit | Identify public 'prod' buckets. | Auditing |
| medium | EC2 Security Patch | Remediate open RDP ports (3389). | Remediation |
| hard | IAM Log Forensic | Trace 'DeleteStorage' actions in logs. | Forensics |
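
As a rough illustration of what each task asks an agent to find, the checks below run against small mock records. The record layouts (`tags`, `rules`, `cidr`, `user`) and the rule that a "prod" bucket is one tagged `env=prod` are assumptions made for this sketch, not the environment's actual data format or grader.

```python
from typing import Any, Dict, List


def public_prod_buckets(buckets: List[Dict[str, Any]]) -> List[str]:
    """Easy task: names of buckets that are public and tagged env=prod."""
    return sorted(
        b["name"]
        for b in buckets
        if b.get("public") and b.get("tags", {}).get("env") == "prod"
    )


def open_rdp_groups(groups: List[Dict[str, Any]]) -> List[str]:
    """Medium task: security groups exposing port 3389 to any address."""
    return sorted(
        g["id"]
        for g in groups
        if any(r["port"] == 3389 and r["cidr"] == "0.0.0.0/0" for r in g["rules"])
    )


def delete_storage_actors(logs: List[Dict[str, Any]]) -> List[str]:
    """Hard task: distinct users who issued a 'DeleteStorage' action."""
    return sorted({e["user"] for e in logs if e["action"] == "DeleteStorage"})


# Mock data, invented for this example.
buckets = [
    {"name": "prod-assets", "public": True, "tags": {"env": "prod"}},
    {"name": "prod-backups", "public": False, "tags": {"env": "prod"}},
    {"name": "dev-scratch", "public": True, "tags": {"env": "dev"}},
]
groups = [
    {"id": "sg-001", "rules": [{"port": 3389, "cidr": "0.0.0.0/0"}]},
    {"id": "sg-002", "rules": [{"port": 443, "cidr": "0.0.0.0/0"}]},
    {"id": "sg-003", "rules": [{"port": 3389, "cidr": "10.0.0.0/8"}]},
]
logs = [
    {"user": "alice", "action": "ListStorage"},
    {"user": "mallory", "action": "DeleteStorage"},
    {"user": "mallory", "action": "DeleteStorage"},
    {"user": "bob", "action": "Login"},
]

easy_answer = public_prod_buckets(buckets)
medium_answer = open_rdp_groups(groups)
hard_answer = delete_storage_actors(logs)
```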

🚀 Quick Start (Hugging Face)

If you are running this in a Hugging Face Space:

  1. Examine the API: The environment is hosted as a FastAPI server. Use the /ui endpoint for a visual dashboard.
  2. Inference (LLM Agent): Set API_BASE_URL and API_KEY (e.g., from a LiteLLM proxy), then run python inference.py.
  3. Evaluate: The AI agent creates standardized logs for automated evaluation.
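
For scripted access to the FastAPI server, requests might be prepared along these lines. Only the port (7860) comes from this README; the `/reset` and `/step` routes, the `task_id` field, and the payload shape are assumptions, so check the server's actual routes before relying on them. The sketch builds the requests without sending them.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:7860"  # port from this README; host is an assumption


def build_post(path: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for the environment server."""
    return Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical routes and payloads, for illustration only.
reset_req = build_post("/reset", {"task_id": "easy"})
step_req = build_post("/step", {"action": "list", "resource_type": "s3"})
```

Once the server is running, a request built this way can be sent with `urllib.request.urlopen(step_req)`.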

🐳 Local Deployment

# Install dependencies (run from the root of the cloned repository)
pip install -r requirements.txt

# Run Server (Default port 7860)
python -m server.app

# Run Baseline (Rule-based)
python scripts/baseline_inference.py

# Run LLM Agent (Using API_BASE_URL and API_KEY)
export API_BASE_URL="https://api.openai.com/v1"
export API_KEY="your-key"
python inference.py
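
A script that depends on these two variables can fail fast when either is missing. The helper below is a sketch of that idea, not the contents of inference.py; the function name `load_api_settings` is hypothetical.

```python
import os
from typing import Mapping, Tuple


def load_api_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read API_BASE_URL and API_KEY, raising if either is unset or empty."""
    base_url = env.get("API_BASE_URL", "").rstrip("/")
    api_key = env.get("API_KEY", "")
    missing = [
        name
        for name, value in (("API_BASE_URL", base_url), ("API_KEY", api_key))
        if not value
    ]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return base_url, api_key


# Example with an explicit mapping instead of the real process environment.
settings = load_api_settings(
    {"API_BASE_URL": "https://api.openai.com/v1/", "API_KEY": "your-key"}
)
```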

Built with ❤️ for the AI Security community.