---
title: Cloud Security Auditor
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
---
🛡️ CloudSecurityAuditor OpenEnv (v0.2.0)
CloudSecurityAuditor is a high-fidelity, standardized AI agent environment designed to simulate real-world cloud security audit scenarios. Built upon the OpenEnv specification, it provides a safe, reproducible sandbox where autonomous agents can practice identifying, analyzing, and remediating critical security vulnerabilities in a mock cloud infrastructure.
This environment is specifically engineered for benchmarking LLM-based security agents, offering a structured API and deterministic evaluation metrics.
Key Features
- Standardized API: Fully compliant with the `openenv-core` specification, featuring Gymnasium-style `step()`, `reset()`, and `state()` methods.
- Realistic Cloud Mocking: Simulates S3 bucket configurations, EC2 security groups, and IAM audit logs with high precision.
- Multi-Tiered Evaluation:
  - Easy (Audit): Focuses on information gathering and resource tagging.
  - Medium (Remediation): Requires active patching and configuration changes.
  - Hard (Forensics): Demands log analysis and pattern matching to identify rogue actors.
- Typed Observations: Robust Pydantic-based action and observation models ensure reliable agent-environment interactions.
- Automated Grading: Scalar reward functions (0.0 to 1.0) provide immediate, granular feedback on agent performance.
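
The `step()` / `reset()` / `state()` cycle can be illustrated with a toy stand-in. The sketch below is not the project's implementation: `ToyAuditEnv`, its bucket records, the dictionary-shaped actions, and the simplified `(observation, reward, done)` return value are all assumptions, chosen only to show how an agent loop might drive a Gymnasium-style interface and receive a scalar reward in the 0.0 to 1.0 range.

```python
from typing import Any


class ToyAuditEnv:
    """Hypothetical in-memory stand-in, not the real CloudSecurityAuditor environment."""

    def __init__(self) -> None:
        self._buckets: dict[str, dict[str, Any]] = {}
        self._done = True  # no episode is active until reset() is called

    def reset(self) -> dict[str, Any]:
        # Restore the mock infrastructure to a fixed, reproducible starting point.
        self._buckets = {
            "prod-data": {"public": True, "env": "prod"},
            "prod-logs": {"public": False, "env": "prod"},
            "dev-scratch": {"public": True, "env": "dev"},
        }
        self._done = False
        return {"status": "ready"}

    def state(self) -> dict[str, Any]:
        # Report episode bookkeeping without advancing the episode.
        return {"done": self._done, "bucket_count": len(self._buckets)}

    def step(self, action: dict[str, Any]) -> tuple[dict[str, Any], float, bool]:
        # Simplified return value: (observation, reward, done).
        if self._done:
            return {"status": "episode finished; call reset()"}, 0.0, True
        kind = action.get("action")
        if kind == "list":
            return {"resources": sorted(self._buckets), "status": "ok"}, 0.0, False
        if kind == "submit":
            expected = {
                name for name, b in self._buckets.items()
                if b["public"] and b["env"] == "prod"
            }
            self._done = True
            # Binary reward for brevity; a real grader could award partial credit.
            reward = 1.0 if set(action.get("answer", [])) == expected else 0.0
            return {"status": "submitted"}, reward, True
        return {"status": f"unsupported action: {kind!r}"}, 0.0, False


env = ToyAuditEnv()
env.reset()
observation, reward, done = env.step({"action": "list"})
observation, reward, done = env.step({"action": "submit", "answer": ["prod-data"]})
print(reward, done)
```

Because `reset()` always rebuilds the same records, repeated episodes start from an identical state, which is the property that makes evaluation reproducible.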
Action & Observation Space
Actions
- `list`: Inventory resources (`s3`, `ec2`).
- `describe`: Deep-dive into resource metadata.
- `modify`: Apply security patches and rule updates.
- `logs`: Extract forensic evidence from authentication logs.
- `submit`: Finalize the task with a structured answer.
Observations
- `resources`: Comprehensive resource records.
- `details`: Metadata for specific entities.
- `logs`: Event-based log entries.
- `status`: Execution status and helper messages.
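
To make the shape of these messages concrete, here is a dependency-free sketch of typed action and observation records. The environment itself is described as using Pydantic models; this example swaps in standard-library dataclasses so it runs anywhere. The class names `Action` and `Observation` and the fields `resource_type`, `resource_id`, and `answer` are assumptions for illustration. Only the five action names and the four observation fields come from the lists above.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

# The five action names listed above.
ALLOWED_ACTIONS = ("list", "describe", "modify", "logs", "submit")


@dataclass(frozen=True)
class Action:
    """Hypothetical action record; every field except `action` is assumed."""

    action: str
    resource_type: Optional[str] = None  # e.g. "s3" or "ec2"
    resource_id: Optional[str] = None
    answer: Optional[Any] = None  # payload for a final `submit`

    def __post_init__(self) -> None:
        # Reject unknown action names early, as a validating model would.
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")


@dataclass
class Observation:
    """Hypothetical observation record mirroring the four fields above."""

    resources: list[dict[str, Any]] = field(default_factory=list)
    details: dict[str, Any] = field(default_factory=dict)
    logs: list[dict[str, Any]] = field(default_factory=list)
    status: str = ""


inventory_request = Action(action="list", resource_type="s3")
empty_reply = Observation(status="ok")
```

Validating the action name at construction time means a malformed agent output fails before it reaches the environment, which keeps error handling in one place.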
Available Tasks
| ID | Name | Objective | Difficulty |
|---|---|---|---|
| `easy` | S3 Public Audit | Identify public 'prod' buckets. | Auditing |
| `medium` | EC2 Security Patch | Remediate open RDP ports (3389). | Remediation |
| `hard` | IAM Log Forensic | Trace 'DeleteStorage' actions in logs. | Forensics |
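
The three objectives reduce to simple checks over resource records. The sketch below shows one plausible form of each check. The record layouts (`name`, `public`, `env`, `id`, `rules`, `port`, `cidr`, `user`, `event`) and the function names are hypothetical, as is the choice to treat `0.0.0.0/0` as "open"; the environment's actual schemas may differ.

```python
from typing import Any

Record = dict[str, Any]


def public_prod_buckets(buckets: list[Record]) -> list[str]:
    # Easy task: public buckets tagged as production (tag name is assumed).
    return sorted(b["name"] for b in buckets if b["public"] and b["env"] == "prod")


def groups_with_open_rdp(groups: list[Record]) -> list[str]:
    # Medium task: security groups exposing port 3389 to any address.
    return sorted(
        g["id"]
        for g in groups
        if any(r["port"] == 3389 and r["cidr"] == "0.0.0.0/0" for r in g["rules"])
    )


def actors_for_event(logs: list[Record], event: str = "DeleteStorage") -> list[str]:
    # Hard task: distinct users who triggered the given event.
    return sorted({entry["user"] for entry in logs if entry["event"] == event})


sample_buckets = [
    {"name": "prod-data", "public": True, "env": "prod"},
    {"name": "dev-scratch", "public": True, "env": "dev"},
]
sample_groups = [
    {"id": "sg-1", "rules": [{"port": 3389, "cidr": "0.0.0.0/0"}]},
    {"id": "sg-2", "rules": [{"port": 443, "cidr": "0.0.0.0/0"}]},
]
sample_logs = [
    {"user": "alice", "event": "ListStorage"},
    {"user": "mallory", "event": "DeleteStorage"},
    {"user": "mallory", "event": "DeleteStorage"},
]
```

Returning sorted lists keeps the results deterministic, which matters when a grader compares an agent's submission against an expected answer.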
Quick Start (Hugging Face)
If you are running this in a Hugging Face Space:
- Examine the API: The environment is hosted as a FastAPI server. Use the `/ui` endpoint for a visual dashboard.
- Inference (LLM Agent): Set `API_BASE_URL` and `API_KEY` (e.g., from a LiteLLM proxy), then run `python inference.py`.
- Evaluate: The AI agent creates standardized logs for automated evaluation.
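
A script that depends on `API_BASE_URL` and `API_KEY` benefits from failing early when either is unset. The helper below is a hypothetical sketch of that check and is not taken from `inference.py`; the function name `load_agent_config` and the returned keys are assumptions.

```python
import os
from typing import Mapping, Optional


def load_agent_config(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Hypothetical helper: collect the two settings named above or fail loudly."""
    source = os.environ if env is None else env
    # Treat unset and empty values the same way.
    missing = [name for name in ("API_BASE_URL", "API_KEY") if not source.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "base_url": source["API_BASE_URL"].rstrip("/"),  # normalize trailing slash
        "api_key": source["API_KEY"],
    }


example = load_agent_config(
    {"API_BASE_URL": "https://api.openai.com/v1/", "API_KEY": "your-key"}
)
```

Passing a mapping explicitly, as in the last line, makes the helper easy to exercise without touching the real process environment.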
🐳 Local Deployment
```bash
# Clone and Install
pip install -r requirements.txt

# Run Server (Default port 7860)
python -m server.app

# Run Baseline (Rule-based)
python scripts/baseline_inference.py

# Run LLM Agent (Using API_BASE_URL and API_KEY)
export API_BASE_URL="https://api.openai.com/v1"
export API_KEY="your-key"
python inference.py
```
Built with ❤️ for the AI Security community.