---
title: HyperBrickCaseOps Agent Guide
---

# HyperBrickCaseOps Agent Guide

This environment evaluates real-world customer support triage. Agents must classify the ticket, request missing info when required, draft the customer reply, add an internal note, and submit only when the workflow is complete.

## Quick Start (Agent Strategy)

Recommended action order:

1. `classify`: set `queue`, `priority`, `issue_type`
2. `request_info` if `required_next_actions` includes it
3. `wait` if the customer follow-up is pending
4. `draft_reply`
5. `add_internal_note`
6. `submit`

## Environment API

The environment follows the standard OpenEnv API:

- `reset()` -> initial observation
- `step(action)` -> next observation, reward, done
- `state()` -> internal state snapshot

Server entrypoint:

- `server.app:app`

## Action Schema

Each step takes a typed `SupportDeskAction`:

- `operation`: `classify|request_info|draft_reply|add_internal_note|submit|wait`
- `queue`: string or null
- `priority`: string or null
- `issue_type`: string or null
- `status`: string or null
- `resolution_code`: string or null
- `requested_fields`: list of strings
- `reply`: string or null
- `internal_note`: string or null

## Observation Highlights

The observation includes:

- `task_id`, `difficulty`, `objective`
- `ticket` (customer, tier, region, business impact)
- `knowledge_base` (policy snippets)
- `case` (current triage state)
- `workflow_stage`, `required_next_actions`, `risk_flags`

## Tasks and Difficulty

There are 4 tasks with increasing difficulty:

- `billing_refund_easy` (easy)
- `account_takeover_medium` (medium)
- `api_incident_hard` (hard)
- `regulated_export_exception_hard` (hard)

## Grading and Reward

- Deterministic graders score task completion
- Final scores are clamped to `(0.01, 0.99)`
- Reward provides dense progress signals across the episode

## Routing Guide (High-Level)

- Duplicate charge -> `billing_ops`, `high`, `duplicate_charge`
- Suspicious login -> `trust_and_safety`, `urgent`, `account_compromise`
- Production 500s -> `platform_engineering`, `urgent`, `production_incident`
- Export policy bypass -> `compliance_ops`, `high`, `regulated_exception`

## Required Environment Variables

Baseline inference uses:

- `API_BASE_URL`
- `MODEL_NAME`
- `HF_TOKEN`

## Mandatory Stdout Format

The inference script must emit exactly:

```
[START] task= env= model=
[STEP] step= action= reward=<0.00> done= error=
[END] success= steps= score= rewards=
```

Rules:

- One `[START]` at episode begin
- One `[STEP]` per env step
- One `[END]` after episode close
- `reward` and `rewards` formatted to 2 decimals
- `done`/`success` are lowercase booleans
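
The stdout rules above could be implemented with small formatting helpers along the following lines. This is only a sketch: the helper names (`fmt_bool`, `start_line`, `step_line`, `end_line`) are hypothetical, and because the format block does not show the value for every field, several choices here are assumptions rather than the required format: printing a missing error as `null`, joining `rewards` with commas, and printing `score` with 2 decimals.

```python
from typing import List, Optional


def fmt_bool(value: bool) -> str:
    """Render a Python bool in lowercase, as the rules require for done/success."""
    return "true" if value else "false"


def start_line(task: str, env: str, model: str) -> str:
    """Build the single [START] line emitted at episode begin."""
    return f"[START] task={task} env={env} model={model}"


def step_line(step: int, action: str, reward: float, done: bool,
              error: Optional[str] = None) -> str:
    """Build one [STEP] line; reward is formatted to 2 decimals."""
    # Assumption: a step without an error prints the literal string "null".
    err = error if error is not None else "null"
    return (f"[STEP] step={step} action={action} reward={reward:.2f} "
            f"done={fmt_bool(done)} error={err}")


def end_line(success: bool, steps: int, score: float,
             rewards: List[float]) -> str:
    """Build the single [END] line emitted after the episode closes."""
    # Assumption: rewards are comma-joined and score uses 2 decimals.
    joined = ",".join(f"{r:.2f}" for r in rewards)
    return (f"[END] success={fmt_bool(success)} steps={steps} "
            f"score={score:.2f} rewards={joined}")


if __name__ == "__main__":
    # "example-env" and "example-model" are placeholder values.
    print(start_line("billing_refund_easy", "example-env", "example-model"))
    print(step_line(1, "classify", 0.25, False))
    print(end_line(True, 1, 0.25, [0.25]))
```

Keeping the formatting in one place makes it easier to guarantee one `[STEP]` line per environment step and a consistent 2-decimal reward, whatever the agent loop looks like.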