| # openenv.yaml | |
| name: Invoice Exception Handler | |
| version: "1.0.0" | |
| description: | | |
| An agent learning environment simulating accounts payable exception handling. | |
| The agent acts as an AP analyst: investigates flagged invoices, applies business | |
| rules, detects fraud signals, makes decisions, and closes cases with an audit trail. | |
| authors: | |
| - name: Moahmmed Yusuf | |
| email: yusufindian09@gmail.com | |
| - name: Nadella Harshith | |
| email: nadellaharshith4@gmail.com | |
| license: MIT | |
| tasks: | |
| - id: task1_price_variance | |
| name: Price Variance Exception | |
| difficulty: easy | |
| description: | | |
| Office stationery invoice arrives 3.08% above PO. Company tolerance policy | |
| allows +/-2% auto-approval. Agent must detect the variance, verify through | |
| the tolerance rule, confirm verbal approval with procurement, and approve | |
| with a PO amendment request. | |
| max_steps: 18 | |
| optimal_score: 1.0 | |
| min_passing_score: 0.60 | |
| - id: task2_duplicate_tax | |
| name: Duplicate Invoice with Tax Error | |
| difficulty: medium | |
| description: | | |
| Logistics supplier submits INV-2024-891, a duplicate of paid INV-2024-819 | |
| (digit transposition: 891 vs 819). Original invoice had wrong GST rate (15% | |
| vs correct 18%), so the company underpaid 3,240 INR. New invoice has correct rate. | |
| Agent must detect the duplicate, identify the tax error in the original, | |
| and partially approve only the 3,240 INR tax correction. | |
| max_steps: 20 | |
| optimal_score: 1.0 | |
| min_passing_score: 0.50 | |
| - id: task3_compound_fraud | |
| name: Compound Fraud Signals | |
| difficulty: hard | |
| description: | | |
| IT equipment supplier invoice with four simultaneous fraud signals: bank | |
| account changed via BEC attack (lookalike email domain), GSTIN belongs to | |
| a different entity, 2 of 15 laptops not yet received, and unit price 8.65% | |
| above PO. Agent must find all signals, use the correct communication channel | |
| (phone, not email — which would contact the fraudster), and escalate to legal | |
| and security. | |
| max_steps: 25 | |
| optimal_score: 1.0 | |
| min_passing_score: 0.40 | |
| observation_space: | |
| type: object | |
| description: EnvironmentState Pydantic model | |
| fields: | |
| task_id: {type: string} | |
| step_number: {type: integer} | |
| case_status: {type: string, enum: [open, in_review, decided, routed, closed]} | |
| purchase_order: {type: object, description: "PO with line items and terms"} | |
| invoice: {type: object, description: "Supplier invoice with line items and tax"} | |
| grn: {type: object, description: "Goods receipt — what actually arrived"} | |
| supplier_master: {type: object, description: "Verified supplier record"} | |
| exception_flag: {type: object, description: "Why the system flagged this invoice"} | |
| inspections: {type: array, description: "Fields the agent has inspected"} | |
| checks_run: {type: array, description: "Validation checks completed"} | |
| queries: {type: array, description: "Internal and supplier queries"} | |
| rules_applied: {type: array, description: "Business rules applied"} | |
| decision: {type: string, nullable: true} | |
| routed_to: {type: array} | |
| available_actions: {type: array} | |
| available_checks: {type: array} | |
| available_rules: {type: array} | |
| knowledge_base: {type: array} | |
| cumulative_reward: {type: number} | |
| action_space: | |
| type: object | |
| description: Action with type and params | |
| actions: | |
| inspect_field: | |
| params: {document: string, field: string} | |
| cross_check: | |
| params: {field: string, doc_a: string, doc_b: string} | |
| run_check: | |
| params: {check_name: string} | |
| query_supplier: | |
| params: {question: string, channel: string} | |
| query_internal: | |
| params: {department: string, question: string} | |
| apply_rule: | |
| params: {rule_id: string} | |
| make_decision: | |
| params: {decision: string, reason: string} | |
| route_to: | |
| params: {team: string, notes: string} | |
| close_case: | |
| params: {summary: string} | |
| reward: | |
| range: [-1.0, 1.0] | |
| description: | | |
| Shaped reward at every step. Relevant inspections: +0.01 to +0.14. | |
| Diagnostics revealing issues: +0.08 to +0.18. Correct fixes: +0.08 to +0.30. | |
| Wrong decision on fraud: -0.15 to -0.40. Repeat actions: -0.02 to -0.05. | |
| SLA breach: -0.10. | |
| grading: | |
| method: task_grader | |
| scores: | |
| - score | |
| - diagnosis_score | |
| - investigation_score | |
| - decision_score | |
| - routing_score | |
| - closure_score | |
| - efficiency_score | |
| api: | |
| reset: | |
| signature: "reset(task_id: str | None = None) -> EnvironmentState" | |
| step: | |
| signature: "step(action: Action | dict) -> StepResult" | |
| state: | |
| signature: "state() -> EnvironmentState" | |
| grade: | |
| signature: "grade() -> Dict[str, float]" | |
| http_endpoints: | |
| - path: /reset | |
| method: POST | |
| description: Reset environment, returns EnvironmentState JSON | |
| - path: /step | |
| method: POST | |
| description: Execute action, returns StepResult JSON | |
| - path: /state | |
| method: GET | |
| description: Current state, returns EnvironmentState JSON | |
| - path: /grade | |
| method: POST | |
| description: Grade current episode | |
| - path: /health | |
| method: GET | |
| description: Health check | |
| dependencies: | |
| python: ">=3.10" | |
| packages: | |
| - pydantic>=2.7 | |
| - fastapi>=0.111 | |
| - uvicorn>=0.29 | |
| - gradio>=4.36 | |
| - openai>=1.35 | |
| - pyyaml>=6.0 | |
| docker: | |
| port: 7860 | |
| health_check: /health | |
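
The `action_space` section lists nine action types, each with a fixed set of string parameters. The sketch below shows one way a client might assemble and sanity-check an action before sending it to `step()` or `POST /step`. The helper name `build_action` and the strict parameter check are illustrative assumptions, not part of the environment. The `{"type": ..., "params": ...}` payload shape is inferred from the "Action with type and params" description and may differ from the actual `Action` model.

```python
from typing import Any

# Parameter names per action type, copied from the action_space section.
ACTION_PARAMS: dict[str, set[str]] = {
    "inspect_field": {"document", "field"},
    "cross_check": {"field", "doc_a", "doc_b"},
    "run_check": {"check_name"},
    "query_supplier": {"question", "channel"},
    "query_internal": {"department", "question"},
    "apply_rule": {"rule_id"},
    "make_decision": {"decision", "reason"},
    "route_to": {"team", "notes"},
    "close_case": {"summary"},
}


def build_action(action_type: str, **params: str) -> dict[str, Any]:
    """Return an action payload, rejecting unknown types or mismatched params.

    Hypothetical helper: the environment itself may validate differently.
    """
    expected = ACTION_PARAMS.get(action_type)
    if expected is None:
        raise ValueError(f"unknown action type: {action_type!r}")
    if set(params) != expected:
        raise ValueError(
            f"{action_type} expects params {sorted(expected)}, got {sorted(params)}"
        )
    # Assumed wire shape: {"type": <action name>, "params": {<name>: <value>}}.
    return {"type": action_type, "params": dict(params)}
```

For example, `build_action("query_supplier", question="Please confirm the bank details on file.", channel="phone")` yields a dict that could be passed to `step()`. The question text here is only a placeholder. Checking parameters on the client side is a design choice that avoids spending one of the limited `max_steps` on a malformed action.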