name: clinical_trial_auditor
version: "3.0.0"
description: >
  A protocol-aware clinical audit benchmark for OpenEnv. The agent acts as a Senior
  Clinical Data Manager and must read an episode-specific protocol excerpt, audit
  tabular patient records against dynamic eligibility and timing rules, and decide
  whether suspicious subgroup outcomes represent actionable control-arm bias or a
  confounded high-risk cohort.
author: Sumit Saraswat
tags:
  - openenv
  - clinical
  - benchmark
  - protocol-reasoning
  - bias-audit
  - ai-safety
tasks:
  - id: task_easy
    name: Dynamic Eligibility Screening
    difficulty: easy
    description: Read the protocol excerpt for the episode and flag patients whose ages violate the protocol-specific eligibility range.
  - id: task_medium
    name: Protocol Timeline Audit
    difficulty: medium
    description: Audit dynamic age eligibility, death-before-treatment errors, and treatment-start window violations with a Stage IV timing exception.
  - id: task_hard
    name: Equity + Protocol Audit
    difficulty: hard
    description: Audit record-level protocol issues and determine whether control-arm bias is genuinely present or only confounded by a high-risk outreach cohort.