name: clinical_trial_auditor
version: "3.0.0"
description: >
  A protocol-aware clinical audit benchmark for OpenEnv. The agent acts as a Senior
  Clinical Data Manager and must read an episode-specific protocol excerpt, audit
  tabular patient records against dynamic eligibility and timing rules, and decide
  whether suspicious subgroup outcomes represent actionable control-arm bias or a
  confounded high-risk cohort.
author: Sumit Saraswat
tags:
  - openenv
  - clinical
  - benchmark
  - protocol-reasoning
  - bias-audit
  - ai-safety
tasks:
  - id: task_easy
    name: Dynamic Eligibility Screening
    difficulty: easy
    description: Read the protocol excerpt for the episode and flag patients whose ages violate the protocol-specific eligibility range.
  - id: task_medium
    name: Protocol Timeline Audit
    difficulty: medium
    description: Audit dynamic age eligibility, death-before-treatment errors, and treatment-start window violations while honoring the Stage IV timing exception.
  - id: task_hard
    name: Equity + Protocol Audit
    difficulty: hard
    description: Audit record-level protocol issues and determine whether control-arm bias is genuinely present or only confounded by a high-risk outreach cohort.