# pharma-vigilance / openenv.yaml
spec_version: 1
name: pharma_vigilance_env
display_name: "Pharmacovigilance Signal Detector"
description: >
  A real-world OpenEnv environment where an AI agent acts as a pharmacovigilance
  analyst. The agent reviews synthetic adverse-event cases, decides whether they
  represent known labeled effects, emerging safety signals, or low-value noise,
  and recommends the correct operational follow-up. Tasks cover known class
  effects, clustered post-marketing signal detection, and confounded
  drug-drug-interaction cases that require causal reasoning rather than surface
  blame assignment.
type: space
runtime: fastapi
app: server.app:app
port: 7860
tags:
  - openenv
  - healthcare
  - pharmacovigilance
  - drug-safety
  - reinforcement-learning
action_space:
  type: structured
  fields:
    - name: classification
      type: string
      values: [new_signal, known_side_effect, noise, duplicate]
      description: "Top-level safety classification chosen by the agent"
    - name: suspect_drug
      type: string
      description: "Drug or drug interaction the agent believes is causally responsible"
    - name: severity_assessment
      type: string
      values: [mild, moderate, severe, critical]
      description: "Agent-assigned clinical severity for the case"
    - name: recommended_action
      type: string
      values: [escalate, log_and_monitor, dismiss, request_more_info]
      description: "Operational pharmacovigilance follow-up decision"
    - name: reasoning
      type: string
      description: "Brief free-text rationale used for partial credit on the hard task"
    - name: confidence
      type: integer
      required: false
      description: "Optional analyst confidence from 0 to 100 used for calibration-aware reward shaping"
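# Illustrative only: a sketch of one action an agent might submit for the
# confounded_hard task, using the fields and enum values declared above.
# The suspect_drug string format, the severity value, the reasoning text, and
# the confidence number are assumptions; this file does not specify them.
#
#   classification: new_signal
#   suspect_drug: "tacrolimus + voriconazole interaction"
#   severity_assessment: severe
#   recommended_action: escalate
#   reasoning: "Voriconazole inhibits CYP3A4 and can raise tacrolimus exposure; the kidney injury fits toxic tacrolimus levels rather than the drug the reporter blamed."
#   confidence: 80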
observation_space:
  type: structured
  fields:
    - name: task_id
      type: string
      description: "Identifier of the current pharmacovigilance task"
    - name: reports
      type: array
      description: "One or more synthetic adverse-event reports included in the case"
    - name: drug_interaction_db
      type: object
      description: "Hardcoded safety and interaction reference data visible to the agent"
    - name: step_number
      type: integer
      description: "Current step index within the episode"
    - name: max_steps
      type: integer
      description: "Maximum number of steps allowed in the episode"
    - name: feedback
      type: string
      required: false
      description: "Human-readable feedback from the previous action"
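# Illustrative only: the top-level shape of an observation built from the
# fields above. The inner schema of each report and of drug_interaction_db is
# not defined in this file, so both are left empty here, and whether
# step_number starts at 0 or 1 is an assumption.
#
#   task_id: known_signal_easy
#   reports: []               # one or more report objects in practice
#   drug_interaction_db: {}   # reference data supplied by the environment
#   step_number: 1
#   max_steps: 2
#   feedback: null            # optional; presumably set after a prior action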
reward:
  min: -0.25
  max: 1.0
  description: >
    Reward is computed over a staged pharmacovigilance decision pipeline:
    classification, causal suspect selection, severity assessment, and
    operational action. The environment then adds a consistency bonus when the
    full chain of sub-decisions is internally coherent, and applies a
    calibration-aware confidence adjustment for well-calibrated or dangerously
    overconfident answers. A false alarm penalty of -0.10 applies when the
    agent escalates a case that is truly noise, and a larger missed-signal
    penalty of -0.20 applies when the agent dismisses a true new signal. The
    hard task can earn an additional +0.05 reasoning bonus when the
    explanation explicitly references the interaction mechanism or therapeutic
    drug monitoring clues. Step-level rewards may dip slightly below zero for
    clearly unsafe or suboptimal behavior, while final grader scores remain
    deterministic and normalized for evaluation.
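# Illustrative arithmetic only. The per-stage weights, the consistency bonus,
# and the confidence adjustment are not quantified in this file, so S below
# stands for whatever those terms sum to. Treating the named penalties and
# bonus as additive, and the result as bounded by min and max, is an
# assumption rather than documented behavior.
#
#   agent escalates a case that is truly noise:    S - 0.10
#   agent dismisses a true new signal:             S - 0.20
#   hard task, reasoning cites the mechanism:      S + 0.05
#   step reward stays within:                      -0.25 <= reward <= 1.0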
difficulties:
  - easy
  - medium
  - hard
max_steps: 2
tasks:
  - id: known_signal_easy
    difficulty: easy
    description: >
      Review a synthetic single-patient report of persistent dry cough after
      starting an ACE inhibitor, for which many similar recent cases already
      exist. The correct behavior is to recognize this as a known labeled effect
      and recommend routine logging and monitoring rather than escalation.
    grader: graders.known_signal_easy_grader
  - id: cluster_signal_medium
    difficulty: medium
    description: >
      Review a clustered set of recent post-marketing reports tied to a newly
      launched cardiovascular therapy. The reports show symptomatic bradycardia
      and near-syncope even though the label carries no rhythm-related warnings.
      The agent should detect an emerging signal and escalate.
    grader: graders.cluster_signal_medium_grader
  - id: confounded_hard
    difficulty: hard
    description: >
      Review a confounded transplant-medicine case in which the reporter blames
      the wrong drug. The correct judgment requires identifying a tacrolimus and
      voriconazole interaction, recognizing acute kidney injury risk from toxic
      exposure, and escalating the case as a clinically serious new signal.
    grader: graders.confounded_hard_grader