mshahidul
You are a clinical NLP expert evaluating subclaim extraction from medical text and EHR notes.
Your task is to assess whether the subclaims extracted by a model accurately represent the clinically meaningful subclaims present in the original text.
Definitions:
- A "subclaim" is an atomic, clinically meaningful statement that conveys a fact, observation, assessment, plan, or causal/temporal relationship.
- Subclaims must preserve medical meaning, including:
  - Negation (e.g., “no evidence of”, “denies”)
  - Uncertainty (e.g., “possible”, “likely”, “rule out”)
  - Temporality (past, current, planned)
  - Attribution (patient-reported vs clinician-assessed)
Do NOT reward:
- Hallucinated medical facts
- Clinically unsafe reinterpretations
- Overgeneralized or vague statements
- Redundant or overlapping subclaims
You will be given:
1. The original medical text or EHR note
2. The list of subclaims extracted by a model
Evaluation Criteria:
1. Clinical Coverage:
- Are all clinically important subclaims in the text extracted?
- Are any key diagnoses, symptoms, medications, procedures, or plans missing?
2. Clinical Precision:
- Are extracted subclaims fully supported by the text?
- Are negation, uncertainty, and qualifiers handled correctly?
3. Granularity:
- Are subclaims atomic and readable?
- Are multiple clinical facts incorrectly merged?
4. Clinical Faithfulness:
- Is the original clinical meaning preserved without distortion?
- Are severity, dosage, timing, or causal relations altered?
5. Redundancy:
- Are there duplicate or semantically overlapping subclaims?
Scoring:
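- Assign each criterion a score from 0 (very poor) to 5 (excellent). Higher is always better: for Redundancy, 5 means no duplicate or overlapping subclaims.
- Provide an overall extraction accuracy score from 0 (very poor) to 100 (excellent).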
Error Analysis:
- List missing clinically important subclaims.
- List incorrect, hallucinated, or unsafe subclaims.
- Suggest corrected subclaims that would be clinically accurate and readable.
Input Medical Text:
<<<TEXT>>>
Model-Extracted Subclaims:
<<<SUBCLAIMS>>>
Output STRICTLY in the following JSON format:
{
  "clinical_coverage_score": number,
  "clinical_precision_score": number,
  "granularity_score": number,
  "clinical_faithfulness_score": number,
  "redundancy_score": number,
  "overall_accuracy": number,
  "missing_subclaims": [string],
  "incorrect_or_unsafe_subclaims": [string],
  "suggested_corrected_subclaims": [string],
  "brief_justification": string
}
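
The following is a minimal Python sketch of how this template might be filled in and how a reply might be checked against the JSON format above. It is not part of the repository: the helper names `render_prompt` and `validate_response` are hypothetical, the one-subclaim-per-line bullet layout is an assumption, and the range checks simply mirror the Scoring section (0 to 5 per criterion, 0 to 100 overall).

```python
import json
import re

# Placeholder tokens and field names are copied from the prompt template.
TEXT_TOKEN = "<<<TEXT>>>"
SUBCLAIMS_TOKEN = "<<<SUBCLAIMS>>>"
SCORE_KEYS = [
    "clinical_coverage_score",
    "clinical_precision_score",
    "granularity_score",
    "clinical_faithfulness_score",
    "redundancy_score",
]
LIST_KEYS = [
    "missing_subclaims",
    "incorrect_or_unsafe_subclaims",
    "suggested_corrected_subclaims",
]


def render_prompt(template: str, text: str, subclaims: list[str]) -> str:
    """Fill both placeholders (hypothetical helper).

    Formatting subclaims as "- item" lines is an assumption, not
    something the template prescribes.
    """
    listing = "\n".join(f"- {s}" for s in subclaims)
    mapping = {TEXT_TOKEN: text, SUBCLAIMS_TOKEN: listing}
    # Single-pass substitution, so a token that happens to occur inside
    # the inserted note or subclaims is never substituted a second time.
    pattern = re.compile("|".join(re.escape(k) for k in mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], template)


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_response(raw: str) -> dict:
    """Parse a reply and check it against the requested JSON format.

    Raises ValueError on malformed JSON, a missing field, a wrong type,
    or a score outside the range given in the Scoring section.
    """
    data = json.loads(raw)  # json.JSONDecodeError subclasses ValueError
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    for key in SCORE_KEYS:
        value = data.get(key)
        if not _is_number(value) or not 0 <= value <= 5:
            raise ValueError(f"{key} must be a number in [0, 5], got {value!r}")
    overall = data.get("overall_accuracy")
    if not _is_number(overall) or not 0 <= overall <= 100:
        raise ValueError(f"overall_accuracy must be in [0, 100], got {overall!r}")
    for key in LIST_KEYS:
        value = data.get(key)
        if not isinstance(value, list) or not all(isinstance(s, str) for s in value):
            raise ValueError(f"{key} must be a list of strings")
    if not isinstance(data.get("brief_justification"), str):
        raise ValueError("brief_justification must be a string")
    return data
```

A caller would pass the rendered prompt to whichever model client is in use, then run `validate_response` on the raw reply before trusting any score. Rejecting out-of-range or wrongly typed fields early keeps malformed judgments out of downstream aggregates.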