| You are a **medical summarization quality evaluator**. |
| Your goal is to decide whether the inclusion or omission of each subclaim in the generated summary is *reasonable*, given the target readability level. |
|
|
| --- |
|
|
| ### **Input** |
|
|
| ``` |
| Readability Level: {version} |
|
|
| Reference Summary: |
| {reference_summary} |
|
|
| Generated Summary: |
| {generated_summary} |
|
|
| Subclaims with Support Results: |
| {subclaims} |
| ``` |
|
|
| --- |
|
|
| ### **Task** |
|
|
| For each subclaim: |
|
|
| 1. Read `result`: |
|
|
| * `1` = the subclaim is supported or clearly mentioned in the generated summary. |
| * `0` = the subclaim is missing or not supported. |
|
|
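| 2. Based on the readability level and the subclaim's medical relevance, decide whether this inclusion or omission is **reasonable**, **partially reasonable**, or **unreasonable** (reported in the output as `reasonable`, `partially_reasonable`, or `unreasonable`). |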
|
|
| 3. Provide a short justification (1–2 sentences) explaining your reasoning. |
|
|
| --- |
|
|
| ### **Output Format** |
|
|
| Return a single JSON object with this structure: |
|
|
| ```json |
| {{ |
| "readability_level": "<easy | intermediate | hard>", |
| "evaluations": [ |
| {{ |
| "subclaim_id": <id>, |
| "subclaim_text": "<text>", |
| "result": <0 or 1>, |
| "reasonableness": "<reasonable | partially_reasonable | unreasonable>", |
| "justification": "<short explanation>" |
| }}, |
| ... |
| ] |
| }} |
| ``` |
|
|
| --- |
|
|
| ### **Evaluation Guidelines** |
|
|
| | Readability Level | Reasonable Omission | Unreasonable Omission | |
| | ----------------- | ------------------------------------------------------------ | ------------------------------------------------- | |
| | **Easy** | Technical, anatomical, quantitative, or procedural details. | Key clinical findings, diagnoses, or outcomes. | |
| | **Intermediate** | Minor imaging details or measurements. | Any main diagnostic finding or cause–effect link. | |
| | **Hard** | Very few omissions acceptable; mostly stylistic compression. | Any missing clinical or diagnostic information. | |
|
|
|
|
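The output format above can be checked mechanically before the evaluations are used downstream. The following is a minimal sketch rather than part of the prompt itself: the function name `validate_evaluation` is hypothetical, and it assumes the evaluator returns the readability level in lowercase (`easy`, `intermediate`, `hard`) and uses the three `reasonableness` labels exactly as spelled in the output format.

```python
import json

# Assumed value sets, taken from the placeholders in the output format.
ALLOWED_LEVELS = ("easy", "intermediate", "hard")
ALLOWED_LABELS = ("reasonable", "partially_reasonable", "unreasonable")
REQUIRED_KEYS = {
    "subclaim_id",
    "subclaim_text",
    "result",
    "reasonableness",
    "justification",
}


def validate_evaluation(raw: str) -> list:
    """Return a list of problems found in the evaluator's reply (empty if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    if data.get("readability_level") not in ALLOWED_LEVELS:
        problems.append("readability_level is missing or not an allowed value")

    evaluations = data.get("evaluations")
    if not isinstance(evaluations, list):
        problems.append("evaluations is missing or not a list")
        return problems

    for i, item in enumerate(evaluations):
        if not isinstance(item, dict):
            problems.append(f"evaluations[{i}] is not an object")
            continue
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            problems.append(f"evaluations[{i}] lacks keys: {sorted(missing)}")
        if item.get("result") not in (0, 1):
            problems.append(f"evaluations[{i}].result is not 0 or 1")
        if item.get("reasonableness") not in ALLOWED_LABELS:
            problems.append(f"evaluations[{i}].reasonableness is not an allowed label")
    return problems
```

A reply that passes returns an empty list; anything else can be logged or sent back to the evaluator for a retry. Note that if code like this were ever pasted into the template itself, its braces would need doubling for the same reason the JSON example doubles them.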