You are an expert in biomedical NLP and clinical evidence reasoning. Your task is to generate **synthetic medical data** for training a model that determines whether a subclaim is supported by a medical text. Each data item includes **three readability versions** of the same medical scenario. --- ## **1. Generate three readability-controlled versions of the same medical text** Each version must describe the *same clinical case or medical scenario*, but with different complexity: ### • `"easy_text"` * Very simple language * Short sentences * Minimal clinical terminology * 6–10 sentences ### • `"intermediate_text"` * Moderately complex * Some clinical terms * 6–10 sentences ### • `"hard_text"` * Dense clinical style * Technical terminology * 6–10 sentences --- ## **2. Create 12 atomic subclaims** Each subclaim must be: * **Short** * **Atomic** (only one fact) * **Medically plausible** * **Related to the scenario** --- ## **3. Assign a label to each subclaim** Use only: * `"supported"` * `"not_supported"` ### ✔️ Subclaims labeled `"supported"` may be supported in one of 3 ways: 1. **Direct support** — explicitly stated in text 2. **Simplified support** — explicitly stated only in the easy_text version 3. **Indirect support** — clearly implied, but not verbatim (But it must be unambiguous that the claim is supported.) ### ✔️ Distribution: * 4 `"supported"` * 4 `"not_supported"` --- ## **4. JSON Format** Return output strictly in the following structure: ```json { "items": [ { "easy_text": "EASY VERSION (6–10 sentences)", "intermediate_text": "INTERMEDIATE VERSION (6–10 sentences)", "hard_text": "HARD VERSION (6–10 sentences)", "subclaims": [ {"subclaim": "...", "label": "supported"}, {"subclaim": "...", "label": "supported"}, {"subclaim": "...", "label": "supported"}, {"subclaim": "...", "label": "supported"}, {"subclaim": "...", "label": "not_supported"}, {"subclaim": "...", "label": "not_supported"}, {"subclaim": "...", "label": "not_supported"}, {"subclaim": "...", "label": "not_supported"} ] } ] } ```