Below is a compact but deepened tech design doc that applies your three constraints:
- Reuse existing ClinicalTrials MCPs.
- Make Parlant workflows map tightly onto real clinical screening.
- Lay out a general patient plan (using synthetic data) that feels like a real-world journey.
No production code; just user flow, data contracts, and architecture, with a few illustrative sketches.
1. Scope & Positioning
PoC Goal (2‑week sprint, YAGNI):
A working, demoable patient‑centric trial-matching copilot that:
- Takes synthetic NSCLC patients (documents + minimal metadata).
- Uses MedGemma 4B multimodal to understand those artifacts.
- Uses Gemini 3 Pro + Parlant to orchestrate patient‑to‑trials matching via an off‑the‑shelf ClinicalTrials MCP server.
- Produces an eligibility ledger + gap analysis aligned with real clinical screening workflows (prescreen → validation), not “toy” UX.
We explicitly don’t build our own trial MCP, own search stack, or multi-service infra. Everything runs in a thin orchestrator + UI process.
2. Real-World Screening Workflow Mapping
Evidence from clinical practice and trial‑matching research converges on a two‑stage flow:
- Prescreening
  - Quick eligibility judgment on a minimal dataset: diagnosis, stage, functional status (ECOG), basic labs, key comorbidities.
  - Usually: oncologist + coordinator + minimal EHR context.
  - Goal: “Is this patient worth deeper chart review for any trials here?”
- Validation (Full Match / Chart Review)
  - Detailed comparison of the full record vs the full inclusion/exclusion criteria, often 40–60 criteria per trial.
  - Typically done by a coordinator/CRA with investigator sign‑off.
  - Goal: for a specific trial, decide: eligible / excluded / unclear → needs further tests.
Our PoC should simulate this two‑stage workflow:
- Stage 1 = “Patient‑First Prescreen” → shortlist trials via MCP + Gemini using MedGemma‑extracted “minimal dataset”.
- Stage 2 = “Trial‑Specific Validation” → trial‑by‑trial, criterion‑by‑criterion ledger using MedGemma evidence.
Parlant Journeys become the explicit codification of these two stages + transitions.
3. High-Level Architecture (YAGNI, Reusing MCP)
3.1 Components
1) UI & Orchestrator (single process)
- A Streamlit/FastAPI-style app (exact stack is secondary) that:
  - Hosts the chat/stepper UI.
  - Embeds Parlant and maintains session state.
  - Calls external tools (Gemini API, MedGemma HF endpoint, ClinicalTrials MCP).
2) Parlant Agent + Journey
- Single Parlant agent, e.g. `patient_trial_copilot`.
- One Journey with explicit stages mirroring the real-world workflow:
  `INGEST → PRESCREEN → VALIDATE_TRIALS → GAP_FOLLOWUP → SUMMARY`.
- Parlant rules enforce:
- When to call which tool.
- When to move from prescreen to validation.
- When to ask the patient (synthetic persona) for more documents.
3) MedGemma 4B Multimodal Service (HF endpoint)
- Input: PDF(s) + optional images.
- Output: structured PatientProfile + evidence spans (doc/page/region references).
- Used twice:
- Once for prescreen dataset extraction.
- Once for criterion‑level validation (patient vs trial snippets).
4) Gemini 3 Pro (LLM Planner & Re‑ranker)
- Uses Google AI / Vertex Gemini 3 Pro for:
- Generating query parameters for ClinicalTrials MCP from PatientProfile.
- Interpreting MCP results & producing ranked TrialCandidate list.
- Orchestrating criterion slicing and gap reasoning.
- Strategy: keep Gemini in tools + structured outputs mode; no direct free-form “actions”.
5) ClinicalTrials MCP Server (Existing)
- Choose an existing ClinicalTrials MCP server rather than hand-rolling: e.g. one of the open-source MCP servers wrapping the ClinicalTrials.gov REST API v2.
- Must support at least:
  - `search_trials(parameters)` → list of (NCT ID, title, conditions, locations, status, phase, eligibility text).
  - `get_trial(nct_id)` → full record including inclusion/exclusion criteria.
3.2 Why Reuse MCP is Critical
- Time: the ClinicalTrials.gov v2 API is detailed and somewhat finicky (paging, filters, field lists). Existing MCPs already encode those details plus JSON schemas.
- Alignment with agentic ecosystems: These MCP servers are already shaped as “tools” for LLMs. We just plug Parlant/Gemini on top.
- YAGNI: custom MCP or RAG index for trials is a post‑PoC optimization.
4. Data Contracts (Core JSON Schemas)
We keep contracts minimal but explicit, so we can test each piece in isolation.
4.1 PatientProfile (v1)
Output of MedGemma’s prescreen extraction; updated as new docs arrive:
```json
{
  "patient_id": "string",
  "source_docs": [
    { "doc_id": "string", "type": "clinic_letter|pathology|lab|imaging", "meta": {} }
  ],
  "demographics": {
    "age": 52,
    "sex": "female"
  },
  "diagnosis": {
    "primary_condition": "Non-Small Cell Lung Cancer",
    "histology": "adenocarcinoma",
    "stage": "IVa",
    "diagnosis_date": "2024-05-15"
  },
  "performance_status": {
    "scale": "ECOG",
    "value": 1,
    "evidence": [{ "doc_id": "clinic_1", "page": 2, "span_id": "s_17" }]
  },
  "biomarkers": [
    {
      "name": "EGFR",
      "result": "Exon 19 deletion",
      "date": "2026-01-10",
      "evidence": [{ "doc_id": "path_egfr", "page": 1, "span_id": "s_3" }]
    }
  ],
  "key_labs": [
    {
      "name": "ANC",
      "value": 1.8,
      "unit": "10^9/L",
      "date": "2026-01-28",
      "evidence": [{ "doc_id": "labs_jan", "page": 1, "span_id": "tbl_anc" }]
    }
  ],
  "treatments": [
    {
      "drug_name": "Pembrolizumab",
      "start_date": "2024-06-01",
      "end_date": "2024-11-30",
      "line": 1,
      "evidence": [{ "doc_id": "clinic_2", "page": 3, "span_id": "s_45" }]
    }
  ],
  "comorbidities": [
    {
      "name": "CKD",
      "grade": "Stage 3",
      "evidence": [{ "doc_id": "clinic_1", "page": 2, "span_id": "s_20" }]
    }
  ],
  "imaging_summary": [
    {
      "modality": "MRI brain",
      "date": "2026-01-20",
      "finding": "Stable 3mm left frontal lesion, no enhancement",
      "interpretation": "likely inactive scar",
      "certainty": "low|medium|high",
      "evidence": [{ "doc_id": "mri_report", "page": 1, "span_id": "s_9" }]
    }
  ],
  "unknowns": [
    { "field": "PD-L1", "reason": "No clear mention", "importance": "high" }
  ]
}
```
Notes:
- `unknowns` is explicit, enabling Parlant to decide what to ask for in `GAP_FOLLOWUP`.
- The `evidence` structure lets the later criterion-level ledger reference the same spans.
- This is not a fully normalized EHR; it carries only what prescreening needs.
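As a sanity check on this contract, a minimal completeness gate (our own helper names, not part of Parlant or MedGemma) can flag which prescreen-critical fields are still missing before the Journey leaves `INGEST`:

```python
# Minimal sketch: which prescreen-critical PatientProfile fields are absent.
# `is None` is used deliberately so that valid falsy values (e.g. ECOG 0)
# still count as present.

REQUIRED_PRESCREEN_FIELDS = [
    ("diagnosis", "primary_condition"),
    ("diagnosis", "stage"),
    ("performance_status", "value"),
]

def missing_prescreen_fields(profile: dict) -> list:
    """Return dotted paths of prescreen-critical fields absent from the profile."""
    missing = []
    for section, key in REQUIRED_PRESCREEN_FIELDS:
        if profile.get(section, {}).get(key) is None:
            missing.append(f"{section}.{key}")
    return missing
```

An empty result means the profile clears the prescreen bar; a non-empty one feeds directly into the follow-up prompts described later.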
4.2 SearchAnchors (v1)
Intermediate structure Gemini produces from PatientProfile to drive the MCP search:
```json
{
  "condition": "Non-Small Cell Lung Cancer",
  "subtype": "adenocarcinoma",
  "biomarkers": ["EGFR exon 19 deletion"],
  "stage": "IV",
  "geography": {
    "country": "DE",
    "max_distance_km": 200
  },
  "age": 52,
  "performance_status_max": 1,
  "trial_filters": {
    "recruitment_status": ["Recruiting", "Not yet recruiting"],
    "phase": ["Phase 2", "Phase 3"]
  },
  "relaxation_order": [
    "phase",
    "distance",
    "biomarker_strictness"
  ]
}
```
This mirrors patient‑centric matching literature: patient characteristics + geography + site status.
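The `relaxation_order` semantics can be sketched as a small retry loop (a sketch under our own assumptions; `run_search` stands in for the MCP `search_trials` call, and the per-step loosening rules are illustrative):

```python
# Sketch of the relaxation_order loop: if a search returns nothing, loosen one
# constraint at a time in the stated order and retry.
import copy

def relax(anchors: dict, step: str) -> dict:
    """Return a copy of the anchors with one constraint loosened."""
    out = copy.deepcopy(anchors)
    if step == "phase":
        out.get("trial_filters", {}).pop("phase", None)   # any phase
    elif step == "distance":
        geo = out.setdefault("geography", {})
        geo["max_distance_km"] = geo.get("max_distance_km", 200) * 2
    elif step == "biomarker_strictness":
        out["biomarkers"] = []                            # condition-level search
    return out

def search_with_relaxation(anchors: dict, run_search) -> list:
    candidates = run_search(anchors)
    for step in anchors.get("relaxation_order", []):
        if candidates:
            break
        anchors = relax(anchors, step)
        candidates = run_search(anchors)
    return candidates
```

Keeping relaxation explicit (rather than letting Gemini improvise looser queries) makes the prescreen reproducible and auditable.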
4.3 TrialCandidate (v1)
Returned by ClinicalTrials MCP search and lightly normalized:
```json
{
  "nct_id": "NCT01234567",
  "title": "Phase 3 Study of Osimertinib in EGFR+ NSCLC",
  "conditions": ["NSCLC"],
  "phase": "Phase 3",
  "status": "Recruiting",
  "locations": [
    { "country": "DE", "city": "Berlin" },
    { "country": "DE", "city": "Hamburg" }
  ],
  "age_range": { "min": 18, "max": 75 },
  "fingerprint_text": "short concatenation of title + key inclusion/exclusion + keywords",
  "eligibility_text": {
    "inclusion": "raw inclusion criteria text ...",
    "exclusion": "raw exclusion criteria text ..."
  }
}
```
`fingerprint_text` is purposely short and designed for Gemini reranking; the full eligibility text goes to MedGemma for criterion analysis.
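One plausible construction (our own helper; the separator and length cap are assumptions, tuned so that dozens of fingerprints fit in a single reranking prompt):

```python
# Build a short fingerprint from a TrialCandidate-like record: title plus
# truncated eligibility text, capped for cheap batch reranking.

def build_fingerprint(trial: dict, max_chars: int = 400) -> str:
    parts = [
        trial.get("title", ""),
        trial.get("eligibility_text", {}).get("inclusion", ""),
        trial.get("eligibility_text", {}).get("exclusion", ""),
    ]
    text = " | ".join(p.strip() for p in parts if p)
    return text[:max_chars]
```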
4.4 EligibilityLedger (v1)
Final artifact per trial, shown to the “clinician” or patient:
```json
{
  "patient_id": "P001",
  "nct_id": "NCT01234567",
  "overall_assessment": "likely_eligible|likely_ineligible|uncertain",
  "criteria": [
    {
      "criterion_id": "inc_1",
      "type": "inclusion",
      "text": "Histologically confirmed NSCLC, stage IIIB/IV",
      "decision": "met|not_met|unknown",
      "patient_evidence": [{ "doc_id": "clinic_1", "page": 1, "span_id": "s_12" }],
      "trial_evidence": [{ "field": "eligibility_text.inclusion", "offset_start": 0, "offset_end": 80 }]
    },
    {
      "criterion_id": "exc_3",
      "type": "exclusion",
      "text": "No prior treatment with immune checkpoint inhibitors",
      "decision": "not_met",
      "patient_evidence": [{ "doc_id": "clinic_2", "page": 3, "span_id": "s_45" }],
      "trial_evidence": [{ "field": "eligibility_text.exclusion", "offset_start": 211, "offset_end": 280 }]
    }
  ],
  "gaps": [
    {
      "description": "Requires brain MRI within 28 days; last MRI is 45 days old",
      "recommended_action": "Repeat brain MRI",
      "clinical_importance": "high"
    }
  ]
}
```
This mirrors TrialGPT’s criterion‑level output (explanation + evidence locations + decision), tuned to our multimodal extraction and PoC constraints.
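One plausible rule for deriving `overall_assessment` from the per-criterion decisions (our own aggregation, not TrialGPT's; the thresholds are illustrative, not clinically tuned):

```python
# Aggregate criterion decisions into the ledger's overall_assessment.
# A triggered exclusion or a failed inclusion rules the patient out;
# any unknown keeps the trial in the "uncertain" bucket for gap follow-up.

def overall_assessment(criteria: list) -> str:
    decisions = [(c["type"], c["decision"]) for c in criteria]
    if any(t == "exclusion" and d == "met" for t, d in decisions):
        return "likely_ineligible"
    if any(t == "inclusion" and d == "not_met" for t, d in decisions):
        return "likely_ineligible"
    if any(d == "unknown" for _, d in decisions):
        return "uncertain"
    return "likely_eligible"
```

Making this rule pure and deterministic keeps the traffic-light labels explainable: the LLMs only judge individual criteria.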
5. Parlant Workflow Design (Aligned with Real Clinical Work)
We design a single Parlant Journey that approximates the real-world job of a trial coordinator/oncologist team, but in a patient‑centric context.
5.1 Journey States
States:
- `INGEST` (Document Collection)
- `PRESCREEN` (Patient-Level Trial Shortlist)
- `VALIDATE_TRIALS` (Trial-Level Eligibility Ledger)
- `GAP_FOLLOWUP` (Patient Data Completion Loop)
- `SUMMARY` (Shareable Packet & Next Steps)
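The control flow across these states can be sketched as a plain transition table (the event names are ours; Parlant expresses the same thing with its Journey/guideline primitives):

```python
# Journey control flow as a (state, event) -> next_state table.
# Event names are illustrative labels for the transition conditions
# described in the state sections below.

TRANSITIONS = {
    ("INGEST", "minimal_dataset_present"): "PRESCREEN",
    ("INGEST", "data_missing"): "INGEST",            # stays, with follow-up prompts
    ("PRESCREEN", "candidates_found"): "VALIDATE_TRIALS",
    ("PRESCREEN", "no_candidates"): "GAP_FOLLOWUP",
    ("VALIDATE_TRIALS", "all_ineligible"): "GAP_FOLLOWUP",
    ("VALIDATE_TRIALS", "some_viable"): "SUMMARY",
    ("GAP_FOLLOWUP", "new_docs"): "INGEST",
    ("GAP_FOLLOWUP", "no_more_data"): "SUMMARY",
}

def next_state(state: str, event: str) -> str:
    return TRANSITIONS[(state, event)]
```

Keeping the table explicit also gives us a cheap unit test that the Journey never dead-ends.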
State 1 — INGEST
Role in real world: Patient (or referrer) provides records; coordinator checks if enough to do prescreen.
Inputs:
- Uploaded PDFs/images (synthetic in PoC).
- Lightweight metadata (age, sex, location) from user form.
Actions:
- Parlant calls MedGemma with multimodal input (images + text) to generate `PatientProfile.v1`.
- Parlant agent summarises back to the patient:
- What it understood (“You have stage IV NSCLC, ECOG 1, EGFR unknown”).
- What it is missing (“I did not find EGFR mutation status or recent brain MRI”).
Transitions:
- If the minimal prescreen dataset is present (diagnosis + stage + ECOG + rough labs): → `PRESCREEN`.
- Else: stays in `INGEST` but triggers `GAP_FOLLOWUP`‑style prompts (“Can you upload a pathology report or discharge summary?”).
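A hedged sketch of the request the orchestrator would send to the MedGemma endpoint at this point (the task name and payload shape are our assumptions for this PoC, not the actual Hugging Face endpoint API; only the base64 packing is generic):

```python
# Serialize uploaded documents into one JSON payload for profile extraction.
# Payload shape is an assumption; the real endpoint contract may differ.
import base64
import json

def build_extraction_request(doc_bytes: dict) -> str:
    payload = {
        "task": "extract_patient_profile_v1",
        "documents": [
            {"doc_id": doc_id, "content_b64": base64.b64encode(data).decode("ascii")}
            for doc_id, data in doc_bytes.items()
        ],
    }
    return json.dumps(payload)
```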
State 2 — PRESCREEN
Role in real world: Pre‑filter to “worth reviewing” trials based on limited data.
Inputs:
- `PatientProfile.v1`.
Actions:
- Gemini converts `PatientProfile` → `SearchAnchors.v1`.
- Parlant calls the existing ClinicalTrials MCP, mapping `SearchAnchors` to the MCP’s parameters:
  - Condition keywords
  - Recruitment status
  - Phase filters
  - Geography
- Trials are returned as a `TrialCandidate` list.
- Gemini reranks them using `fingerprint_text` + `PatientProfile` to produce a shortlist (e.g., top 20).
- Parlant communicates to the user:
- “Based on your profile, I found 23 potentially relevant NSCLC trials; I’ll now check each more carefully.”
Transitions:
- If 0 trials → `GAP_FOLLOWUP` (relax criteria and/or widen geography).
- If >0 trials → `VALIDATE_TRIALS`.
This maps to patient‑centric matching described in the applied literature: single patient → candidate trials, then deeper evaluation.
State 3 — VALIDATE_TRIALS
Role in real world: Detailed chart review vs full eligibility criteria.
Inputs:
- Shortlisted `TrialCandidate` objects (e.g., top 10–20).
Actions:
For each trial in shortlist:
- Gemini slices inclusion/exclusion text into atomic criteria (each with an ID and text).
- For each criterion:
  - Parlant calls MedGemma with:
    - `PatientProfile` + selected patient evidence snippets (and, where available, underlying images).
    - The criterion text snippet.
  - MedGemma outputs:
    - `decision`: met / not_met / unknown.
    - `patient_evidence` span references (doc/page/span_id).
- Parlant aggregates per‑trial results into `EligibilityLedger.v1`.
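The per-trial loop above can be sketched as follows (a sketch under our assumptions: `slice_criteria` and `judge_criterion` stand in for the Gemini slicing call and the MedGemma criterion call, injected so the loop is testable without either model):

```python
# VALIDATE_TRIALS inner loop: slice eligibility text into atomic criteria,
# judge each one, and assemble an EligibilityLedger-shaped dict.

def build_ledger(patient_id: str, trial: dict, slice_criteria, judge_criterion) -> dict:
    criteria = []
    for crit in slice_criteria(trial["eligibility_text"]):
        verdict = judge_criterion(crit)   # {'decision': ..., 'patient_evidence': [...]}
        criteria.append({**crit, **verdict})
    return {"patient_id": patient_id, "nct_id": trial["nct_id"], "criteria": criteria}
```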
Outputs:
- A ranked list of trials with:
- Traffic‑light label (green/yellow/red) for overall eligibility (+ explanation).
- Criterion‑level breakdowns & evidence pointers.
Transitions:
- If no trial has any green/yellow (all clearly ineligible): → `GAP_FOLLOWUP` to explore whether missing data (e.g., outdated labs) could change this.
- Else: offer `SUMMARY` while keeping `GAP_FOLLOWUP` open.
State 4 — GAP_FOLLOWUP
Role in real world: Additional tests/data to confirm eligibility (e.g., labs, imaging).
Inputs:
- `PatientProfile.unknowns` + `EligibilityLedger.gaps`.
Actions:
- Gemini synthesizes the minimal actionable set of missing data:
  - E.g., “Most promising trials require: (1) current EGFR mutation status, (2) brain MRI < 28 days old.”
- Parlant:
  - Poses this to the patient in simple language.
  - For PoC, the user (you, or a script) uploads new synthetic documents representing those tests.
- On new upload, we go back through `INGEST` → update `PatientProfile` → fast‑path directly to `PRESCREEN`/`VALIDATE_TRIALS`.
Transitions:
- On new docs → `INGEST` (update and re‑run).
- If the user declines, or no additional data is possible → `SUMMARY` with a clear explanation (“Here’s why current trials don’t fit”).
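The “minimal actionable set” synthesis can be sketched as a simple merge of the two gap sources (our own helper; the importance ordering is illustrative):

```python
# Collapse PatientProfile.unknowns and EligibilityLedger.gaps into one
# deduplicated follow-up list, highest clinical importance first.

IMPORTANCE = {"high": 0, "medium": 1, "low": 2}

def actionable_gaps(unknowns: list, ledger_gaps: list) -> list:
    items = {}
    for u in unknowns:
        items[u["field"]] = IMPORTANCE.get(u.get("importance", "low"), 2)
    for g in ledger_gaps:
        items[g["recommended_action"]] = IMPORTANCE.get(g.get("clinical_importance", "low"), 2)
    return [k for k, _ in sorted(items.items(), key=lambda kv: kv[1])]
```

In the real flow, Gemini would phrase this list for the patient; the deterministic merge just guarantees nothing flagged by either source is silently dropped.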
State 5 — SUMMARY
Role in real world: Coordinator/oncologist summarises findings, shares options, and discusses next steps.
Inputs:
- Final `PatientProfile`.
- Set of `EligibilityLedger` objects for top trials.
- List of `gaps`.
Actions:
- Generate:
  - Patient‑friendly summary: 3–5 bullet explanation of matches.
  - Clinician packet: aggregated ledger and evidence pointers, referencing doc IDs and trial NCT IDs.
- For PoC: show in UI + downloadable JSON/Markdown.
Transitions:
- End of Journey.
6. General Patient Plan (Synthetic Data Flow)
We simulate realistic but synthetic patients, and run them through exactly the above journey.
6.1 Synthetic Patient Generation & Formats
Source:
- TREC Clinical Trials Track 2021/2022 patient topics (free‑text vignettes) as the ground truth for “what the patient’s story should convey”.
- Synthea or custom scripts to generate structured NSCLC trajectories consistent with those vignettes (for additional fields we want).
Artifacts per patient:
- Clinic letter PDF
- Plain text + embedded logo; maybe 1–2 key tables (comorbidities, meds).
- Biomarker/pathology PDF
- EGFR/ALK/PD‑L1 etc, with small table or scanned‑like image.
- Lab report PDF
- Hematology and chemistry values, with dates.
- Imaging report PDF (+ optional illustrative image)
- Brain MRI/CT narrative with lesion description; maybe a low‑res “snapshot” image.
Each artifact is saved with metadata mapping to the underlying TREC topic (so we can label what the “true” conditions/stage/biomarkers are).
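That metadata mapping could be as small as a per-patient manifest (the topic ID here is a placeholder, not a real TREC identifier; the doc_ids match the `PatientProfile` example in section 4.1):

```python
# Illustrative ground-truth manifest tying one synthetic patient's artifacts
# back to their source vignette, so extraction quality can be scored later.

MANIFEST = {
    "patient_id": "P001",
    "trec_topic": "placeholder-topic-id",
    "truth": {
        "condition": "NSCLC",
        "stage": "IV",
        "biomarkers": ["EGFR exon 19 deletion"],
    },
    "artifacts": {
        "clinic_1": "clinic_letter",
        "path_egfr": "pathology",
        "labs_jan": "lab",
        "mri_report": "imaging",
    },
}

def truth_for(doc_id: str, manifest: dict = MANIFEST) -> dict:
    """Return the ground-truth labels behind a given artifact."""
    if doc_id not in manifest["artifacts"]:
        raise KeyError(f"unlabeled artifact: {doc_id}")
    return manifest["truth"]
```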
6.2 Patient Journey (Narrative)
For each synthetic patient “Anna”:
- Pre‑visit (INGEST)
- Anna (or a proxy) uploads her documents to the copilot.
- MedGemma extracts a `PatientProfile`.
- Parlant confirms: “You have stage IV NSCLC with ECOG 1 and prior pembrolizumab; I don’t see your EGFR mutation test yet.”
- Prescreen (PRESCREEN)
- Using `SearchAnchors`, trials are fetched via the ClinicalTrials MCP.
- The system returns, e.g., 30 candidates; after reranking, the top 10 are selected for validation.
- Trial Validation (VALIDATE_TRIALS)
- For each of top 10, the eligibility ledger is computed.
- System identifies, say, 3 trials with many green criteria but a few unknowns (e.g., recent brain MRI).
- Gap‑Driven Iteration (GAP_FOLLOWUP)
- Copilot: “You likely qualify for trial NCT01234567 if you have a brain MRI within the last 28 days. Your last MRI is 45 days ago. If your doctor orders a new MRI and the report shows no active brain metastases, you may qualify. For this PoC, you can upload a ‘new MRI report’ file to simulate this.”
- A new synthetic PDF is uploaded; `PatientProfile` is updated.
- Re‑match & Summary (PRESCREEN → VALIDATE_TRIALS → SUMMARY)
- System re‑runs with the updated `PatientProfile`.
- Now 3 trials are “likely eligible”, with red flags only on non‑critical criteria.
- Copilot generates:
  - Patient summary: “Here are three trials that look promising for your situation, and why.”
  - Clinician packet: ledger + evidence pointers that mimic a coordinator’s notes.
This general patient plan is consistent across synthetic cases but parameterized by each TREC topic (e.g. biomarker variant, comorbidity pattern).
7. How This Plan Fixes Earlier Gaps
- No custom trial search stack
- We explicitly plug into existing ClinicalTrials MCPs built for LLM agents, aligning with your “don’t reinvent the wheel” constraint and drastically lowering infra risk in 2 weeks.
- Parlant used as a real workflow engine, not just a wrapper
- States mirror prescreen vs validation vs gap‑closure described in empirical screening studies and trial‑matching frameworks.
- Parlant becomes the place where you encode “when do we ask a human for more information vs when do we refine a query vs when do we stop?”
- Patient plan grounded in real‑world processes
- The synthetic patient journey isn’t just “upload docs → list trials.”
- It follows actual clinical workflows: minimal dataset, prescreen, chart review, additional tests, and finally discussion/summary.
- Minimal, testable contracts
- `PatientProfile`, `SearchAnchors`, `TrialCandidate`, and `EligibilityLedger` together give you:
  - Places to measure MedGemma extraction F1.
  - Places to plug in TREC qrels (`TrialCandidate` → NDCG@10).
- They’re small enough to implement quickly but rich enough to survive PoC → MVP.
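The NDCG@10 hook mentioned above needs only a few lines (standard DCG formula; the qrels shape, NCT ID → graded relevance, is assumed from TREC-style runs):

```python
# NDCG@k for a ranked shortlist of NCT IDs against TREC-style qrels
# (mapping NCT ID -> graded relevance, e.g. 0/1/2).
import math

def ndcg_at_k(ranking: list, qrels: dict, k: int = 10) -> float:
    gains = [qrels.get(nct, 0) for nct in ranking[:k]]
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    ideal = sorted(qrels.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg else 0.0
```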
Source: https://www.perplexity.ai/search/simulate-as-an-experienced-cto-i6TIXOP9TX.rqA97awuc1Q?sm=d#3