
CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

TrialPath is an AI-powered clinical trial matching system for NSCLC (Non-Small Cell Lung Cancer) patients. Currently in PoC phase — models, service stubs, and UI with mock data are implemented; live AI integrations are pending.

Core idea: help patients understand which clinical trials they may qualify for, and transform "rejection" into "actionable next steps" via gap analysis.

Architecture

See architecture/overview.md for full architecture diagram, data flow, component details, and implementation status.

5 Components: Streamlit UI → Parlant Orchestrator → MedGemma 4B (extraction) + Gemini 3 Pro (planning) + ClinicalTrials MCP Server (search)

5 Data Contracts (Pydantic v2 in trialpath/models/): PatientProfile, SearchAnchors, TrialCandidate, EligibilityLedger, SearchLog

Project Structure

trialpath/                  # Backend module
  models/                   # 5 Pydantic v2 data contracts (implemented)
  services/                 # 4 service stubs: medgemma, gemini, mcp, parlant
  agent/                    # Parlant journey logic (not yet implemented)
  tests/                    # Backend TDD tests (37+ model, 33 service)
app/                        # Streamlit frontend
  pages/                    # 5-page journey (upload → profile → matching → gaps → summary)
  components/               # 6 reusable widgets
  services/                 # State manager, parlant client, mock data
  tests/                    # Frontend TDD tests (30+ component, 5 page)
tests/                      # Integration tests (18 tests)
architecture/               # Architecture documentation
docs/                       # Design docs and TDD guides

Documents

  • docs/Trialpath PRD.md — Product requirements, success metrics, HAI-DEF submission plan
  • docs/TrialPath AI technical design.md — Technical architecture, data contracts, Parlant workflow
  • docs/tdd-guide-*.md — TDD implementation guides (backend, frontend, data/eval)
  • architecture/overview.md — Architecture overview, data flow, component status

Tech Stack

  • Python 3.11+ (Streamlit + Pydantic v2)
  • Google Gemini 3 Pro (orchestration) — stubbed
  • MedGemma 4B via Hugging Face endpoint (multimodal extraction) — stubbed
  • Parlant (agentic workflow engine) — client ready, agent pending
  • ClinicalTrials MCP Server (ClinicalTrials.gov API v2) — client ready

Success Targets

  • MedGemma Extraction F1 >= 0.85
  • Trial Retrieval Recall@50 >= 0.75
  • Trial Ranking NDCG@10 >= 0.60
  • Criterion Decision Accuracy >= 0.85
  • Latency < 15s, Cost < $0.50/session

Scope

  • Disease: NSCLC only
  • Data: Synthetic patients only (no real PHI)
  • Timeline: 3-month PoC

Dev tools

  • Use the Hugging Face CLI for model deployment
  • Use uv, ruff, and astral ty
  • Use ripgrep for exploring the codebase

Commit atomically

Always commit atomically to build a clear git history for the larger dev team

ALWAYS run scripts (bash/tests) in the background

  • You MUST always run scripts in the background to unblock the main context window
  • When using timeout, it must be under 1 minute

Lessons Learned (from past errors)

Async/Sync: never use asyncio.run() in Streamlit

  • Streamlit runs its own event loop; calling asyncio.run() raises RuntimeError: This event loop is already running
  • Use ThreadPoolExecutor + asyncio.run in a background thread as a sync bridge
  • If a method is declared async, verify the body actually awaits async I/O — don't wrap sync blocking calls in async def without asyncio.to_thread
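
The sync bridge above can be sketched as follows. This is a minimal illustration, not the project's actual bridge code; fetch_trials is a hypothetical async service call standing in for a real client method.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical async service call; real code would await network I/O here.
async def fetch_trials(query: str) -> list[str]:
    await asyncio.sleep(0)
    return [f"NCT-{query}"]

_executor = ThreadPoolExecutor(max_workers=1)

def run_async(coro):
    """Sync bridge: run a coroutine on a worker thread's own event loop.

    Safe to call from Streamlit, where the main thread's loop is already
    running and a direct asyncio.run() would raise RuntimeError.
    """
    return _executor.submit(asyncio.run, coro).result()

result = run_async(fetch_trials("NSCLC"))
```

Because asyncio.run executes on the worker thread, it creates a fresh event loop there and never collides with Streamlit's.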

Mocks must match real implementation

  • Before writing test mocks, READ the actual service code first
  • Example: MCP client switched from client.post() to client.stream() but tests still mocked .post() → all tests passed locally but broke on integration
  • Always verify mock signatures against the real method being called
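
One way to enforce this is unittest.mock's autospec, which makes a mock reject attributes the real class lacks. The MCPClient below is a hypothetical stand-in mirroring the real client's surface (stream(), not post()):

```python
from unittest import mock

# Hypothetical client mirroring the real MCP client's surface.
class MCPClient:
    def stream(self, payload: dict) -> dict:
        raise NotImplementedError("network call")

# create_autospec makes the mock reject attributes the real class lacks,
# so a stale test that still mocks .post() fails loudly instead of
# passing silently.
client = mock.create_autospec(MCPClient, instance=True)
client.stream.return_value = {"studies": []}

assert client.stream({"condition": "NSCLC"}) == {"studies": []}

try:
    client.post({"condition": "NSCLC"})  # not on the real class
    stale_mock_caught = False
except AttributeError:
    stale_mock_caught = True
```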

Python import/path conflicts

  • Never place an entrypoint file inside a package with the same name (e.g., app/app.py inside app/ package)
  • Streamlit adds parent dirs to sys.path, creating ambiguous imports

Git hygiene

  • Always check .gitignore before committing; never commit __pycache__/, .env, or binary files
  • Use git diff --staged to review before every commit

Test stability

  • Centralize mock data in conftest.py shared fixtures, not inline per-test
  • When data contracts change, update fixtures in ONE place
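
A conftest.py following this rule might look like the sketch below. Field names and defaults are illustrative, not the real PatientProfile contract; the factory lets individual tests override fields without duplicating the base data.

```python
# conftest.py (root level) — single home for shared mock data.
import pytest

def make_patient_profile(**overrides) -> dict:
    """Factory for mock PatientProfile data; fields are illustrative."""
    profile = {
        "diagnosis": "NSCLC",
        "stage": "IIIA",
        "ecog": 1,
    }
    profile.update(overrides)
    return profile

@pytest.fixture
def patient_profile() -> dict:
    # Tests take `patient_profile` as an argument instead of inlining
    # data; when the data contract changes, only this file is touched.
    return make_patient_profile()
```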

Bash output: prefer dedicated tools

  • Use Read/Grep/Glob instead of bash pipes for file operations
  • Keep bash commands simple and single-purpose; complex piped commands risk misreading output
  • Always read the FULL output of bash commands before drawing conclusions

Cognitive Lessons (avoid repeating these thinking errors)

Know where configs live β€” don't re-discover every session

  • ALL env vars and defaults: trialpath/config.py (single source of truth)
  • Key env vars: GEMINI_API_KEY, GEMINI_MODEL (gemini-3-pro), HF_TOKEN, MEDGEMMA_ENDPOINT_URL, MCP_URL (:3000), PARLANT_URL (:8800), SESSION_COST_BUDGET
  • MedGemma retry settings: MEDGEMMA_MAX_RETRIES, MEDGEMMA_RETRY_BACKOFF, MEDGEMMA_MAX_WAIT, MEDGEMMA_COLD_START_TIMEOUT
  • .env file is gitignored — never commit it again (API keys were leaked once in commit 53efc3c)
  • Config consumers: gemini_planner, medgemma_extractor, mcp_client, parlant_bridge, agent/tools, direct_pipeline
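
The single-source-of-truth pattern looks roughly like this sketch. Names mirror the env vars listed above, but the defaults here are assumptions, not the real values in trialpath/config.py:

```python
# trialpath/config.py (illustrative sketch) — every env var is read in
# exactly one place; consumers import these names instead of calling
# os.environ.get() inline.
import os

GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY", "")
GEMINI_MODEL = os.environ.get("GEMINI_MODEL", "gemini-3-pro")
MCP_URL = os.environ.get("MCP_URL", "http://localhost:3000")
PARLANT_URL = os.environ.get("PARLANT_URL", "http://localhost:8800")
# Defaults below are assumptions for illustration only.
SESSION_COST_BUDGET = float(os.environ.get("SESSION_COST_BUDGET", "0.50"))
MEDGEMMA_MAX_RETRIES = int(os.environ.get("MEDGEMMA_MAX_RETRIES", "3"))
```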

Don't flip-flop on implementation decisions

  • max_output_tokens was added (65536) to fix truncation, then removed to "use defaults", causing regressions
  • os.environ.get() inline was refactored to config imports, touching 6+ files each time
  • LESSON: Make the decision ONCE with reasoning, document it, stick with it

Remember the project's fallback chain

  • Pipeline has 3-tier fallback: Parlant → direct API (direct_pipeline.py) → mock data
  • Demo mode bypasses file upload and loads MOCK_PATIENT_PROFILE directly
  • Don't re-implement fallback logic β€” it already exists in direct_pipeline.py
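
The 3-tier fallback can be sketched as below. Function names here are placeholders for illustration, not the real direct_pipeline.py API; the two failing tiers simulate Parlant and the direct API being unavailable.

```python
def run_matching(profile: dict) -> tuple[str, list]:
    """Try Parlant, then the direct API pipeline, then mock data."""
    for tier, runner in [
        ("parlant", call_parlant),
        ("direct", call_direct_pipeline),
        ("mock", load_mock_candidates),
    ]:
        try:
            return tier, runner(profile)
        except Exception:
            continue  # fall through to the next tier
    raise RuntimeError("all tiers failed")

def call_parlant(profile):  # placeholder: pretend Parlant is down
    raise ConnectionError("parlant unreachable")

def call_direct_pipeline(profile):  # placeholder: pretend API errors out
    raise TimeoutError("direct API timed out")

def load_mock_candidates(profile):  # final tier always succeeds
    return [{"nct_id": "NCT00000000", "title": "Mock trial"}]

tier, candidates = run_matching({"diagnosis": "NSCLC"})
```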

Read existing code before writing new code

  • Service instances were re-created per call in agent/tools.py until caching fix
  • This pattern (wasteful instantiation) could have been caught by reading the code first
  • ALWAYS read the file you're about to modify, especially service constructors
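
The caching fix for per-call instantiation can be as simple as an lru_cache accessor. The service class below is a hypothetical stand-in; the counter exists only to demonstrate that construction happens once.

```python
from functools import lru_cache

class MCPClient:
    instances_created = 0  # counter just to demonstrate caching

    def __init__(self) -> None:
        MCPClient.instances_created += 1

@lru_cache(maxsize=None)
def get_mcp_client() -> MCPClient:
    # First call constructs the client; later calls return the same
    # instance instead of re-creating it per tool invocation.
    return MCPClient()

a = get_mcp_client()
b = get_mcp_client()
```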

Don't lose track of what's stubbed vs real

  • MedGemma: real HF endpoint wired (with retry/cold-start logic)
  • Gemini: real API wired (with rate limiting)
  • MCP/ClinicalTrials: has both MCP client AND direct API fallback
  • Parlant: client ready, agent journey logic NOT yet implemented
  • UI: all 5 pages functional with mock data fallback

Centralize shared state β€” don't scatter it

  • Streamlit state keys: patient_profile, trial_candidates, eligibility_ledgers, parlant_session_id, parlant_session_active, last_event_offset, journey_state
  • Test fixtures: centralized in conftest.py (root level), not per-test-file
  • Mock data: app/services/mock_data.py (single file for all mock objects)
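
Centralized state initialization might look like this sketch. The defaults are assumptions, and `session_state` here is a plain dict standing in for st.session_state so the example runs without Streamlit installed; setdefault keeps any value a page has already written.

```python
# One place that seeds every known session-state key exactly once.
STATE_DEFAULTS = {
    "patient_profile": None,
    "trial_candidates": [],
    "eligibility_ledgers": [],
    "parlant_session_id": None,
    "parlant_session_active": False,
    "last_event_offset": 0,
    "journey_state": "upload",  # assumed initial value
}

def init_state(session_state: dict) -> None:
    """Seed missing keys without clobbering existing values."""
    for key, default in STATE_DEFAULTS.items():
        session_state.setdefault(key, default)

# Stand-in for st.session_state with one pre-existing value.
session_state: dict = {"journey_state": "matching"}
init_state(session_state)
```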