File size: 7,710 Bytes
aa166d8
 
eab55f5
 
 
aa166d8
eab55f5
aa166d8
e1624f5
eab55f5
e1624f5
aa166d8
 
eab55f5
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
e1624f5
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
---
title: OncoAgent
emoji: 🧬
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Multi-Agent Oncology Triage powered by AMD MI300X
---

# 🧬 OncoAgent β€” Multi-Agent Oncology Triage System

![ROCm](https://img.shields.io/badge/AMD-ROCm_7.2-ed1c24?logo=amd&logoColor=white)
![Python](https://img.shields.io/badge/Python-3.10+-3776AB?logo=python&logoColor=white)
![vLLM](https://img.shields.io/badge/vLLM-PagedAttention-000000?logo=vllm&logoColor=white)
![LangGraph](https://img.shields.io/badge/Orchestration-LangGraph-FF4F00?logo=langchain&logoColor=white)
![Gradio](https://img.shields.io/badge/UI-Gradio_6-FF7C00?logo=gradio&logoColor=white)

> **AMD Developer Hackathon 2026** Β· Powered by AMD Instinctβ„’ MI300X Β· ROCm 7.2

## 🌍 100% Open-Source: Democratizing Oncology
OncoAgent is proudly 100% open-source. We believe that life-saving clinical intelligence should not be locked behind proprietary APIs. Our solution is designed to:
- **Guarantee Patient Privacy:** Run locally on AMD MI300X hardware or private clouds, ensuring zero patient data leaves the hospital.
- **Foster Global Contribution:** Allow medical communities worldwide to easily audit, modify, and contribute to the RAG knowledge base.

OncoAgent is a state-of-the-art multi-agent clinical triage system designed to combat **unstructured data blindness** in primary care oncology. It leverages a tier-adaptive architecture featuring **Qwen 3.5-9B** (Speed Triage) and **Qwen 3.6-27B** (Deep Reasoning) models. Orchestrated via a sophisticated LangGraph state machine, it provides evidence-based oncological reasoning strictly grounded in NCCN/ESMO clinical guidelines, with built-in human-in-the-loop (HITL) safety gates and a Reflexion-based critic loop.

---

## πŸ—οΈ Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Router │──▢│Ingestion│──▢│Corrective│──▢│ Specialist │◀────│ Critic     β”‚   β”‚ Formatterβ”‚
β”‚(Triage)β”‚   β”‚ (PHI)   β”‚   β”‚  RAG    β”‚   β”‚ (Qwen 9B/  β”‚     β”‚(Reflexion  β”‚   β”‚(Output)  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚    27B)    │────▢│ Validation)β”‚   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚           β”‚             β”‚          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜        β–²
    β”‚           β”‚             β”‚                 β”‚                   β”‚              β”‚
    β–Ό           β–Ό             β–Ό                 β–Ό                   β–Ό              β”‚
  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
  β”‚                           Fallback Node                           β”‚      β”‚ HITL Gate  β”‚
  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β”‚(Acuity Chk)β”‚
                                                                             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

**Key Components:**

| Module | Description |
|--------|-------------|
| `data_prep/` | Dataset builder: PMC-Patients/OncoCoT β†’ Strict JSONL (Llama 3 chat template) |
| `rag_engine/` | The "Brain": PyMuPDF extraction, Adaptive Semantic Chunking of NCCN/ESMO PDFs, & ChromaDB + PubMedBERT vectorization. |
| `agents/` | The "Reasoning": LangGraph multi-agent orchestration (Router β†’ Corrective RAG β†’ Specialist ↔ Critic β†’ HITL Gate). |
| `ui/` | The "Face": Gradio 6 UI with Glassmorphism for clinical note input, real-time source citations, and reasoning output. |

---

## 🧠 Dual-Tier Model Strategy (Qwen)

To maximize the compute capabilities of the **AMD MI300X**, OncoAgent implements a dynamic **Dual-Tier** routing strategy using the Qwen model family. **Both tiers have been fine-tuned on +200,000 real-world oncological cases covering all major cancer types** (derived from PMC-Patients and OncoCoT datasets) to ensure hyper-specialized medical reasoning:

- **Tier 1: Qwen 3.5-9B (Speed Triage):** A lightweight, extremely fast model used by the `Router` to assess initial complexity, perform simple triage, and handle low-risk queries.
- **Tier 2: Qwen 3.6-27B (Deep Reasoning):** The heavy-lifter. Activated for high-complexity clinical cases (e.g., metastasis, multi-mutations). It performs deep reasoning and entailment checks, avoiding confirmation bias through rigorous Reflexion loops.

---

## ⚑ Hardware Target

- **GPU:** AMD Instinctβ„’ MI300X (192GB HBM3)
- **Software Stack:** ROCm 7.2.x, PyTorch (HIP), vLLM with PagedAttention
- **Models:** `Qwen/Qwen3.5-9B` (Speed Triage) & `Qwen/Qwen3.6-27B-Instruct` (Deep Reasoning)
- **Precision:** QLoRA 4-bit NormalFloat4 via `bitsandbytes` (ROCm compatible)

---

## πŸš€ Quick Start

```bash
# 1. Clone and setup
git clone <repo-url>
cd OncoAgent

# 2. Install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# 3. Start Inference Server (vLLM on Docker)
# This spins up the Qwen models optimized for AMD MI300X via ROCm PagedAttention
docker run --device /dev/kfd --device /dev/dri -p 8000:8000 rocm/vllm:latest \
    --model Qwen/Qwen3.6-27B-Instruct --tensor-parallel-size 1

# 4. Configure environment & Run UI
cp .env.example .env
# Set VLLM_API_BASE=http://localhost:8000/v1 in .env
python -m ui.app
```

---

## πŸ“ Project Structure

```
β”œβ”€β”€ docs/                   # Documentation & research
β”‚   β”œβ”€β”€ research/           # Deep Research analysis documents
β”‚   β”œβ”€β”€ ADR/                # Architectural Decision Records
β”‚   β”œβ”€β”€ oncoagent_master_directive.md
β”‚   └── antigravity_rules.md
β”œβ”€β”€ data_prep/              # Dataset preparation (Fase 0)
β”œβ”€β”€ rag_engine/             # RAG ingestion & retrieval (Fase 0-3)
β”œβ”€β”€ agents/                 # LangGraph orchestration (Fase 3)
β”œβ”€β”€ ui/                     # Gradio frontend (Fase 4)
β”œβ”€β”€ tests/                  # Unit & integration tests
β”œβ”€β”€ scripts/                # Utility scripts
β”œβ”€β”€ logs/                   # Paper log & social media log
β”œβ”€β”€ requirements.txt        # Pinned dependencies
└── Dockerfile              # HF Spaces deployment
```

---

## 🩺 Safety Guarantees

- **Reflexion-based Critic Loop:** A dedicated safety node audits the Specialist's output against the RAG context (entailment verification). It forces the Specialist to regenerate its output if it detects ungrounded claims or invented dosages.
- **Human-In-The-Loop (HITL) Gate:** An acuity-based checkpoint that stops the pipeline for human clinician approval on high-risk cases (e.g., Stage IV + complex mutations).
- **Corrective RAG:** The system grades retrieved context relevance. If insufficient evidence is found, it safely falls back instead of guessing.
- **Zero-PHI:** Regex-based PII redaction before any processing
- **Reproducibility:** Fixed seeds (`torch.manual_seed(42)`) across all ML scripts

---

## πŸ“„ License

This project was built for the AMD Developer Hackathon 2026.

---

## πŸ‘₯ Team

Built with ❀️ and AMD Instinct MI300X.