File size: 7,710 Bytes
aa166d8 eab55f5 aa166d8 eab55f5 aa166d8 e1624f5 eab55f5 e1624f5 aa166d8 eab55f5 e1624f5 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 | ---
title: OncoAgent
emoji: π§¬
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Multi-Agent Oncology Triage powered by AMD MI300X
---
# 𧬠OncoAgent β Multi-Agent Oncology Triage System





> **AMD Developer Hackathon 2026** Β· Powered by AMD Instinctβ’ MI300X Β· ROCm 7.2
## π 100% Open-Source: Democratizing Oncology
OncoAgent is proudly 100% open-source. We believe that life-saving clinical intelligence should not be locked behind proprietary APIs. Our solution is designed to:
- **Guarantee Patient Privacy:** Run locally on AMD MI300X hardware or private clouds, ensuring zero patient data leaves the hospital.
- **Foster Global Contribution:** Allow medical communities worldwide to easily audit, modify, and contribute to the RAG knowledge base.
OncoAgent is a state-of-the-art multi-agent clinical triage system designed to combat **unstructured data blindness** in primary care oncology. It leverages a tier-adaptive architecture featuring **Qwen 3.5-9B** (Speed Triage) and **Qwen 3.6-27B** (Deep Reasoning) models. Orchestrated via a sophisticated LangGraph state machine, it provides evidence-based oncological reasoning strictly grounded in NCCN/ESMO clinical guidelines, with built-in human-in-the-loop (HITL) safety gates and a Reflexion-based critic loop.
---
## ποΈ Architecture
```
ββββββββββ βββββββββββ βββββββββββ ββββββββββββββ ββββββββββββββ βββββββββββ
β Router ββββΆβIngestionββββΆβCorrectiveββββΆβ Specialist βββββββ Critic β β Formatterβ
β(Triage)β β (PHI) β β RAG β β (Qwen 9B/ β β(Reflexion β β(Output) β
ββββββββββ βββββββββββ βββββββββββ β 27B) ββββββΆβ Validation)β βββββββββββ
β β β ββββββββββββββ ββββββββββββββ β²
β β β β β β
βΌ βΌ βΌ βΌ βΌ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ ββββββββββββββ
β Fallback Node β β HITL Gate β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β(Acuity Chk)β
ββββββββββββββ
```
**Key Components:**
| Module | Description |
|--------|-------------|
| `data_prep/` | Dataset builder: PMC-Patients/OncoCoT β Strict JSONL (Llama 3 chat template) |
| `rag_engine/` | The "Brain": PyMuPDF extraction, Adaptive Semantic Chunking of NCCN/ESMO PDFs, & ChromaDB + PubMedBERT vectorization. |
| `agents/` | The "Reasoning": LangGraph multi-agent orchestration (Router β Corrective RAG β Specialist β Critic β HITL Gate). |
| `ui/` | The "Face": Gradio 6 UI with Glassmorphism for clinical note input, real-time source citations, and reasoning output. |
---
## π§ Dual-Tier Model Strategy (Qwen)
To maximize the compute capabilities of the **AMD MI300X**, OncoAgent implements a dynamic **Dual-Tier** routing strategy using the Qwen model family. **Both tiers have been fine-tuned on +200,000 real-world oncological cases covering all major cancer types** (derived from PMC-Patients and OncoCoT datasets) to ensure hyper-specialized medical reasoning:
- **Tier 1: Qwen 3.5-9B (Speed Triage):** A lightweight, extremely fast model used by the `Router` to assess initial complexity, perform simple triage, and handle low-risk queries.
- **Tier 2: Qwen 3.6-27B (Deep Reasoning):** The heavy-lifter. Activated for high-complexity clinical cases (e.g., metastasis, multi-mutations). It performs deep reasoning and entailment checks, avoiding confirmation bias through rigorous Reflexion loops.
---
## β‘ Hardware Target
- **GPU:** AMD Instinctβ’ MI300X (192GB HBM3)
- **Software Stack:** ROCm 7.2.x, PyTorch (HIP), vLLM with PagedAttention
- **Models:** `Qwen/Qwen3.5-9B` (Speed Triage) & `Qwen/Qwen3.6-27B-Instruct` (Deep Reasoning)
- **Precision:** QLoRA 4-bit NormalFloat4 via `bitsandbytes` (ROCm compatible)
---
## π Quick Start
```bash
# 1. Clone and setup
git clone <repo-url>
cd OncoAgent
# 2. Install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# 3. Start Inference Server (vLLM on Docker)
# This spins up the Qwen models optimized for AMD MI300X via ROCm PagedAttention
docker run --device /dev/kfd --device /dev/dri -p 8000:8000 rocm/vllm:latest \
--model Qwen/Qwen3.6-27B-Instruct --tensor-parallel-size 1
# 4. Configure environment & Run UI
cp .env.example .env
# Set VLLM_API_BASE=http://localhost:8000/v1 in .env
python -m ui.app
```
---
## π Project Structure
```
βββ docs/ # Documentation & research
β βββ research/ # Deep Research analysis documents
β βββ ADR/ # Architectural Decision Records
β βββ oncoagent_master_directive.md
β βββ antigravity_rules.md
βββ data_prep/ # Dataset preparation (Fase 0)
βββ rag_engine/ # RAG ingestion & retrieval (Fase 0-3)
βββ agents/ # LangGraph orchestration (Fase 3)
βββ ui/ # Gradio frontend (Fase 4)
βββ tests/ # Unit & integration tests
βββ scripts/ # Utility scripts
βββ logs/ # Paper log & social media log
βββ requirements.txt # Pinned dependencies
βββ Dockerfile # HF Spaces deployment
```
---
## π©Ί Safety Guarantees
- **Reflexion-based Critic Loop:** A dedicated safety node audits the Specialist's output against the RAG context (entailment verification). It forces the Specialist to regenerate its output if it detects ungrounded claims or invented dosages.
- **Human-In-The-Loop (HITL) Gate:** An acuity-based checkpoint that stops the pipeline for human clinician approval on high-risk cases (e.g., Stage IV + complex mutations).
- **Corrective RAG:** The system grades retrieved context relevance. If insufficient evidence is found, it safely falls back instead of guessing.
- **Zero-PHI:** Regex-based PII redaction before any processing
- **Reproducibility:** Fixed seeds (`torch.manual_seed(42)`) across all ML scripts
---
## π License
This project was built for the AMD Developer Hackathon 2026.
---
## π₯ Team
Built with β€οΈ and AMD Instinct MI300X.
|