Spaces:

lablab-ai-amd-developer-hackathon
/

OncoAgent

Running

App Files Files Community

MaximoLopezChenlo commited on 13 days ago

Commit

eab55f5

verified ·

1 Parent(s): 64d5cdf

Upload README.md with huggingface_hub

Browse files

Files changed (1) hide show

README.md +135 -6

README.md CHANGED Viewed

@@ -1,13 +1,142 @@
 ---
 title: OncoAgent
-emoji: 📉
-colorFrom: indigo
-colorTo: gray
 sdk: gradio
-sdk_version: 6.14.0
-python_version: '3.13'
 app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
 title: OncoAgent
+emoji: 🧬
+colorFrom: red
+colorTo: blue
 sdk: gradio
+sdk_version: 5.31.0
 app_file: app.py
 pinned: false
+license: apache-2.0
+short_description: Multi-Agent Oncology Triage powered by AMD MI300X
 ---
+# 🧬 OncoAgent — Multi-Agent Oncology Triage System
+![ROCm](https://img.shields.io/badge/AMD-ROCm_7.2-ed1c24?logo=amd&logoColor=white)
+![Python](https://img.shields.io/badge/Python-3.10+-3776AB?logo=python&logoColor=white)
+![vLLM](https://img.shields.io/badge/vLLM-PagedAttention-000000?logo=vllm&logoColor=white)
+![LangGraph](https://img.shields.io/badge/Orchestration-LangGraph-FF4F00?logo=langchain&logoColor=white)
+![Gradio](https://img.shields.io/badge/UI-Gradio_6-FF7C00?logo=gradio&logoColor=white)
+> **AMD Developer Hackathon 2026** · Powered by AMD Instinct™ MI300X · ROCm 7.2
+## 🌍 100% Open-Source: Democratizing Oncology
+OncoAgent is proudly 100% open-source. We believe that life-saving clinical intelligence should not be locked behind proprietary APIs. Our solution is designed to:
+- **Guarantee Patient Privacy:** Run locally on AMD MI300X hardware or private clouds, ensuring zero patient data leaves the hospital.
+- **Foster Global Contribution:** Allow medical communities worldwide to easily audit, modify, and contribute to the RAG knowledge base.
+OncoAgent is a state-of-the-art multi-agent clinical triage system designed to combat **unstructured data blindness** in primary care oncology. It leverages a tier-adaptive architecture featuring **Qwen 3.5-9B** (Speed Triage) and **Qwen 3.6-27B** (Deep Reasoning) models. Orchestrated via a sophisticated LangGraph state machine, it provides evidence-based oncological reasoning strictly grounded in NCCN/ESMO clinical guidelines, with built-in human-in-the-loop (HITL) safety gates and a Reflexion-based critic loop.
+---
+## 🏗️ Architecture
+```
+┌────────┐   ┌─────────┐   ┌─────────┐   ┌────────────┐      ┌────────────┐   ┌─────────┐
+│ Router │──▶│Ingestion│──▶│Corrective│──▶│ Specialist │◀────│ Critic     │   │ Formatter│
+│(Triage)│   │ (PHI)   │   │  RAG    │   │ (Qwen 9B/  │     │(Reflexion  │   │(Output)  │
+└────────┘   └─────────┘   └─────────┘   │    27B)    │────▶│ Validation)│   └─────────┘
+    │           │             │          └────────────┘      └────────────┘        ▲
+    │           │             │                 │                   │              │
+    ▼           ▼             ▼                 ▼                   ▼              │
+  ┌───────────────────────────────────────────────────────────────────┐      ┌────────────┐
+  │                           Fallback Node                           │      │ HITL Gate  │
+  └───────────────────────────────────────────────────────────────────┘      │(Acuity Chk)│
+                                                                             └────────────┘
+```
+**Key Components:**
+| Module | Description |
+|--------|-------------|
+| `data_prep/` | Dataset builder: PMC-Patients/OncoCoT → Strict JSONL (Llama 3 chat template) |
+| `rag_engine/` | The "Brain": PyMuPDF extraction, Adaptive Semantic Chunking of NCCN/ESMO PDFs, & ChromaDB + PubMedBERT vectorization. |
+| `agents/` | The "Reasoning": LangGraph multi-agent orchestration (Router → Corrective RAG → Specialist ↔ Critic → HITL Gate). |
+| `ui/` | The "Face": Gradio 6 UI with Glassmorphism for clinical note input, real-time source citations, and reasoning output. |
+---
+## 🧠 Dual-Tier Model Strategy (Qwen)
+To maximize the compute capabilities of the **AMD MI300X**, OncoAgent implements a dynamic **Dual-Tier** routing strategy using the Qwen model family. **Both tiers have been fine-tuned on +200,000 real-world oncological cases covering all major cancer types** (derived from PMC-Patients and OncoCoT datasets) to ensure hyper-specialized medical reasoning:
+- **Tier 1: Qwen 3.5-9B (Speed Triage):** A lightweight, extremely fast model used by the `Router` to assess initial complexity, perform simple triage, and handle low-risk queries.
+- **Tier 2: Qwen 3.6-27B (Deep Reasoning):** The heavy-lifter. Activated for high-complexity clinical cases (e.g., metastasis, multi-mutations). It performs deep reasoning and entailment checks, avoiding confirmation bias through rigorous Reflexion loops.
+---
+## ⚡ Hardware Target
+- **GPU:** AMD Instinct™ MI300X (192GB HBM3)
+- **Software Stack:** ROCm 7.2.x, PyTorch (HIP), vLLM with PagedAttention
+- **Models:** `Qwen/Qwen3.5-9B` (Speed Triage) & `Qwen/Qwen3.6-27B-Instruct` (Deep Reasoning)
+- **Precision:** QLoRA 4-bit NormalFloat4 via `bitsandbytes` (ROCm compatible)
+---
+## 🚀 Quick Start
+```bash
+# 1. Clone and setup
+git clone <repo-url>
+cd OncoAgent
+# 2. Install dependencies
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+# 3. Start Inference Server (vLLM on Docker)
+# This spins up the Qwen models optimized for AMD MI300X via ROCm PagedAttention
+docker run --device /dev/kfd --device /dev/dri -p 8000:8000 rocm/vllm:latest \
+    --model Qwen/Qwen3.6-27B-Instruct --tensor-parallel-size 1
+# 4. Configure environment & Run UI
+cp .env.example .env
+# Set VLLM_API_BASE=http://localhost:8000/v1 in .env
+python -m ui.app
+```
+---
+## 📁 Project Structure
+```
+├── docs/                   # Documentation & research
+│   ├── research/           # Deep Research analysis documents
+│   ├── ADR/                # Architectural Decision Records
+│   ├── oncoagent_master_directive.md
+│   └── antigravity_rules.md
+├── data_prep/              # Dataset preparation (Fase 0)
+├── rag_engine/             # RAG ingestion & retrieval (Fase 0-3)
+├── agents/                 # LangGraph orchestration (Fase 3)
+├── ui/                     # Gradio frontend (Fase 4)
+├── tests/                  # Unit & integration tests
+├── scripts/                # Utility scripts
+├── logs/                   # Paper log & social media log
+├── requirements.txt        # Pinned dependencies
+└── Dockerfile              # HF Spaces deployment
+```
+---
+## 🩺 Safety Guarantees
+- **Reflexion-based Critic Loop:** A dedicated safety node audits the Specialist's output against the RAG context (entailment verification). It forces the Specialist to regenerate its output if it detects ungrounded claims or invented dosages.
+- **Human-In-The-Loop (HITL) Gate:** An acuity-based checkpoint that stops the pipeline for human clinician approval on high-risk cases (e.g., Stage IV + complex mutations).
+- **Corrective RAG:** The system grades retrieved context relevance. If insufficient evidence is found, it safely falls back instead of guessing.
+- **Zero-PHI:** Regex-based PII redaction before any processing
+- **Reproducibility:** Fixed seeds (`torch.manual_seed(42)`) across all ML scripts
+---
+## 📄 License
+This project was built for the AMD Developer Hackathon 2026.
+---
+## 👥 Team
+Built with ❤️ and AMD Instinct MI300X.