medi422 commited on
Commit
fe25e0b
·
verified ·
1 Parent(s): 4ca60ed

Update README.md

Files changed (1): README.md +49 -385
README.md CHANGED
@@ -1,421 +1,85 @@
- <div align="center">
-
- <img src="https://img.shields.io/badge/AMD_Instinct-MI300X-ED1C24?style=for-the-badge&logo=amd&logoColor=white" />
- <img src="https://img.shields.io/badge/ROCm-Stack-ED1C24?style=for-the-badge&logo=amd&logoColor=white" />
- <img src="https://img.shields.io/badge/vLLM-Inference-6D28D9?style=for-the-badge" />
- <img src="https://img.shields.io/badge/Qwen-Multimodal-0EA5E9?style=for-the-badge" />
- <img src="https://img.shields.io/badge/FastAPI-0.115-009688?style=for-the-badge&logo=fastapi&logoColor=white" />
- <img src="https://img.shields.io/badge/Python-3.12+-3776AB?style=for-the-badge&logo=python&logoColor=white" />
-
- <br /><br />
-
- # 🏥 MediAgent
-
- ### Autonomous Multi-Agent Medical Imaging Analysis System
-
- **Five specialized AI agents. One radiological verdict. Running entirely on AMD.**
-
- *AMD Developer Hackathon 2026 · Track: Vision & Multimodal AI*
-
- <br />
-
- > Built by **Ramyar** — Security researcher & full-stack developer, Sulaymaniyah, Iraq
-
- </div>
-
  ---
-
- ## What Is MediAgent?
-
- MediAgent is a production-grade autonomous AI system that analyzes medical images — X-rays, MRI scans, CT scans — through a five-agent pipeline and generates structured, peer-reviewed clinical radiology reports in real time.
-
- Upload an image. Watch five AI agents execute live. Get a formal radiology report with differential diagnoses, ICD-10 codes, a quality score, and a FHIR R4 export ready for any EMR system.
-
- **No cloud APIs. No OpenAI. No Nvidia.**
- Pure AMD MI300X inference. Local. Private. Fast.
-
  ---
 
- ## The Pipeline
-
- ```
- ┌─────────────────────────────────────────────────────────────────────┐
- │                            IMAGE UPLOAD                             │
- │                PNG / JPG / DICOM (.dcm) — up to 20 MB               │
- └──────────────────────────┬──────────────────────────────────────────┘
-                            │
-            ┌───────────────┴───────────────┐
-            │         PARALLEL STAGE        │
-            ▼                               ▼
-   ┌─────────────────┐             ┌─────────────────┐
-   │  INTAKE AGENT   │             │  VISION AGENT   │
-   │                 │             │                 │
-   │ • Validates     │             │ • Multimodal    │
-   │   image payload │             │   Qwen analysis │
-   │ • Normalizes    │             │ • Anatomical    │
-   │   clinical text │             │   findings      │
-   │ • Extracts      │             │ • Severity per  │
-   │   demographics  │             │   region        │
-   │ • Safety triage │             │ • Confidence    │
-   │   (16 keywords) │             │   scoring       │
-   │ • Modality hint │             │ • Anomaly flags │
-   └────────┬────────┘             └────────┬────────┘
-            └───────────────┬───────────────┘
-                            │
-                            ▼
-               ┌───────────────────────┐
-               │    RESEARCH AGENT     │
-               │                       │
-               │ • KB cross-reference  │
-               │   (15 conditions)     │
-               │ • Demographic weight  │
-               │ • Ranked differentials│
-               │ • ICD-10 codes        │
-               │ • Match probabilities │
-               └───────────┬───────────┘
-                           │
-                           ▼
-               ┌───────────────────────┐
-               │     REPORT AGENT      │
-               │                       │
-               │ • ACR/NICE format     │
-               │ • Clinical history    │
-               │ • Technique section   │
-               │ • Findings narrative  │
-               │ • Impression + top Dx │
-               │ • Recommendations     │
-               └───────────┬───────────┘
-                           │
-                           ▼
-               ┌───────────────────────┐
-               │     CRITIC AGENT      │
-               │                       │
-               │ • Cross-validates     │
-               │   report vs findings  │
-               │ • Quality score 0-100 │
-               │ • Uncertainty flags   │
-               │ • Disclaimer enforce  │
-               └───────────┬───────────┘
-                           │
-                           ▼
- ┌─────────────────────────────────────────────────────────────────────┐
- │                            FINAL REPORT                             │
- │       Structured JSON · PDF Export · FHIR R4 DiagnosticReport       │
- └─────────────────────────────────────────────────────────────────────┘
- ```
-
- INTAKE and VISION execute **concurrently** — cutting wall-clock latency by running the two most expensive operations in parallel. Everything downstream sequences after both complete.
-
- ---
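The parallel stage is a plain asyncio fan-out/fan-in. A minimal sketch of the idea, with stub coroutines standing in for the real agent calls (names, payloads, and timings here are illustrative, not the project's actual code):

```python
import asyncio

# Stub coroutines standing in for the real INTAKE and VISION agent calls.
async def run_intake(payload: dict) -> dict:
    await asyncio.sleep(0.01)  # simulates validation + normalization work
    return {"agent": "INTAKE", "status": "DONE"}

async def run_vision(payload: dict) -> dict:
    await asyncio.sleep(0.02)  # simulates the expensive multimodal LLM call
    return {"agent": "VISION", "status": "DONE"}

async def run_parallel_stage(payload: dict) -> list[dict]:
    # gather() starts both coroutines concurrently and waits for both,
    # so RESEARCH -> REPORT -> CRITIC can only sequence after both finish.
    return list(await asyncio.gather(run_intake(payload), run_vision(payload)))

results = asyncio.run(run_parallel_stage({"image_base64": "..."}))
```

Wall-clock time for the stage is the maximum of the two agent runtimes rather than their sum, which is the latency win described above.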
-
- ## AMD Hardware Stack
-
- | Component | Technology |
- |---|---|
- | **GPU** | AMD Instinct MI300X |
- | **GPU Software** | ROCm — AMD's open-source GPU compute platform |
- | **Inference Server** | vLLM (ROCm build) at `localhost:8000/v1` |
- | **Model** | Qwen multimodal — native vision + text |
- | **Backend** | FastAPI 0.115 + Uvicorn |
- | **Frontend** | Vanilla JS + Tailwind CSS + SSE streaming |
-
- This project is a direct proof of concept that AMD's ROCm stack is **production-viable for real-world medical AI**. Every inference call — vision analysis, clinical normalization, report synthesis, peer review, post-report chat — runs on AMD MI300X. Zero CUDA dependency. Zero cloud API calls.
-
- ---
-
- ## Key Features
-
- ### 🔴 Real-Time SSE Streaming
- Watch the pipeline execute live, agent by agent. Every status transition — WAITING → RUNNING → DONE — streams to the dashboard as it happens via Server-Sent Events. Per-agent runtime counters track exactly how long each step takes.
-
- ### 👁️ Multimodal Vision Analysis
- Qwen processes the raw medical image natively. It returns structured JSON: detected modality, technical quality assessment, per-region findings with anatomical names, radiological descriptions, severity levels (NORMAL / INCIDENTAL / SIGNIFICANT / CRITICAL), confidence scores (0–100), and anomaly flags.
-
- ### 🔬 Medical Knowledge Base + ICD-10 Mapping
- The Research Agent cross-references vision findings against 15 curated clinical conditions spanning pulmonary, neurological, abdominal, musculoskeletal, and vascular pathology. Every differential diagnosis comes with an ICD-10 code, match probability, and a sentence explaining exactly why the condition matches the findings.
-
- ### 🛡️ Critic Agent QA
- Every report goes through a peer-review pass before delivery. The Critic checks that all anomalies from the Vision Agent appear in the report, flags low-confidence findings, assigns a quality score (completeness 30% + accuracy 40% + safety 20% + compliance 10%), and hard-caps the score at 40/100 if a core agent failed.
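The scoring rule above reduces to a weighted sum plus a hard cap. A sketch as a pure function (the sub-score names and the helper itself are illustrative; only the 30/40/20/10 weights and the 40-point cap come from the description):

```python
# Illustrative helper -- not the project's actual CriticAgent code.
WEIGHTS = {"completeness": 0.30, "accuracy": 0.40, "safety": 0.20, "compliance": 0.10}

def quality_score(subscores: dict[str, float], core_agent_failed: bool) -> int:
    # Each sub-score is on a 0-100 scale, so the weighted sum stays on 0-100.
    score = sum(WEIGHTS[k] * subscores[k] for k in WEIGHTS)
    if core_agent_failed:
        # Hard cap: a failed Vision/Report agent limits the report to 40/100.
        score = min(score, 40.0)
    return round(score)

perfect = quality_score({"completeness": 100, "accuracy": 100, "safety": 100, "compliance": 100}, False)
capped = quality_score({"completeness": 100, "accuracy": 100, "safety": 100, "compliance": 100}, True)
```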
-
- ### 🏥 DICOM Support
- Upload real `.dcm` files. MediAgent extracts 20+ metadata fields — patient name, study date, institution, modality, body part, KVP, slice thickness, pixel spacing, image dimensions — and pre-populates the intake form automatically. MONOCHROME1 inversion and multi-frame handling included.
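The MONOCHROME1 handling reduces to a small numpy transform: normalize the raw pixel array to 8-bit, then invert when the photometric interpretation stores minimum values as white. An illustrative sketch under that assumption, not the project's actual `core/dicom.py`:

```python
import numpy as np

def to_display_array(pixels: np.ndarray, photometric: str) -> np.ndarray:
    """Normalize a raw DICOM pixel array to uint8 for PNG conversion."""
    arr = pixels.astype(np.float64)
    lo, hi = arr.min(), arr.max()
    if hi > lo:  # avoid divide-by-zero on flat images
        arr = (arr - lo) / (hi - lo) * 255.0
    else:
        arr = np.zeros_like(arr)
    out = arr.astype(np.uint8)
    if photometric == "MONOCHROME1":
        # MONOCHROME1 renders the minimum value as white, so invert for display.
        out = 255 - out
    return out

frame = to_display_array(np.array([[0, 2000], [1000, 4000]]), "MONOCHROME1")
```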
-
- ### 📋 FHIR R4 Export
- Every report can be exported as a fully conformant HL7 FHIR R4 DiagnosticReport resource. Includes an inline Patient resource, Observation resources, LOINC and SNOMED CT codes, severity mapping, full report text in `presentedForm`, and custom extensions for AI quality score and pipeline status. Ready to import into Epic, Cerner, or any FHIR-capable EMR.
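A FHIR R4 DiagnosticReport is plain JSON at heart. A heavily trimmed sketch of the shape (all field values are placeholders, and the real export additionally carries Observations, LOINC/SNOMED codings, and the custom extensions):

```python
import base64
import json

# Heavily trimmed illustrative resource -- not the project's full export.
report = {
    "resourceType": "DiagnosticReport",
    "id": "REP-EXAMPLE",                     # placeholder report id
    "status": "preliminary",                 # AI drafts await radiologist review
    "code": {"text": "AI-generated radiology report"},
    "contained": [{"resourceType": "Patient", "id": "patient-1"}],
    "subject": {"reference": "#patient-1"},  # points at the contained Patient
    "presentedForm": [{
        "contentType": "text/plain",
        # FHIR base64Binary: the report text must be base64-encoded
        "data": base64.b64encode(b"IMPRESSION: example").decode("ascii"),
    }],
}

payload = json.dumps(report)
```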
-
- ### 💬 Post-Report Clinical Chat
- After the report is delivered, a ClinicalAdvisorAgent is available for follow-up questions. It answers in 2–4 sentences with direct reference to the report findings. Qwen's thinking/reasoning mode is explicitly disabled — answers are fast, direct, and clinical.
-
- ### 🔒 Hard Safety Enforcement
- - **16 deterministic safety keywords** — chest pain, stroke symptoms, acute trauma, hemoptysis, sepsis, spinal trauma, and more — trigger urgent flags regardless of LLM output.
- - **Age-based alerts** — pediatric (<18) and geriatric (>75) cases are automatically flagged for expert review.
- - **Mandatory AI disclaimer** — enforced at two independent layers (Report Agent + Critic Agent) and cannot be bypassed or modified by the LLM.
- - **Graceful degradation** — the pipeline produces a report even if individual agents fail, always marking what succeeded and what didn't.
-
- ### 📄 Client-Side PDF Export
- Full radiology report exported as a formatted PDF directly in the browser using jsPDF — severity color banner, all six report sections, DICOM metadata, QA score. No server round-trip needed.
-
- ---
-
- ## Agent Architecture
-
- ### IntakeAgent
- Validates the image payload (minimum size, valid base64), applies deterministic safety triage, and normalizes clinical language. For simple inputs under 120 characters it skips the LLM entirely and uses a built-in layman-to-medical term map (22 entries: "can't breathe" → "dyspnea", "lump" → "mass/nodule", "dizzy" → "dizziness/vertigo", etc.). Only calls the LLM for complex clinical narratives with comorbidities or medical history. Falls back cleanly to raw input preservation if the LLM is unavailable.
-
- ### VisionAgent
- Sends the base64 image and clinical context to Qwen at temperature 0.0 with a strict JSON schema enforced via system prompt. Handles malformed enum values from the LLM with safe conversion fallbacks — a single bad field never drops a finding. Tracks token usage and anomaly counts in the output metadata.
-
- ### ResearchAgent
- Pre-filters the knowledge base to only conditions compatible with the detected modality before sending to the LLM — reducing prompt size and improving accuracy. Enforces strict output rules: only conditions from the KB, 2–4 differentials maximum, 5% minimum probability, exact ICD-10 codes, and evidence sentences that actually explain the match.
-
- ### ReportAgent
- Builds a structured prompt with clearly labeled sections — clinical history, imaging technique, findings block, differentials block — and asks the LLM to synthesize them into a formal ACR/NICE radiology report. The disclaimer is overwritten to the exact regulatory string after LLM generation, unconditionally.
-
- ### CriticAgent
- Operates at temperature 0.0 for fully deterministic QA. Receives the draft report and the full pipeline state including raw vision findings. Checks every anomaly is accounted for, flags low-confidence observations, and appends a `[QUALITY ASSESSMENT]` block to the recommendations section with score, issues, and uncertainty warnings.

- ### ClinicalAdvisorAgent
- Activated only after report delivery, scoped to the specific report's findings. Strips all Qwen thinking output via multi-layer regex before returning the answer: it handles `<think>` XML blocks, markdown think fences, and plain-text reasoning preambles.
-
- ---
-
- ## LLM Client
-
- The `LLMClient` wraps the OpenAI Python SDK pointed at the local vLLM endpoint. It handles:
-
- - Text completions with optional JSON mode enforcement
- - Multimodal completions with base64 image injection
- - Token-level streaming with an `on_token` callback
- - 3-attempt retry loop with 1-second flat backoff
- - 90-second timeout per call
- - Dual-strategy JSON extraction: direct parse first, then character-by-character brace-matching fallback for responses where the LLM adds conversational padding
-
- ---
-
- ## Medical Knowledge Base
-
- 15 conditions covering the most common radiological findings across all supported modalities:
-
- | Condition | ICD-10 | Modalities | Severity |
- |---|---|---|---|
- | Community-Acquired Pneumonia | J18.9 | X-RAY, CT | SIGNIFICANT |
- | Cardiogenic Pulmonary Edema | J81.0 | X-RAY, CT | CRITICAL |
- | Pleural Effusion | J90 | X-RAY, CT, MRI | SIGNIFICANT |
- | Spontaneous Pneumothorax | J93.9 | X-RAY, CT | CRITICAL |
- | Intracerebral Hemorrhage | I61.9 | CT, MRI | CRITICAL |
- | Ischemic Stroke | I63.9 | CT, MRI | CRITICAL |
- | Intracranial Neoplasm | C71.9 | MRI, CT | SIGNIFICANT |
- | Abdominal Aortic Aneurysm | I71.4 | CT, MRI | CRITICAL |
- | Nephrolithiasis | N20.0 | CT, X-RAY | SIGNIFICANT |
- | Small Bowel Obstruction | K56.6 | X-RAY, CT | SIGNIFICANT |
- | Long Bone Fracture | S82.902 | X-RAY, CT | SIGNIFICANT |
- | Degenerative Joint Disease | M19.90 | X-RAY, MRI | INCIDENTAL |
- | Hepatic Steatosis | K76.0 | CT, MRI | INCIDENTAL |
- | Herniated Disc | M51.16 | MRI, CT | SIGNIFICANT |
- | Pulmonary Nodule | R91.1 | X-RAY, CT | SIGNIFICANT |
-
- ---
-
- ## API Reference
-
- | Method | Endpoint | Description |
- |---|---|---|
- | `GET` | `/` | Clinical dashboard UI |
- | `GET` | `/health` | System health, version, active sessions |
- | `GET` | `/metrics/gpu` | Live AMD GPU metrics (util, VRAM, temp, power) |
- | `POST` | `/analyze` | Synchronous pipeline → full JSON report |
- | `POST` | `/analyze/stream` | Real-time SSE streaming pipeline |
- | `GET` | `/status/{report_id}` | Poll live pipeline state |
- | `POST` | `/chat/{report_id}` | Post-report clinical Q&A |
- | `GET` | `/api/docs` | Swagger UI |
- | `GET` | `/api/redoc` | ReDoc UI |
-
- ### `/analyze/stream` — SSE Event Types
-
- ```json
- // Agent status update (emitted on every state transition)
- {"agent": "VISION", "status": "RUNNING"}
- {"agent": "VISION", "status": "DONE"}
-
- // Final report (emitted when pipeline completes)
- {"type": "report", "data": {...}, "report_id": "REP-A3F9C2D1B4E7"}
-
- // Error
- {"type": "error", "message": "Pipeline produced no report"}
- ```
-
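A consumer only needs to read `data:` lines off the stream and dispatch on the shapes above. A minimal parser sketch (the wire text below is hypothetical, written to match the documented event formats, not captured output):

```python
import json

def parse_sse_events(raw: str) -> list[dict]:
    """Parse 'data: {...}' lines from an SSE body into event dicts."""
    events = []
    for line in raw.splitlines():
        if line.startswith("data:"):
            events.append(json.loads(line[len("data:"):].strip()))
    return events

# Hypothetical wire text following the documented event shapes.
stream = (
    'data: {"agent": "VISION", "status": "RUNNING"}\n'
    "\n"
    'data: {"agent": "VISION", "status": "DONE"}\n'
    "\n"
    'data: {"type": "report", "data": {}, "report_id": "REP-EXAMPLE"}\n'
)
events = parse_sse_events(stream)
```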
- ### Form Fields (`/analyze`, `/analyze/stream`)

- | Field | Type | Required | Notes |
- |---|---|---|---|
- | `image` | File | ✅ | PNG, JPG, or DICOM (.dcm), max 20 MB |
- | `symptoms` | string | — | Free-text chief complaint |
- | `age` | integer | — | 0–120 |
- | `sex` | string | — | `M`, `F`, or `O` |
- | `clinical_context` | string | — | Medical history, referral details |

  ---
 
- ## Data Models
-
- ```
- PatientInput
- └── image_base64, symptoms, age, sex, clinical_context
-
- PipelineState
- ├── agent_statuses: {INTAKE, VISION, RESEARCH, REPORT, CRITIC}
- ├── intake_output: IntakeOutput
- ├── vision_output: VisionOutput
- │   └── findings: [VisionFinding, ...]
- │       └── anatomical_region, description, severity,
- │           confidence, confidence_score, is_anomaly
- ├── research_output: ResearchOutput
- │   └── differential_diagnoses: [KnowledgeMatch, ...]
- │       └── condition_name, match_probability,
- │           supporting_evidence, differential_rank, icd10_code
- ├── report_draft: ReportSection
- │   └── clinical_history, technique, findings, impression,
- │       recommendations, disclaimer
- └── final_report: FinalReport
-     └── report_id, patient_metadata, sections, vision_summary,
-         research_summary, overall_severity, agent_pipeline_status,
-         generation_timestamp
- ```

- ---
-
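One leaf of this tree, written as a Pydantic model (field types are inferred from the agent descriptions above, so treat them as assumptions rather than the project's exact `core/models.py`):

```python
from enum import Enum
from pydantic import BaseModel

class Severity(str, Enum):
    NORMAL = "NORMAL"
    INCIDENTAL = "INCIDENTAL"
    SIGNIFICANT = "SIGNIFICANT"
    CRITICAL = "CRITICAL"

class VisionFinding(BaseModel):
    anatomical_region: str
    description: str
    severity: Severity
    confidence_score: int  # 0-100 per the Vision Agent description
    is_anomaly: bool

f = VisionFinding(
    anatomical_region="right lower lobe",
    description="patchy airspace opacity",
    severity="SIGNIFICANT",  # plain string coerced into the enum by pydantic
    confidence_score=82,
    is_anomaly=True,
)
```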
- ## Project Structure

- ```
- mediagent/
- ├── main.py            ← FastAPI server, all routes, SSE orchestration
- ├── core/
- │   ├── llm.py         ← LLM client (retry, vision, streaming, JSON extraction)
- │   ├── models.py      ← All Pydantic v2 data models
- │   ├── pipeline.py    ← Parallel pipeline orchestrator
- │   ├── dicom.py       ← DICOM parser (pydicom + numpy + Pillow)
- │   └── fhir.py        ← FHIR R4 DiagnosticReport builder
- ├── agents/
- │   ├── intake.py      ← Input validation, normalization, safety triage
- │   ├── vision.py      ← Multimodal image analysis
- │   ├── research.py    ← KB matching, ICD-10, differential diagnosis
- │   ├── report.py      ← ACR/NICE radiology report synthesis
- │   ├── critic.py      ← QA validation, quality scoring
- │   └── advisor.py     ← Post-report clinical Q&A
- ├── static/
- │   └── index.html     ← Full dashboard (Tailwind + Chart.js + SSE)
- ├── requirements.txt
- └── .env.example
- ```

  ---
 
- ## Getting Started
-
- ### Prerequisites
-
- - Python 3.12+
- - vLLM running a Qwen multimodal model on ROCm, accessible at `http://localhost:8000/v1`
- - ROCm-compatible AMD GPU (MI300X recommended)
-
- ### Installation
-
- ```bash
- # Clone the repository
- git clone https://github.com/Ramyar2007/mediagent
- cd mediagent
-
- # Install Python dependencies
- pip install -r requirements.txt
-
- # Configure environment
- cp .env.example .env
- # Edit .env and set LLM_BASE_URL to your vLLM endpoint
- ```
-
- ### Environment Variables
-
- ```env
- LLM_BASE_URL=http://localhost:8000/v1   # vLLM OpenAI-compatible endpoint
- LLM_MODEL=/model                        # Model path served by vLLM
- APP_PORT=8090                           # Server port
- ```
-
- ### Run

- ```bash
- python main.py
- ```

- Dashboard available at **http://localhost:8090**

- Swagger docs at **http://localhost:8090/api/docs**

  ---
 
- ## Dependencies

- | Package | Version | Purpose |
- |---|---|---|
- | `fastapi` | 0.115.6 | Web framework |
- | `uvicorn[standard]` | 0.34.0 | ASGI server |
- | `openai` | 1.58.1 | SDK for vLLM OpenAI-compatible API |
- | `python-multipart` | 0.0.20 | Multipart form / file upload |
- | `pydantic` | 2.10.5 | Data validation and serialization |
- | `Pillow` | 11.1.0 | Image processing for DICOM conversion |
- | `pydicom` | 2.4.4 | DICOM file parsing and metadata extraction |
- | `numpy` | 1.26.4 | Pixel array normalization for DICOM |

- Optional: the `amdsmi` Python library is used automatically when available, providing more accurate GPU metrics than the `rocm-smi` CLI fallback.

  ---
 
- ## Clinical Safety
-
- MediAgent is built with clinical safety as a first-class concern, not an afterthought.
-
- **Mandatory disclaimer** — enforced at two independent code layers and cannot be overridden by any LLM output:
-
- > *"This analysis is AI-generated and must be reviewed by a licensed radiologist before any clinical decisions are made."*
-
- **Hard safety rules that run deterministically, without LLM involvement:**
- - 16 urgent clinical keywords trigger immediate flags before any AI processing
- - Pediatric and geriatric age thresholds auto-flag for specialist review
- - Quality score is hard-capped at 40/100 if core agents (Vision, Report) fail
- - Low-confidence findings are always flagged with confirmatory imaging recommendations
- - The disclaimer is re-enforced after every LLM call, unconditionally
-
- **This system is a decision support tool, not a clinical decision maker.** Every output is intended to assist, not replace, a licensed radiologist.
  ---
- ## Dashboard Preview

- The single-page clinical dashboard provides:
-
- - **Live pipeline panel** — real-time agent status cards with per-step runtime counters
- - **Analytics tab** — severity distribution donut chart, differential diagnosis confidence bar chart, agent timing bar chart — all populated from structured model output
- - **Report panel** — severity banner, safety flags, all six report sections, finding cards color-coded by severity
- - **DICOM metadata card** — study date, institution, modality, body part, technical parameters
- - **PDF export** — full formatted report generated client-side
- - **Clinical chat** — slide-up Q&A panel backed by the ClinicalAdvisorAgent
- - **AMD GPU panel** — live util %, VRAM used/total, temperature, power draw — polling every 3 seconds

  ---
- ## Built For
-
- **AMD Developer Hackathon 2026**
- Track: Vision & Multimodal AI

- This project demonstrates that AMD's ROCm ecosystem is a complete, production-viable alternative for serious AI workloads. Medical imaging analysis — with real multimodal vision, structured clinical reasoning, and standards-compliant output — running fully on AMD MI300X without a single NVIDIA or cloud dependency.

  ---
- <div align="center">
-
- **Built by Ramyar · Sulaymaniyah, Iraq**
-
- *#AMDDevChallenge · AMD Instinct MI300X · ROCm · vLLM · Qwen*
-
- </div>
  ---
+ license: mit
+ language:
+ - en
+ base_model:
+ - Qwen/Qwen3.6-35B-A3B
+ - Qwen/Qwen3.6-27B
+ pipeline_tag: image-to-text
+ tags:
+ - medical

  ---

+ https://cdn-uploads.huggingface.co/production/uploads/69e8826eb1347b4a2120bea7/-WekpB77IqmwChejTUzxP.mp4

+ # 🏥 MediAgent
+ ### Autonomous Multi-Agent Medical Imaging Analysis on AMD Instinct MI300X

+ > **AMD Developer Hackathon 2026 · Vision & Multimodal AI Track**
+ > Built by Ramyar — Sulaymaniyah, Iraq

  ---

+ ## What It Does

+ MediAgent runs a 5-agent AI pipeline that analyzes medical images (X-ray, MRI, CT, DICOM) and generates formal radiology reports with differential diagnoses, ICD-10 codes, and FHIR R4 export — entirely on AMD hardware.

+ **No cloud APIs. No OpenAI. No Nvidia. Pure AMD MI300X + ROCm + vLLM.**

  ---

+ ## ⚠️ Demo Mode

+ This Space runs in **demo mode** — the full pipeline UI works and all 5 agents animate live, but no real inference is performed since the AMD Instinct MI300X backend is not available on HuggingFace's free hardware.

+ **See the video demo for live inference on real AMD hardware.**

+ Live inference requires: AMD Instinct MI300X · ROCm · vLLM · Qwen multimodal

  ---

+ ## The 5-Agent Pipeline

+ | Agent | Role |
+ |---|---|
+ | **INTAKE** | Validates input, normalizes clinical language, safety triage |
+ | **VISION** | Multimodal image analysis via Qwen on AMD MI300X |
+ | **RESEARCH** | KB cross-reference, differential diagnoses, ICD-10 codes |
+ | **REPORT** | ACR/NICE format radiology report synthesis |
+ | **CRITIC** | QA peer-review, quality scoring, disclaimer enforcement |

+ INTAKE + VISION run in **parallel** to minimize latency.

  ---

+ ## Key Features

+ - Real-time SSE streaming pipeline with per-agent timers
+ - DICOM (.dcm) file support with metadata extraction
+ - 15-condition medical knowledge base with ICD-10 mapping
+ - FHIR R4 DiagnosticReport export
+ - Client-side PDF export
+ - Post-report clinical Q&A (ClinicalAdvisorAgent)
+ - Live AMD GPU metrics (util, VRAM, temp, power)
+ - Hard-enforced clinical safety rules

  ---

+ ## Tech Stack

+ - **GPU:** AMD Instinct MI300X
+ - **GPU Software:** ROCm
+ - **Inference:** vLLM (ROCm build) + Qwen multimodal
+ - **Backend:** FastAPI + Uvicorn
+ - **Frontend:** Vanilla JS + Tailwind CSS + Chart.js + SSE

  ---

+ ## GitHub

+ Full source code, architecture docs, and README:
+ **https://github.com/Ramyar2007/mediagent**

  ---

+ *This system is a decision support tool. All outputs must be reviewed by a licensed radiologist before any clinical decisions are made.*