---
license: apache-2.0
base_model: Qwen/Qwen3.5-9B
tags:
- oncology
- medical
- lora
- peft
- qwen3
- amd
- rocm
- mi300x
- clinical
- fine-tuned
datasets:
- MaximoLopezChenlo/OncoAgent-Clinical-266K
language:
- en
- es
pipeline_tag: text-generation
library_name: peft
---
# 🧬 OncoAgent v1.0 — 9B (Tier 1)
**QLoRA Fine-tuned LoRA Adapter for Clinical Oncology Triage**
[AMD Instinct MI300X](https://www.amd.com/en/products/accelerators/instinct/mi300x.html)
[ROCm](https://rocm.docs.amd.com/)
[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
> **AMD Developer Hackathon 2026** · Trained on AMD Instinct™ MI300X · ROCm 7.2
## Model Description
OncoAgent v1.0 9B is a **QLoRA fine-tuned LoRA adapter** built on top of [Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B), specialized for **clinical oncology triage and treatment recommendation**.
This is the **Tier 1 (fast triage)** model in the OncoAgent multi-agent system, optimized for:
- Rapid cancer type classification and routing
- Clinical entity extraction (symptoms, staging, biomarkers)
- First-pass treatment recommendations based on NCCN/ESMO guidelines
## Training Details
| Parameter | Value |
|---|---|
| **Base Model** | Qwen/Qwen3.5-9B |
| **Method** | QLoRA (4-bit NormalFloat4) |
| **Framework** | Unsloth + PEFT + TRL |
| **Hardware** | AMD Instinct™ MI300X (192GB HBM3) |
| **Software** | ROCm 7.2 · PyTorch 2.3+ |
| **LoRA Rank** | 32 |
| **LoRA Alpha** | 32 |
| **Target Modules** | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| **Training Samples** | 240,168 (+ 26,686 eval) |
| **Max Sequence Length** | 2,048 tokens |
| **Batch Size** | 8 (gradient accumulation: 2 → effective: 16) |
| **Learning Rate** | 2e-4 (cosine schedule) |
| **Epochs** | 1 |
| **Precision** | BF16 (native MI300X) |
| **Seed** | 42 (reproducible) |
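
The batch and LoRA settings in the table imply a few derived quantities. The sketch below works them out with plain arithmetic. It assumes the standard LoRA scaling factor `alpha / r` (the table does not say whether rank-stabilized scaling was used) and that the final partial batch is kept rather than dropped.

```python
import math

# Values copied from the training table above.
batch_size = 8
grad_accum_steps = 2
train_samples = 240_168
eval_samples = 26_686
lora_rank = 32
lora_alpha = 32

# Effective batch size = batch size x gradient accumulation steps.
effective_batch = batch_size * grad_accum_steps

# Optimizer steps in one epoch, assuming the last partial batch is kept.
steps_per_epoch = math.ceil(train_samples / effective_batch)

# Standard LoRA scaling applied to the low-rank update (assumption: not rsLoRA).
lora_scaling = lora_alpha / lora_rank

# Train + eval should add up to the "266K" in the dataset name.
total_samples = train_samples + eval_samples
eval_fraction = eval_samples / total_samples

print(effective_batch, steps_per_epoch, lora_scaling, total_samples)
print(f"eval split: {eval_fraction:.1%}")
```

With alpha equal to the rank, the adapter update is applied at a scale of 1.0, and a single epoch corresponds to roughly 15,000 optimizer steps under these assumptions.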
## Dataset
Trained on [MaximoLopezChenlo/OncoAgent-Clinical-266K](https://huggingface.co/datasets/MaximoLopezChenlo/OncoAgent-Clinical-266K), a curated oncology dataset combining:
- **PMC-Patients** — Real clinical case presentations
- **PubMedQA** — Evidence-based medical Q&A
- **OncoCoT** — Chain-of-thought oncology reasoning (synthetic)
- **NCCN/ESMO Guidelines** — Structured guideline extracts
## Usage
```python
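from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-9B",
    device_map="auto",
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")

# Load LoRA adapter on top of the base weights
model = PeftModel.from_pretrained(
    base_model,
    "MaximoLopezChenlo/OncoAgent-v1.0-9B",
)
model.eval()

# Inference
messages = [
    {"role": "system", "content": "You are a clinical oncology specialist."},
    {"role": "user", "content": "55yo female, Grade 1 endometrioid adenocarcinoma..."},
]
# add_generation_prompt=True appends the assistant header so the model answers
# instead of continuing the user turn; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))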
```
## vLLM Deployment (AMD MI300X)
```bash
# Serve with vLLM on ROCm
# --max-lora-rank must cover the adapter's rank (32); vLLM's default is 16
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen3.5-9B \
    --enable-lora \
    --max-lora-rank 32 \
    --lora-modules oncoagent=MaximoLopezChenlo/OncoAgent-v1.0-9B \
    --dtype bfloat16 \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.45
```
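Once the server is running, vLLM exposes an OpenAI-compatible HTTP API, and the adapter is selected by passing the name registered with `--lora-modules` (here `oncoagent`) as the `model` field. The following is a minimal client sketch using only the standard library. It assumes the server listens on vLLM's default `http://localhost:8000`; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of this project.

```python
import json
import urllib.request

# Assumption: the vLLM server runs locally on its default port.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(case_text, model="oncoagent", max_tokens=1024):
    """Build an OpenAI-style chat payload that targets the LoRA adapter by name."""
    return {
        "model": model,  # name registered via --lora-modules
        "messages": [
            {"role": "system", "content": "You are a clinical oncology specialist."},
            {"role": "user", "content": case_text},
        ],
        "max_tokens": max_tokens,
    }


def send_chat_request(payload, base_url=BASE_URL, timeout=120):
    """POST the payload to /chat/completions and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


payload = build_chat_request("55yo female, Grade 1 endometrioid adenocarcinoma...")
print(json.dumps(payload, indent=2))
# With the server running, send it with: print(send_chat_request(payload))
```

Requesting `Qwen/Qwen3.5-9B` as the `model` instead would route the same prompt to the base model without the adapter, which can be useful for side-by-side comparison.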
## Architecture
OncoAgent v1.0 9B serves as the **Tier 1** model in a dual-tier architecture:
```
Clinical Case → Router → [Tier 1: 9B] → Specialist → Critic → Output
                  ↓
           (Complex cases)
                  ↓
            [Tier 2: 27B] → Specialist → Critic → Output
```
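The diagram does not say how the router decides that a case is complex, so the following is only a sketch of the control flow, reading the diagram as "the router picks a tier, then the specialist and critic stages run on that tier's output". Every name here (`TieredPipeline`, `is_complex`, the stub stages) and the word-count rule are hypothetical placeholders, not the project's implementation.

```python
from dataclasses import dataclass
from typing import Callable

Stage = Callable[[str], str]


@dataclass
class TieredPipeline:
    """Control flow from the diagram: Router -> Tier -> Specialist -> Critic."""

    is_complex: Callable[[str], bool]  # hypothetical routing rule
    tier1: Stage  # stands in for the 9B adapter
    tier2: Stage  # stands in for the 27B adapter
    specialist: Stage
    critic: Stage

    def run(self, case: str) -> str:
        # Router: send complex cases to Tier 2, everything else to Tier 1.
        tier = self.tier2 if self.is_complex(case) else self.tier1
        return self.critic(self.specialist(tier(case)))


# Stub stages that only tag the text, so the chosen route is visible.
pipeline = TieredPipeline(
    is_complex=lambda case: len(case.split()) > 50,  # arbitrary placeholder rule
    tier1=lambda c: f"[9B] {c}",
    tier2=lambda c: f"[27B] {c}",
    specialist=lambda t: f"specialist({t})",
    critic=lambda t: f"critic({t})",
)

print(pipeline.run("short case"))
```

In a real deployment the stubs would be replaced by calls to the two served adapters, and the routing rule by whatever criterion the router actually applies.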
## Links
- 🔗 **Demo:** [HF Space](https://huggingface.co/spaces/MaximoLopezChenlo/OncoAgent)
- 🔗 **GitHub:** [maximolopezchenlo-lab/OncoAgent](https://github.com/maximolopezchenlo-lab/OncoAgent)
- 🔗 **Tier 2 Model:** [OncoAgent-v1.0-27B](https://huggingface.co/MaximoLopezChenlo/OncoAgent-v1.0-27B)
- 🔗 **Dataset:** [OncoAgent-Clinical-266K](https://huggingface.co/datasets/MaximoLopezChenlo/OncoAgent-Clinical-266K)
## Citation
```bibtex
@misc{oncoagent2026,
title={OncoAgent: Multi-Agent Oncology Triage System},
author={Lopez Chenlo, Maximo},
year={2026},
howpublished={AMD Developer Hackathon 2026},
url={https://github.com/maximolopezchenlo-lab/OncoAgent}
}
```
## License
Apache 2.0 — This adapter is for **research and educational purposes only**. Not intended for direct clinical use without professional medical oversight.