---
license: apache-2.0
base_model: Qwen/Qwen3.5-9B
tags:
  - oncology
  - medical
  - lora
  - peft
  - qwen3
  - amd
  - rocm
  - mi300x
  - clinical
  - fine-tuned
datasets:
  - MaximoLopezChenlo/OncoAgent-Clinical-266K
language:
  - en
  - es
pipeline_tag: text-generation
library_name: peft
---

# 🧬 OncoAgent v1.0 — 9B (Tier 1)

**QLoRA Fine-tuned LoRA Adapter for Clinical Oncology Triage**

[![AMD](https://img.shields.io/badge/AMD-MI300X-ed1c24?logo=amd&logoColor=white)](https://www.amd.com/en/products/accelerators/instinct/mi300x.html)
[![ROCm](https://img.shields.io/badge/ROCm-7.2-ed1c24)](https://rocm.docs.amd.com/)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

> **AMD Developer Hackathon 2026** · Trained on AMD Instinct™ MI300X · ROCm 7.2

## Model Description

OncoAgent v1.0 9B is a **QLoRA fine-tuned LoRA adapter** built on top of [Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B), specialized for **clinical oncology triage and treatment recommendation**.

This is the **Tier 1 (fast triage)** model in the OncoAgent multi-agent system, optimized for:
- Rapid cancer type classification and routing
- Clinical entity extraction (symptoms, staging, biomarkers)
- First-pass treatment recommendations based on NCCN/ESMO guidelines

## Training Details

| Parameter | Value |
|---|---|
| **Base Model** | Qwen/Qwen3.5-9B |
| **Method** | QLoRA (4-bit NormalFloat4) |
| **Framework** | Unsloth + PEFT + TRL |
| **Hardware** | AMD Instinct™ MI300X (192 GB HBM3) |
| **Software** | ROCm 7.2 · PyTorch 2.3+ |
| **LoRA Rank** | 32 |
| **LoRA Alpha** | 32 |
| **Target Modules** | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| **Training Samples** | 240,168 (+ 26,686 eval) |
| **Max Sequence Length** | 2,048 tokens |
| **Batch Size** | 8 (gradient accumulation: 2 → effective: 16) |
| **Learning Rate** | 2e-4 (cosine schedule) |
| **Epochs** | 1 |
| **Precision** | BF16 (native MI300X) |
| **Seed** | 42 (reproducible) |
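
The batch and dataset figures in the table can be cross-checked with a little arithmetic. The sketch below assumes a single GPU, one optimizer step per effective batch, and that the final partial batch is kept; the variable names are illustrative and are not taken from the training script.

```python
import math

# Values copied from the training table above.
train_samples = 240_168
eval_samples = 26_686
per_device_batch = 8
grad_accum_steps = 2
max_seq_len = 2_048

# Assumption: one device, so effective batch = batch size x accumulation steps.
effective_batch = per_device_batch * grad_accum_steps

# Assumption: the last partial batch is kept, hence the ceiling.
steps_per_epoch = math.ceil(train_samples / effective_batch)

total_samples = train_samples + eval_samples
eval_fraction = eval_samples / total_samples

# Upper bound only: as if every sample filled the full sequence length.
max_tokens_per_epoch = train_samples * max_seq_len

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
print(f"total samples: {total_samples} (eval share {eval_fraction:.1%})")
print(f"token upper bound per epoch: {max_tokens_per_epoch:,}")
```

The total of 266,854 samples is consistent with the "266K" in the dataset name, and the evaluation split is roughly a tenth of it.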

## Dataset

Trained on [MaximoLopezChenlo/OncoAgent-Clinical-266K](https://huggingface.co/datasets/MaximoLopezChenlo/OncoAgent-Clinical-266K), a curated oncology dataset combining:

- **PMC-Patients** — Real clinical case presentations
- **PubMedQA** — Evidence-based medical Q&A
- **OncoCoT** — Chain-of-thought oncology reasoning (synthetic)
- **NCCN/ESMO Guidelines** — Structured guideline extracts

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-9B",
    device_map="auto",
    torch_dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-9B")

# Load LoRA adapter
model = PeftModel.from_pretrained(
    base_model,
    "MaximoLopezChenlo/OncoAgent-v1.0-9B",
)

# Inference
messages = [
    {"role": "system", "content": "You are a clinical oncology specialist."},
    {"role": "user", "content": "55yo female, Grade 1 endometrioid adenocarcinoma..."},
]
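# add_generation_prompt=True appends the assistant turn marker so the model replies;
# return_dict=True yields input_ids and attention_mask, moved to the model's device.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)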
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## vLLM Deployment (AMD MI300X)

```bash
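# Serve with vLLM on ROCm.
# The adapter's LoRA rank (32) is above vLLM's default max LoRA rank (16), so raise it.
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen3.5-9B \
    --enable-lora \
    --max-lora-rank 32 \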
    --lora-modules oncoagent=MaximoLopezChenlo/OncoAgent-v1.0-9B \
    --dtype bfloat16 \
    --tensor-parallel-size 1 \
    --gpu-memory-utilization 0.45
```
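
Once the server is running, the adapter can be queried through vLLM's OpenAI-compatible chat endpoint by passing the LoRA name (`oncoagent`) as the model. The following is a minimal sketch using only the Python standard library; the host and port (`localhost:8000`, vLLM's default), the helper names `build_chat_request` and `query_oncoagent`, and the timeout are assumptions, not part of the original deployment.

```python
import json
import urllib.request

# Assumption: server started with the command above on vLLM's default port.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(case_text, max_tokens=1024):
    """Build an OpenAI-style chat completion payload for the LoRA adapter."""
    return {
        # "oncoagent" is the adapter name registered via --lora-modules.
        "model": "oncoagent",
        "messages": [
            {"role": "system", "content": "You are a clinical oncology specialist."},
            {"role": "user", "content": case_text},
        ],
        "max_tokens": max_tokens,
    }


def query_oncoagent(case_text, base_url=BASE_URL, timeout=120):
    """POST the payload to /chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(case_text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (needs the server running):
# print(query_oncoagent("55yo female, Grade 1 endometrioid adenocarcinoma..."))
```

Requests that name `Qwen/Qwen3.5-9B` as the model instead would be served by the base model without the adapter.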

## Architecture

OncoAgent v1.0 9B serves as the **Tier 1** model in a dual-tier architecture:

```
Clinical Case → Router → [Tier 1: 9B] → Specialist → Critic → Output
                   ↓
              (Complex cases)
                   ↓
              [Tier 2: 27B] → Specialist → Critic → Output
```
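
The routing in the diagram can be sketched as a small dispatch function. This is illustrative only: the `is_complex` flag, the `CaseResult` type, and the `specialist` and `critic` callables are hypothetical names, and this card does not state how the Router decides that a case is complex.

```python
from dataclasses import dataclass
from typing import Callable

# Adapter repositories named in this card (Tier 2 is listed under Links).
TIER1_MODEL = "MaximoLopezChenlo/OncoAgent-v1.0-9B"
TIER2_MODEL = "MaximoLopezChenlo/OncoAgent-v1.0-27B"


@dataclass
class CaseResult:
    model: str   # which tier handled the case
    output: str  # text after the Specialist and Critic stages


def route(is_complex: bool) -> str:
    """Router: complex cases go to Tier 2, everything else to Tier 1."""
    return TIER2_MODEL if is_complex else TIER1_MODEL


def run_pipeline(
    case_text: str,
    is_complex: bool,
    specialist: Callable[[str, str], str],
    critic: Callable[[str], str],
) -> CaseResult:
    """Clinical Case -> Router -> tier model -> Specialist -> Critic -> Output."""
    model = route(is_complex)
    draft = specialist(model, case_text)  # hypothetical: generate with chosen tier
    return CaseResult(model=model, output=critic(draft))


# Stand-in stages so the sketch runs without any model loaded.
result = run_pipeline(
    "55yo female, Grade 1 endometrioid adenocarcinoma...",
    is_complex=False,
    specialist=lambda model, text: f"[{model}] draft for: {text}",
    critic=lambda draft: draft + " (reviewed)",
)
print(result.model)
```

Passing the stages in as callables keeps the sketch independent of how each tier is actually served.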

## Links

- 🔗 **Demo:** [HF Space](https://huggingface.co/spaces/MaximoLopezChenlo/OncoAgent)
- 🔗 **GitHub:** [maximolopezchenlo-lab/OncoAgent](https://github.com/maximolopezchenlo-lab/OncoAgent)
- 🔗 **Tier 2 Model:** [OncoAgent-v1.0-27B](https://huggingface.co/MaximoLopezChenlo/OncoAgent-v1.0-27B)
- 🔗 **Dataset:** [OncoAgent-Clinical-266K](https://huggingface.co/datasets/MaximoLopezChenlo/OncoAgent-Clinical-266K)

## Citation

```bibtex
@misc{oncoagent2026,
  title={OncoAgent: Multi-Agent Oncology Triage System},
  author={Lopez Chenlo, Maximo},
  year={2026},
  howpublished={AMD Developer Hackathon 2026},
  url={https://github.com/maximolopezchenlo-lab/OncoAgent}
}
```

## License

Apache 2.0. This adapter is for **research and educational purposes only** and is not intended for direct clinical use without professional medical oversight.