---
license: apache-2.0
base_model:
- Jackrong/Qwopus3.5-9B-v3
tags:
- oncology
- pancreatic-cancer
- pdac
- clinical-nlp
- medical-llm
- text-generation
- research
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

<p align="center">
  <img src="./assets/onca-logo-horizontal.svg" alt="Onca logo" width="520">
</p>

# Onca 1.0 9B

## Model Summary

Onca 1.0 is an open 9B language model for pancreatic cancer clinical tasks. It is designed for four PDAC-relevant task families:

- clinical trial screening
- case-specific clinical reasoning
- structured pathology report extraction
- molecular variant evidence reasoning

This release is the main FP16/BF16-compatible checkpoint intended as the reference Hugging Face release for the Onca 1.0 model family.
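
As a rough sizing aid, the weight footprint of a 16-bit checkpoint can be estimated from the parameter count. The sketch below assumes a nominal 9 billion parameters (the exact count is not stated in this card) and covers weights only, not activations or KV cache. The helper name is illustrative.

```python
def weight_footprint_gib(num_params, bytes_per_param=2):
    """Approximate weight memory in GiB (weights only)."""
    return num_params * bytes_per_param / 2**30


# Assumption: a nominal 9e9 parameters at 2 bytes each (FP16 or BF16).
print(f"{weight_footprint_gib(9e9):.1f} GiB")  # prints "16.8 GiB"
```

Actual memory use during generation will be higher and depends on sequence length and batch size.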

## Base Model

Onca 1.0 is fine-tuned from `Jackrong/Qwopus3.5-9B-v3`, a Qwen3.5-derived 9B dense reasoning model. The released checkpoint reflects task-focused supervised fine-tuning for pancreatic cancer workflows while preserving the underlying Qwen3.5-class architecture and tokenizer setup.

## Training Scope

The model was trained on 37,364 prepared rows from openly available sources. The multitask mixture covers:

- trial eligibility screening
- oncology clinical reasoning
- CAP-aligned pathology abstraction
- CIViC-style variant interpretation

The project was built around an open-data, open-weight, single-workstation pipeline so the workflow can be audited and reproduced without private institutional corpora.

## Intended Use

Onca 1.0 is intended for:

- research on oncology-focused language models
- benchmarking PDAC-oriented clinical NLP workflows
- prototyping structured extraction and screening pipelines
- local experimentation in privacy-sensitive environments

## Out-of-Scope Use

Onca 1.0 is not intended for:

- direct clinical care
- autonomous treatment recommendations
- unsupervised patient-facing use
- deployment as a validated medical device or diagnostic system

This is a research model and does not replace clinician judgment.

## Evaluation Summary

In the companion manuscript, Onca 1.0 was evaluated across 11 panels against Woollie-7B, CancerLLM-7B, OpenBioLLM-8B, and the Qwopus base model without fine-tuning. Headline results reported in the draft include:

- Trial Screening: 81.6 F1
- Clinical Reasoning: 14.1 composite
- Pathology Extraction: 30.5 field exact-match
- PubMedQA Cancer: 68.3 macro-F1
- PubMedQA: 66.5 macro-F1

The strongest gains appear in workflow-proximal tasks such as trial review and pathology structuring. Variant evidence reasoning remains more difficult than the other task groups.
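
For readers unfamiliar with the metric names, the sketch below gives generic textbook definitions of macro-F1 and field exact-match in plain Python. It is not the manuscript's scoring code, which is not shown in this card. The function names and toy labels are made up, and scores come out on a 0 to 1 scale rather than the 0 to 100 scale used in the list above.

```python
def f1_from_counts(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores)


def field_exact_match(gold_record, pred_record):
    """Fraction of gold fields whose predicted value matches exactly."""
    hits = sum(pred_record.get(key) == value for key, value in gold_record.items())
    return hits / len(gold_record)


# Toy data for illustration only.
gold = ["yes", "no", "yes", "maybe"]
pred = ["yes", "no", "no", "maybe"]
print(round(macro_f1(gold, pred), 4))
print(field_exact_match({"pT": "pT2", "pN": "pN1"}, {"pT": "pT2", "pN": "pN0"}))
```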

## Limitations

- The model is specialized for pancreatic cancer and oncology-adjacent workflows rather than general medicine.
- Training data come from openly available sources rather than private institutional notes, which improves reproducibility but does not fully capture real-world documentation style.
- Benchmark sample sizes for several panels are deliberately limited and should be interpreted with care.
- Performance is uneven across task families and does not imply broad medical competence.

## Usage

This repository contains the main full-precision checkpoint files. A standard `transformers` loading pattern is:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Joesh1/onca-1.0-9B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
```

Inference formatting should follow the included tokenizer and chat template files in this repository.

### Quick Chat Helper

```python
# Relies on `torch`, `tokenizer`, and `model` from the loading snippet above.
def run_onca(prompt, system_prompt="You are Onca 1.0, a pancreatic-cancer clinical research assistant."):
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]
    # Render the conversation with the repository's chat template.
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Greedy decoding; `temperature` is omitted because it only
        # takes effect when do_sample=True.
        outputs = model.generate(
            **inputs,
            max_new_tokens=512,
            do_sample=False,
        )
    # Drop the prompt tokens and decode only the newly generated part.
    completion = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(completion, skip_special_tokens=True)
```

### Example 1: Trial Screening

```python
prompt = """
Task: Trial eligibility screening for pancreatic cancer.

Patient summary:
- 63-year-old with metastatic PDAC
- Liver metastases present
- ECOG 1
- Prior gemcitabine plus nab-paclitaxel
- Total bilirubin 0.9 mg/dL
- ANC 2.4
- Platelets 188
- No active infection
- No brain metastases

Trial criteria:
- Histologically confirmed metastatic pancreatic adenocarcinoma
- ECOG 0-1
- Progression after 1 prior systemic regimen
- Adequate marrow and hepatic function
- Exclude uncontrolled infection or CNS metastases

Return:
1. Eligibility label: eligible / ineligible / unclear
2. Criterion-by-criterion reasoning
3. Missing information, if any
"""

print(run_onca(prompt))
```
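
When screening outputs are collected in bulk, the eligibility label usually has to be pulled out of free text. The following is a minimal sketch, assuming the reply states one of the three labels requested in the prompt above. The function name and regular expression are illustrative and not part of this release, and the sample replies are made up.

```python
import re
from typing import Optional

# Word boundaries keep "eligible" from matching inside "ineligible".
_LABEL_RE = re.compile(r"\b(ineligible|eligible|unclear)\b", re.IGNORECASE)


def parse_eligibility_label(reply: str) -> Optional[str]:
    """Return the first eligibility label found in the reply, or None."""
    match = _LABEL_RE.search(reply)
    return match.group(1).lower() if match else None


# Made-up replies used only to exercise the parser.
print(parse_eligibility_label("1. Eligibility label: Ineligible\n2. ..."))
print(parse_eligibility_label("The patient appears eligible."))
print(parse_eligibility_label("No decision was stated."))
```

Because the first match wins, a reply that repeats the full option list before answering would be misread. Stricter parsing, such as requiring the label on its own line, may be needed in practice.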

### Example 2: Clinical Reasoning

```python
prompt = """
Task: Pancreatic cancer clinical reasoning.

Case:
A 58-year-old patient has borderline resectable PDAC in the pancreatic head.
CA19-9 is elevated. ECOG is 0. Germline testing is pending. No distant metastases
are seen on imaging.

Please provide:
1. A concise assessment
2. A high-level management plan
3. Key factors that could change the plan
4. Important limitations or uncertainties

Do not present this as medical advice. Keep it research-oriented.
"""

print(run_onca(prompt))
```

### Example 3: Pathology Extraction

```python
prompt = """
Task: Structured pathology extraction.

Extract the report into JSON with the following fields:
specimen_type, primary_site, histology, tumor_grade, tumor_size_cm,
margin_status, lymphovascular_invasion, perineural_invasion,
lymph_nodes_examined, lymph_nodes_positive, pT, pN, pM,
ajcc_stage, treatment_effect, tumor_focality, additional_findings

Report:
Whipple resection specimen showing moderately differentiated pancreatic ductal
adenocarcinoma, 3.1 cm, centered in the pancreatic head. Tumor extends into
peripancreatic soft tissue. All margins are negative; closest margin is 0.4 cm
at the uncinate margin. Perineural invasion is present. Lymphovascular invasion
is present. Sixteen lymph nodes examined, 3 positive for metastatic carcinoma.
Pathologic stage: pT2 pN1. No distant metastasis identified in specimen.
"""

print(run_onca(prompt))
```
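
Extraction replies often wrap the JSON in extra text, so a downstream pipeline typically needs to locate and validate the object before using it. The sketch below uses only the standard-library `json` module. The helper names and the sample reply are hypothetical, and the field list is copied from the extraction prompt above.

```python
import json

# Field names copied from the extraction prompt above.
EXPECTED_FIELDS = [
    "specimen_type", "primary_site", "histology", "tumor_grade", "tumor_size_cm",
    "margin_status", "lymphovascular_invasion", "perineural_invasion",
    "lymph_nodes_examined", "lymph_nodes_positive", "pT", "pN", "pM",
    "ajcc_stage", "treatment_effect", "tumor_focality", "additional_findings",
]


def extract_json_object(text):
    """Return the first parseable JSON object embedded in free text."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


def missing_fields(record, expected=EXPECTED_FIELDS):
    """List expected field names that are absent from the parsed record."""
    return [name for name in expected if name not in record]


# Made-up reply used only to exercise the helpers.
sample_reply = 'Result:\n{"histology": "PDAC", "pT": "pT2", "pN": "pN1"}\nEnd.'
record = extract_json_object(sample_reply)
print(missing_fields(record))
```

In use, `sample_reply` would be replaced by the string returned from `run_onca(prompt)`. Records with missing fields can then be flagged for review instead of being passed on silently.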

### Example 4: Variant Evidence Interpretation

```python
prompt = """
Task: Variant evidence reasoning for pancreatic cancer.

Variant:
- Gene: BRCA2
- Alteration: pathogenic loss-of-function variant
- Tumor type: pancreatic ductal adenocarcinoma

Return a JSON object with:
- gene
- alteration
- disease
- evidence_summary
- therapeutic_implication
- diagnostic_implication
- prognostic_implication
- evidence_direction
- confidence

Keep the answer concise and note uncertainty when evidence is incomplete.
"""

print(run_onca(prompt))
```

### Prompting Tips

- Ask for a specific output format such as bullet points or JSON.
- For extraction tasks, list the exact fields you want returned.
- For screening tasks, provide both the patient summary and the trial criteria.
- For reasoning tasks, request uncertainties and missing data explicitly.
- Treat outputs as research artifacts that require expert review.
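
The tip about listing exact fields can be applied programmatically when the same extraction runs over many reports. The builder below is a hypothetical sketch. Its name, its layout, and the "use null" instruction are assumptions for illustration and are not a documented prompt format for this model.

```python
def build_extraction_prompt(report, fields):
    """Assemble an extraction prompt that names every requested field."""
    if not fields:
        raise ValueError("at least one field is required")
    field_list = ", ".join(fields)
    return (
        "Task: Structured pathology extraction.\n"
        "Extract the report into JSON with the following fields:\n"
        f"{field_list}\n"
        # Assumption: asking for null on absent fields is a suggestion
        # made here, not part of the original examples.
        "Use null for any field the report does not state.\n"
        f"Report:\n{report.strip()}\n"
    )


# Made-up report fragment for illustration.
print(build_extraction_prompt("  Tumor measures 3.1 cm.  ", ["histology", "tumor_size_cm"]))
```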

## Files in This Repository

- `model-00001-of-00004.safetensors` through `model-00004-of-00004.safetensors`: sharded model weights
- `model.safetensors.index.json`: shard index
- `config.json`: model architecture configuration
- `generation_config.json`: default generation settings
- `tokenizer.json` and `tokenizer_config.json`: tokenizer files
- `chat_template.jinja`: chat formatting template
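
The shard index is a small JSON file that maps each tensor name to the shard holding it, alongside a `metadata.total_size` value in bytes. The sketch below shows one way to summarize such an index. The tensor names and sizes in the toy data are made up, and the helper name is illustrative.

```python
import json
from collections import Counter


def tensors_per_shard(index):
    """Count how many tensors the index assigns to each shard file."""
    return dict(Counter(index["weight_map"].values()))


# Toy index in the standard sharded-checkpoint layout; the values are made up.
toy_index = json.loads("""
{
  "metadata": {"total_size": 1024},
  "weight_map": {
    "layer.0.weight": "model-00001-of-00004.safetensors",
    "layer.1.weight": "model-00001-of-00004.safetensors",
    "layer.2.weight": "model-00002-of-00004.safetensors"
  }
}
""")

print(tensors_per_shard(toy_index))
print(toy_index["metadata"]["total_size"], "bytes")
```

For the real file, the toy data can be replaced with the parsed contents of a downloaded `model.safetensors.index.json`.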

## Related Variants

Quantized releases are provided separately:

- `JosephKBS/onca-1.0-9B-Int8`
- `JosephKBS/onca-1.0-9B-Int4`

## License

This release is provided under the Apache 2.0 license. Users should also review the license and usage terms of the upstream base model and any referenced datasets or benchmarks.

## Citation

If you use Onca 1.0, please cite the accompanying manuscript once it is publicly available. Until then, a temporary reference is:

```bibtex
@misc{shim2026onca,
  title = {Onca: An Open 9B Language Model for Pancreatic Cancer Clinical Tasks},
  author = {Shim, Kwan Bo},
  year = {2026},
  note = {Preprint in preparation}
}
```

## Acknowledgments

This project builds on the work of the Qwen and Qwopus model developers, as well as the many institutions and open-data contributors who created and maintained the public datasets used in training and evaluation.