# HIKARI-Polaris-8B-SkinDx-Oracle

*Healthcare-oriented Intelligent Knowledge Augmented Retrieval and Inference*

Named after Polaris, the fixed North Star, used as an immovable reference point.
## 📦 Model Type: Merged Full Model

This is a fully merged model: the LoRA adapter weights have been merged directly into the base model weights.
✅ No adapter loading needed. Load directly with `transformers`, vLLM, or SGLang.

💾 Size: ~17 GB (4 safetensors shards)
## ⚠️ Research Model, Not for Production
This model requires ground-truth disease group labels at inference time (oracle setting). It is an ablation study model from the HIKARI M-series experiments, designed to measure the theoretical upper bound of the group-context injection approach.
For production use, see:
- **E27085921/HIKARI-Sirius-8B-SkinDx-RAG**: 85.86% accuracy, no oracle required
## Overview
HIKARI-Polaris is the M1 GT-Oracle model from our M-series ablation study. It is trained with ground-truth disease group labels injected into both the training prompts and the inference prompts. This tells us: "if we perfectly know the group at inference time, what is the maximum accuracy achievable by the group-injection approach?"
| Property | Value |
|---|---|
| Task | 10-class skin disease diagnosis (M1 oracle ablation) |
| Base model | Qwen/Qwen3-VL-8B-Thinking |
| Context at training | GT group label (oracle) |
| Context at inference | GT group label (oracle) |
| Val accuracy (full benchmark) | 59.38% |
| Val accuracy (ablation oracle) | 66.0% |
| Model type | Merged full model |
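To make the oracle setting concrete, here is a minimal sketch of how a ground-truth group label could be injected into a diagnosis prompt. The template wording and group name below are illustrative assumptions, not the exact HIKARI training prompts:

```python
# Sketch of oracle group-context injection (illustrative only; the real
# HIKARI prompt template and group taxonomy may differ).
def build_oracle_prompt(gt_group: str) -> str:
    """Inject the ground-truth disease group label into the diagnosis prompt."""
    return (
        f"This skin lesion belongs to the group '{gt_group}'. "
        "Based on its visual features, what is the specific skin disease?"
    )

# In the M1 oracle setting, the same GT label is supplied at both
# training time and inference time.
prompt = build_oracle_prompt("inflammatory")
print(prompt)
```

The key point of the M1 setting is that `gt_group` comes from the ground-truth annotation, which is why this model cannot be deployed without labels.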
## M-Series Ablation Summary
| Model | Train Context | Infer Context | Accuracy |
|---|---|---|---|
| M0 | None | None | 57.4% |
| M1 / Polaris (this model) | GT group | GT group | 66.0% (oracle) / 59.38% (full) |
| M2 | GT group | None | 57.4% |
| M3 | Predicted group | Predicted group | 61.0% |
| M4 | Predicted group | None | 61.0% |
| HIKARI-Sirius (RAG-in-Training) | RAG references | RAG references | 85.86% |
**Key finding:** Even perfect GT-oracle group injection (M1: 66%) is far below RAG-in-Training (Sirius: 85.86%). Group labels alone are insufficient: the model needs visual reference examples to differentiate fine-grained diseases within a group.
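The gap can be quantified directly from the table above: oracle group injection buys roughly 8.6 points over the no-context baseline, while RAG-in-Training buys roughly 28.5 points:

```python
# Accuracy figures from the M-series ablation table (percent).
m0_baseline = 57.4   # M0: no context at train or inference
m1_oracle   = 66.0   # M1: GT group at train and inference (oracle subset)
sirius_rag  = 85.86  # Sirius: RAG references at train and inference

oracle_gain = m1_oracle - m0_baseline
rag_gain    = sirius_rag - m0_baseline
print(f"oracle gain: {oracle_gain:+.2f} pts, RAG gain: {rag_gain:+.2f} pts")
```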
## Quick Inference with `transformers`
```python
from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
import torch
from PIL import Image

model_id = "E27085921/HIKARI-Polaris-8B-SkinDx-Oracle"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = Qwen3VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

image = Image.open("skin_lesion.jpg").convert("RGB")

# ⚠️ Requires the GROUND-TRUTH group label (research evaluation only)
gt_group = "inflammatory"
PROMPT = (
    "This skin lesion belongs to the group '{group}'. "
    "Examine the lesion morphology (papules, plaques, macules), "
    "color (red, violet, white, brown), scale/crust, border sharpness, "
    "and distribution pattern. Based on these visual features, "
    "what is the specific skin disease?"
)

messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": PROMPT.format(group=gt_group)},
]}]

text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

with torch.no_grad():
    # Greedy decoding for deterministic evaluation (temperature is ignored
    # when do_sample=False, so it is omitted here).
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)

print(processor.batch_decode(out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)[0].strip())
```
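The model's answer is free text, so evaluation code typically maps it back onto the fixed label set. A minimal sketch of such a parser follows; the disease names below are hypothetical placeholders, as the actual 10-class label set is defined by the HIKARI benchmark:

```python
# Map free-text model output to a canonical label via case-insensitive
# substring matching. NOTE: these labels are illustrative placeholders;
# substitute the benchmark's real 10 disease classes.
LABELS = ["psoriasis", "lichen planus", "eczema"]

def parse_diagnosis(generated_text: str, labels=LABELS):
    """Return the first known label found in the model's answer, else None."""
    text = generated_text.lower()
    for label in labels:
        if label in text:
            return label
    return None

matched = parse_diagnosis("The specific skin disease is Psoriasis.")
print(matched)
```

Substring matching is deliberately simple; a real evaluation harness might instead constrain decoding or score each candidate label.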
## Citation
```bibtex
@misc{hikari2026,
  title       = {HIKARI: RAG-in-Training for Skin Disease Diagnosis
                 with Cascaded Vision-Language Models},
  author      = {Watin Promfiy and Pawitra Boonprasart},
  year        = {2026},
  institution = {King Mongkut's Institute of Technology Ladkrabang,
                 Department of Information Technology, Bangkok, Thailand}
}
```
Made with ❤️ at King Mongkut's Institute of Technology Ladkrabang (KMITL)