Pentabrid V12 — Medical Foundation Model (14B)

Built in the UAE. Designed for clinics, not clouds.

Pentabrid V12 is a 14B-parameter medical AI model that outperforms Med42-v1 (70B) on 8 of the 9 clinical benchmarks reported below. It runs on a single GPU, fully offline, with no patient data leaving the facility.

Benchmark Results

| Benchmark | Pentabrid V12 (14B) | Med42-v1 (70B) | GPT-4 (~1.8T) | Med-PaLM 2 (540B) |
|---|---|---|---|---|
| MedQA (USMLE) | 70.0% | 61.5% | 78.9% | 79.7% |
| MedMCQA | 61.7% | 60.9% | 69.5% | 71.3% |
| PubMedQA (CoT) | 72.1% | 75.0% | 75.0% | |
| MMLU Clinical Knowledge | 90.2% ★⬆ | 74.3% | 86.0% | 88.3% |
| MMLU Professional Medicine | 88.6% | 79.8% | 93.0% | 95.2% |
| MMLU College Medicine | 84.4% ★⬆ | 68.8% | 76.9% | 80.9% |
| MMLU College Biology | 91.7% | 84.0% | 95.1% | 94.4% |
| MMLU Medical Genetics | 90.0% | 86.0% | 91.0% | 90.0% |
| MMLU Anatomy | 83.0% ★⬆ | 67.4% | 80.0% | 77.8% |
| MMLU Medical Average | 88.0% | 76.7% | 87.0% | 87.8% |

★ = Beats Med42-v1 (70B) · ⬆ = Also beats GPT-4
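The MMLU Medical Average row appears to be the unweighted mean of the six MMLU subtask rows. The card does not state this; it is an inference from the numbers. A minimal sketch that recomputes the averages from the table values:

```python
# Scores copied from the six MMLU medical rows of the benchmark table, in order:
# Clinical Knowledge, Professional Medicine, College Medicine,
# College Biology, Medical Genetics, Anatomy.
MMLU_MEDICAL = {
    "Pentabrid V12 (14B)": [90.2, 88.6, 84.4, 91.7, 90.0, 83.0],
    "Med42-v1 (70B)": [74.3, 79.8, 68.8, 84.0, 86.0, 67.4],
    "GPT-4": [86.0, 93.0, 76.9, 95.1, 91.0, 80.0],
    "Med-PaLM 2": [88.3, 95.2, 80.9, 94.4, 90.0, 77.8],
}


def macro_average(scores):
    """Unweighted mean of the subtask scores, rounded to one decimal place."""
    return round(sum(scores) / len(scores), 1)


averages = {model: macro_average(s) for model, s in MMLU_MEDICAL.items()}
for model, avg in averages.items():
    print(f"{model}: {avg}%")
```

The recomputed values match the reported row (88.0, 76.7, 87.0, 87.8), which suggests each subtask is weighted equally regardless of its question count.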

Clinical Safety — 96%

| Category | Score |
|---|---|
| Overall Safety Rate | 96.0% (96/100) |
| Red Flag Detection | 100% |
| Emergency Recognition | 100% |
| Misinformation Rejection | 100% |
| Boundary & Ethics | 100% |
| Scope of Practice | 100% |
| Drug Interactions | 77.8% |
| Contraindications | 71.4% |

Evaluated on a custom set of 100 clinical safety scenarios (MedSafetyBench) spanning the categories listed above, scored with regex-based unsafe-pattern detection.
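The card reports only percentages per category, not scenario counts. As a consistency check, the sketch below searches for pass/total counts that reproduce 77.8% and 71.4% while accounting for exactly the 4 failures implied by 96/100. It assumes the categories scored at 100% contribute no failures; the counts it finds are an inference, not figures from the card.

```python
def candidates(pct, max_total=100):
    """All (passed, total) pairs whose percentage rounds to `pct` at one decimal."""
    return [
        (passed, total)
        for total in range(1, max_total + 1)
        for passed in range(total + 1)
        if round(100 * passed / total, 1) == pct
    ]


TOTAL_SCENARIOS, TOTAL_PASSED = 100, 96
failures = TOTAL_SCENARIOS - TOTAL_PASSED  # 4 unsafe responses overall

# Keep only the pairings whose combined failures equal the overall failure count.
solutions = [
    (drug, contra)
    for drug in candidates(77.8)    # Drug Interactions
    for contra in candidates(71.4)  # Contraindications
    if (drug[1] - drug[0]) + (contra[1] - contra[0]) == failures
]
print(solutions)  # [((7, 9), (5, 7))]
```

Under these assumptions the only consistent reading is 7 of 9 drug-interaction scenarios and 5 of 7 contraindication scenarios passed, so both weak scores rest on small samples.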

Efficiency Comparison

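| Model | Parameters | GPU Required | Offline | Single GPU | MMLU Med Avg |
|---|---|---|---|---|---|
| Pentabrid V12 | 14B | 1× RTX 5090 | Yes | Yes | 88.0% |
| Med42-v1 | 70B | 4× A100 | | No | 76.7% |
| GPT-4 | ~1.8T | Cluster | | No | 87.0% |
| Med-PaLM 2 | 540B | TPU Pod | | No | 87.8% |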
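A rough way to sanity-check the single-GPU claim is to estimate the memory taken by the weights alone. The sketch below uses the 15.28B parameter count and BF16 precision reported under Training Details, and assumes the RTX 5090's published 32 GB of VRAM; it ignores activations and the KV cache, which need additional memory.

```python
PARAMS = 15.28e9        # parameter count reported under Training Details
BYTES_PER_PARAM = 2     # BF16 stores each parameter in 2 bytes
RTX_5090_VRAM_GB = 32   # assumption: published card spec, decimal gigabytes

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
headroom_gb = RTX_5090_VRAM_GB - weights_gb

print(f"weights:  {weights_gb:.2f} GB")
print(f"headroom: {headroom_gb:.2f} GB")
```

By this estimate the unquantized weights fit with little room to spare, so context length and batch size would be tightly constrained unless the deployment quantizes the weights. The card does not say which configuration was used.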

Training Details

  • Base model: Qwen3-14B (15.28B parameters)
  • Method: LoRA (r=128, alpha=256), BFloat16 precision
  • Dataset: MIAD-SAIF — 182,654 curriculum-weighted medical examples
  • Sources: MedReason, USMLE reasoning, Davidson's Medicine, Schwartz's Surgery, Katzung's Pharmacology, clinical guidelines, MedMCQA, UWorld
  • Final loss: 0.363
  • Framework: Unsloth 2026.2.1 on NVIDIA A100 80GB
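
To make the LoRA settings concrete, the sketch below computes the update scaling and the adapter size for a single weight matrix. It assumes the standard LoRA scaling of alpha / r (not the rank-stabilized variant), and the 5120 × 5120 matrix shape is a hypothetical illustration; the card does not list which layers were adapted.

```python
R, ALPHA = 128, 256  # LoRA rank and alpha from the training details


def lora_scaling(r, alpha):
    """Standard LoRA scales the low-rank update B @ A by alpha / r."""
    return alpha / r


def lora_param_count(d_in, d_out, r):
    """Trainable parameters for one adapted matrix: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


D_IN = D_OUT = 5120  # hypothetical projection shape, for illustration only

adapter_params = lora_param_count(D_IN, D_OUT, R)
full_params = D_IN * D_OUT

print(f"scaling: {lora_scaling(R, ALPHA)}")
print(f"adapter params: {adapter_params:,} "
      f"({100 * adapter_params / full_params:.1f}% of the full matrix)")
```

With r=128 and alpha=256 the update is scaled by 2.0, and for a matrix of this assumed shape the adapter trains about 5% as many parameters as full fine-tuning would.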

Evaluation Methodology

  • Knowledge benchmarks: EleutherAI lm-evaluation-harness (lm-eval) v0.4+, zero-shot, log-likelihood scoring
  • PubMedQA: Chain-of-thought reasoning with automated answer extraction (1,000 samples)
  • Safety: Custom 100-scenario MedSafetyBench with regex-based unsafe pattern detection
  • Competitor sources: Med42 — arXiv:2408.06142; GPT-4 — Nori et al. (2023); Med-PaLM 2 — Singhal et al. (2023)
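
The two scoring styles above can be sketched as follows. This is an illustration of the general techniques, not the project's evaluation code: the function names are hypothetical, and the rule of taking the last yes/no/maybe in the completion is an assumed extraction strategy.

```python
import re


def pick_by_loglikelihood(option_logprobs):
    """Multiple-choice scoring: choose the option the model finds most likely.

    `option_logprobs` maps each answer label to the summed log-probability
    the model assigns to that answer given the question.
    """
    return max(option_logprobs, key=option_logprobs.get)


_ANSWER = re.compile(r"\b(yes|no|maybe)\b", re.IGNORECASE)


def extract_pubmedqa_answer(completion):
    """Chain-of-thought scoring: take the last yes/no/maybe in the completion.

    Returns None when no label is found, so the sample can be counted as wrong.
    """
    matches = _ANSWER.findall(completion)
    return matches[-1].lower() if matches else None


print(pick_by_loglikelihood({"A": -4.2, "B": -1.3, "C": -3.8, "D": -2.9}))
print(extract_pubmedqa_answer("The trial showed no benefit overall. Final answer: Maybe."))
```

Log-likelihood scoring never asks the model to generate text, so it cannot fail on formatting. Chain-of-thought extraction can, which is one reason the two kinds of results are not directly comparable.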

Intended Use

This model is Layer 0 (medical knowledge foundation) of the SAIF Sense Medical AI system, designed for:

  • Automated cardiometabolic risk surveillance
  • Clinical decision support in primary care
  • Medical coding validation (with upcoming V10 ICD-10 layer)
  • Digital nutritional phenotyping (FDII module)

Target deployment: Offline, single-GPU systems in UAE healthcare facilities under PDPL 2023 compliance.
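
For an offline deployment, the weights have to be loaded without any network access. A minimal sketch using Hugging Face Transformers, assuming the repository (ID taken from the citation URL) holds full merged weights that were downloaded to the local cache beforehand; the `load_offline` helper is hypothetical.

```python
MODEL_ID = "Clinical-Reasoning-Hub/Diagnostic-Reasoning-Q3X14B1"


def load_offline(model_id=MODEL_ID, device="cuda"):
    """Load tokenizer and model strictly from the local cache.

    `local_files_only=True` makes Transformers raise an error instead of
    contacting the Hub when a file is missing locally.
    """
    # Imported here so the module can be inspected without the GPU stack installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, local_files_only=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 weights
        local_files_only=True,
    ).to(device)
    return tokenizer, model
```

Calling `tokenizer, model = load_offline()` on a machine with the cached files and a suitable GPU returns objects ready for `model.generate`. On a machine without the cache it fails fast and sends no request.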

Limitations

  • This is a research model, not a certified medical device
  • Not intended for autonomous clinical decision-making
  • Drug interaction and contraindication detection are the weakest safety categories (77.8% and 71.4%) and need improvement (addressed in V10)
  • Evaluated primarily on English-language benchmarks
  • Requires clinical validation before deployment

Citation

@misc{pentabrid-v12-2026,
  title={Pentabrid V12: A 14B Medical Foundation Model for Offline Clinical Deployment},
  author={SAIF Sense Medical AI},
  year={2026},
  publisher={Clinical-Reasoning-Hub},
  url={https://huggingface.co/Clinical-Reasoning-Hub/Diagnostic-Reasoning-Q3X14B1}
}