Submission Inquiry — 8B Clinical Reasoning Model (76.4% avg across 7 benchmarks)
Hi M42 AI team 👋
I'm Dr Adnan Agha from UAEU (College of Medicine, Al Ain). We've just released an 8B clinical reasoning model that I think might be of interest for the MEDIC Benchmark.
Quick numbers (all zero-shot, lm-eval-harness v0.4.11):
MedQA (USMLE): 66.3%
Professional Medicine: 89.7%
Clinical Knowledge: 86.4%
Medical Genetics: 88.0%
Anatomy: 79.3%
PubMedQA: 66.6%
MedMCQA: 58.6%
Average: 76.4% (+24.2pp over Qwen3-8B base)
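(For anyone cross-checking the reported figure, the average of the seven per-benchmark scores above does come out to 76.4%; a minimal sketch of that arithmetic:)

```python
# Sanity check of the reported 76.4% average over the seven listed benchmarks.
# Scores are copied verbatim from the message above.
scores = {
    "MedQA (USMLE)": 66.3,
    "Professional Medicine": 89.7,
    "Clinical Knowledge": 86.4,
    "Medical Genetics": 88.0,
    "Anatomy": 79.3,
    "PubMedQA": 66.6,
    "MedMCQA": 58.6,
}

average = sum(scores.values()) / len(scores)
print(round(average, 1))  # -> 76.4
```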
The key difference from other medical fine-tunes is the training methodology — we use evidence-based Bayesian clinical reasoning with likelihood ratios rather than standard instruction tuning. The model reasons through differentials the way a clinician would, using tags for transparent step-by-step analysis.
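(For readers unfamiliar with likelihood-ratio reasoning: the core idea is to convert a pre-test probability to odds, multiply by the test's likelihood ratio, and convert back. The sketch below is a generic illustration of that standard Bayesian update, not the team's actual training code; the function name and example values are invented for illustration.)

```python
# Illustrative sketch of Bayesian diagnostic updating with a likelihood ratio:
#   post-test odds = pre-test odds x LR
# This is a textbook formula, not code from the model's training pipeline.

def update_probability(pretest_prob: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability via the odds form."""
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)

# Example: a 10% pre-test probability and a positive finding with LR+ = 9
# gives odds of 1/9 * 9 = 1, i.e. a 50% post-test probability.
print(round(update_probability(0.10, 9.0), 2))  # -> 0.5
```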
The model is here: Clinical-Reasoning-Hub/Diagnostic-Reasoning-QW3X2 (gated, IP-protected via UAEU, but happy to provide access for benchmarking).
Would love to:
Submit the model for evaluation on MEDIC if you're accepting entries
Connect with the team — we're also based in the UAE, so practically neighbors
Always great to see fellow UAE teams pushing the boundaries of medical AI. Happy to share more details or jump on a call.
Cheers,
Adnan
Hi Dr. Adnan,
We would be happy to submit your model to MEDIC. Once I have access, I'll run it through most of the tasks that we have available.
We are definitely open to collaboration on medical AI. Feel free to send me an email at cchristophe@m42.ae
Kind Regards,
Clément
Many thanks for your email.
I have granted access.
Do let me know when the results come through.
Good morning
Hope you are well
Just wanted to see if any benchmark results are back.