MedGemma 1.5 GGUF (Optimized for Intel Mac & Metal)

Project Overview

This model is a quantized version of MedGemma 1.5, developed as a core component of the MedGemma Local Triage system for the Kaggle MedGemma Impact Challenge.

The primary goal is to provide a high-performance, 100% offline clinical decision support tool capable of running on consumer-grade hardware (specifically Intel-based Macs) in infrastructure-less environments.

Quantization Details

  • Method: Quantized with llama.cpp
  • Format: GGUF
  • Quantization Level: Q4_K_M (4-bit; chosen as a balance between clinical accuracy and memory footprint)
  • Model Size: 4B parameters (gemma3 architecture)
  • Target Hardware: Intel Macs with 16GB+ RAM and an AMD GPU (Metal enabled)
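
For reference, the conversion and quantization steps can be reproduced with llama.cpp's standard tools. The sketch below drives them from Python; it assumes a local llama.cpp checkout and the original MedGemma weights in ./medgemma-1.5 (all paths and filenames here are illustrative, not part of this release):

    # Hypothetical reproduction of the quantization pipeline; paths are examples.
    import subprocess

    # 1. Convert the Hugging Face checkpoint to a full-precision GGUF file.
    subprocess.run(
        ["python", "convert_hf_to_gguf.py", "./medgemma-1.5",
         "--outfile", "medgemma-1.5-f16.gguf", "--outtype", "f16"],
        check=True,
    )

    # 2. Quantize the f16 GGUF down to Q4_K_M with llama-quantize.
    subprocess.run(
        ["./llama-quantize", "medgemma-1.5-f16.gguf",
         "medgemma-1.5-q4_k_m.gguf", "Q4_K_M"],
        check=True,
    )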

Performance on Edge Hardware

In local tests on an Intel Mac (Core i9, AMD Radeon Pro 5500M with 4GB VRAM):

  • Inference Speed: ~15 tokens/sec
  • Latency: Sub-second time to first token on clinical triage prompts.
  • Memory Footprint: Fits comfortably within a 16GB RAM budget.
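
These numbers can be sanity-checked locally. Below is a minimal throughput sketch using the llama-cpp-python bindings (built with Metal support); the model filename and prompt are assumptions, not fixed parts of this release:

    # Rough tokens/sec measurement with llama-cpp-python.
    import time
    from llama_cpp import Llama

    llm = Llama(
        model_path="medgemma-1.5-q4_k_m.gguf",  # assumed local filename
        n_gpu_layers=-1,   # offload all layers to the Metal GPU
        n_ctx=4096,
        verbose=False,
    )

    prompt = "Patient presents with fever, tachycardia, and hypotension. Triage priority?"
    start = time.perf_counter()
    out = llm(prompt, max_tokens=128)
    elapsed = time.perf_counter() - start

    n_gen = out["usage"]["completion_tokens"]
    print(f"{n_gen} tokens in {elapsed:.1f}s -> {n_gen / elapsed:.1f} tok/s")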

Intended Use

  • Clinical Triage Assistance: Helping frontline workers prioritize patients based on MSF (Médecins Sans Frontières) and US Army protocols.
  • Offline Medical RAG: Serving as the reasoning engine over local vector databases such as ChromaDB (see the sketch after this list).
  • Privacy-First Applications: Scenarios where patient data residency is strictly local.
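
A minimal sketch of the offline RAG loop, assuming a pre-populated on-disk ChromaDB collection named "protocols" and the llama-cpp-python bindings; the collection name, query, and prompt template are all illustrative:

    # Illustrative offline RAG loop: ChromaDB retrieval + local GGUF generation.
    import chromadb
    from llama_cpp import Llama

    # Local, on-disk vector store (no network access required).
    client = chromadb.PersistentClient(path="./medical_db")   # assumed path
    collection = client.get_or_create_collection("protocols") # assumed name

    question = "What is the triage priority for penetrating abdominal trauma?"
    # Retrieval uses the collection's default embedding function.
    hits = collection.query(query_texts=[question], n_results=3)
    context = "\n".join(hits["documents"][0])

    llm = Llama(
        model_path="medgemma-1.5-q4_k_m.gguf",
        n_gpu_layers=-1,  # offload to Metal
        n_ctx=4096,
        verbose=False,
    )

    prompt = (
        "Use only the protocol excerpts below to answer.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    answer = llm(prompt, max_tokens=256)["choices"][0]["text"]
    print(answer)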

How to Use

  1. Download the .gguf file.
  2. Use with llama.cpp or any GGUF-compatible runner.
  3. For Metal acceleration on a Mac, offload all layers to the GPU with -ngl 99:
    ./llama-cli -m medgemma-1.5-q4_k_m.gguf -ngl 99 --color -p "Patient presents with..."
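
The model can also be driven from Python via llama-cpp-python's chat API. This is a minimal sketch; the system prompt and generation settings are illustrative, not part of this release:

    # Chat-style usage via llama-cpp-python; n_gpu_layers=-1 mirrors -ngl 99.
    from llama_cpp import Llama

    llm = Llama(
        model_path="medgemma-1.5-q4_k_m.gguf",
        n_gpu_layers=-1,  # offload every layer to the Metal GPU
        n_ctx=4096,
    )

    resp = llm.create_chat_completion(
        messages=[
            # Illustrative system prompt, not shipped with the model.
            {"role": "system", "content": "You are a clinical triage assistant."},
            {"role": "user", "content": "Patient presents with ..."},
        ],
        max_tokens=256,
    )
    print(resp["choices"][0]["message"]["content"])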
    

Disclaimers

IMPORTANT: This model is for research and educational purposes only as part of the MedGemma Impact Challenge. It is not intended for final clinical diagnosis or as a substitute for professional medical advice, diagnosis, or treatment. Always consult with a qualified healthcare provider.

Acknowledgements

  • Base Model: Google MedGemma 1.5
  • Competition: The MedGemma Impact Challenge (Kaggle)
  • Developer: Edmond Song (Stardust Kei)