MedGemma 1.5 GGUF (Optimized for Intel Mac & Metal)

Project Overview

This model is a quantized version of MedGemma 1.5, developed as a core component of the MedGemma Local Triage system for the Kaggle MedGemma Impact Challenge.

The primary goal is to provide a high-performance, 100% offline clinical decision support tool capable of running on consumer-grade hardware (specifically Intel-based Macs) in infrastructure-less environments.

Quantization Details

  • Method: Quantized with llama.cpp
  • Format: GGUF
  • Quantization Level: Q4_K_M (4-bit; chosen as a balance between clinical accuracy and memory footprint)
  • Model Size: 4B parameters (gemma3 architecture)
  • Target Hardware: Intel Macs with 16GB+ RAM and an AMD GPU (Metal enabled)
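
For reference, the conversion and quantization steps can be reproduced with llama.cpp's standard tools. The sketch below drives them from Python; it assumes a local llama.cpp checkout and the original MedGemma weights in ./medgemma-1.5 (all paths and filenames here are illustrative, not part of this release):

    # Hypothetical reproduction of the quantization pipeline; paths are examples.
    import subprocess

    # 1. Convert the Hugging Face checkpoint to a full-precision GGUF file.
    subprocess.run(
        ["python", "convert_hf_to_gguf.py", "./medgemma-1.5",
         "--outfile", "medgemma-1.5-f16.gguf", "--outtype", "f16"],
        check=True,
    )

    # 2. Quantize the f16 GGUF down to Q4_K_M with llama-quantize.
    subprocess.run(
        ["./llama-quantize", "medgemma-1.5-f16.gguf",
         "medgemma-1.5-q4_k_m.gguf", "Q4_K_M"],
        check=True,
    )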

Performance on Edge Hardware

In local tests on an Intel Mac (Core i9, AMD Radeon Pro 5500M with 4GB VRAM):

  • Inference Speed: ~15 tokens/sec
  • Latency: Sub-second time to first token on clinical triage prompts.
  • Memory Footprint: Fits comfortably within a 16GB RAM budget.
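
These numbers can be sanity-checked locally. Below is a minimal throughput sketch using the llama-cpp-python bindings (built with Metal support); the model filename and prompt are assumptions, not fixed parts of this release:

    # Rough tokens/sec measurement with llama-cpp-python.
    import time
    from llama_cpp import Llama

    llm = Llama(
        model_path="medgemma-1.5-q4_k_m.gguf",  # assumed local filename
        n_gpu_layers=-1,   # offload all layers to the Metal GPU
        n_ctx=4096,
        verbose=False,
    )

    prompt = "Patient presents with fever, tachycardia, and hypotension. Triage priority?"
    start = time.perf_counter()
    out = llm(prompt, max_tokens=128)
    elapsed = time.perf_counter() - start

    n_gen = out["usage"]["completion_tokens"]
    print(f"{n_gen} tokens in {elapsed:.1f}s -> {n_gen / elapsed:.1f} tok/s")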

Intended Use

  • Clinical Triage Assistance: Helping frontline workers prioritize patients based on MSF (Médecins Sans Frontières) and US Army protocols.
  • Offline Medical RAG: Serving as the reasoning engine over local vector databases such as ChromaDB (see the sketch after this list).
  • Privacy-First Applications: Scenarios where patient data residency is strictly local.
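
A minimal sketch of the offline RAG loop, assuming a pre-populated on-disk ChromaDB collection named "protocols" and the llama-cpp-python bindings; the collection name, query, and prompt template are all illustrative:

    # Illustrative offline RAG loop: ChromaDB retrieval + local GGUF generation.
    import chromadb
    from llama_cpp import Llama

    # Local, on-disk vector store (no network access required).
    client = chromadb.PersistentClient(path="./medical_db")   # assumed path
    collection = client.get_or_create_collection("protocols") # assumed name

    question = "What is the triage priority for penetrating abdominal trauma?"
    # Retrieval uses the collection's default embedding function.
    hits = collection.query(query_texts=[question], n_results=3)
    context = "\n".join(hits["documents"][0])

    llm = Llama(
        model_path="medgemma-1.5-q4_k_m.gguf",
        n_gpu_layers=-1,  # offload to Metal
        n_ctx=4096,
        verbose=False,
    )

    prompt = (
        "Use only the protocol excerpts below to answer.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    answer = llm(prompt, max_tokens=256)["choices"][0]["text"]
    print(answer)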

How to Use

  1. Download the .gguf file.
  2. Use with llama.cpp or any GGUF-compatible runner.
  3. For Metal acceleration on a Mac, offload all layers to the GPU with -ngl 99:
    ./llama-cli -m medgemma-1.5-q4_k_m.gguf -ngl 99 --color -p "Patient presents with..."
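
The model can also be driven from Python via llama-cpp-python's chat API. This is a minimal sketch; the system prompt and generation settings are illustrative, not part of this release:

    # Chat-style usage via llama-cpp-python; n_gpu_layers=-1 mirrors -ngl 99.
    from llama_cpp import Llama

    llm = Llama(
        model_path="medgemma-1.5-q4_k_m.gguf",
        n_gpu_layers=-1,  # offload every layer to the Metal GPU
        n_ctx=4096,
    )

    resp = llm.create_chat_completion(
        messages=[
            # Illustrative system prompt, not shipped with the model.
            {"role": "system", "content": "You are a clinical triage assistant."},
            {"role": "user", "content": "Patient presents with ..."},
        ],
        max_tokens=256,
    )
    print(resp["choices"][0]["message"]["content"])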
    

Disclaimers

IMPORTANT: This model is for research and educational purposes only as part of the MedGemma Impact Challenge. It is not intended for final clinical diagnosis or as a substitute for professional medical advice, diagnosis, or treatment. Always consult with a qualified healthcare provider.

Acknowledgements

  • Base Model: Google MedGemma 1.5
  • Competition: The MedGemma Impact Challenge (Kaggle)
  • Developer: Edmond Song (Stardust Kei)