How to run MedGemma 1.5 in Google Colab
by FivePoints - opened
If this code stops working, check whether a newer koboldcpp release is available at https://github.com/LostRuins/koboldcpp/releases
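Rather than checking the releases page by hand, you can query GitHub's public REST API for the latest release tag. A minimal sketch (the `tag_name` field is part of GitHub's standard release JSON; the `extract_tag` helper is just for illustration):

```python
import json
import urllib.request

def extract_tag(release_json):
    # "tag_name" is a standard field in GitHub's release JSON.
    return release_json.get("tag_name", "")

def latest_release_tag(owner="LostRuins", repo="koboldcpp"):
    # GitHub's public "latest release" endpoint; no auth needed for light use.
    url = f"https://api.github.com/repos/{owner}/{repo}/releases/latest"
    with urllib.request.urlopen(url) as resp:
        return extract_tag(json.load(resp))

# latest_release_tag()  # e.g. "v1.107.3" or newer
```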
# https://colab.research.google.com/
# Step 0: Set the Google Colab runtime to "T4 GPU"
## Click "Runtime" in the Colab menu,
## click "Change runtime type",
## select "T4 GPU", and click "Save".
# Resources:
## https://research.google/blog/next-generation-medical-image-interpretation-with-medgemma-15-and-medical-speech-to-text-with-medasr/
## https://huggingface.co/unsloth/medgemma-1.5-4b-it-GGUF
## https://github.com/LostRuins/koboldcpp/releases
## https://github.com/LostRuins/koboldcpp/wiki
# Step 1: Check GPU and Download KoboldCPP
print("Checking GPU...")
!nvidia-smi
print("\nDownloading KoboldCPP Linux Binary...")
!wget https://github.com/LostRuins/koboldcpp/releases/download/v1.107.3/koboldcpp-linux-x64 -O koboldcpp
# Make the file executable
!chmod +x koboldcpp
print("\nStep 1 Complete: KoboldCPP engine is ready.")
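If `chmod` silently failed (or you re-run the notebook after a restart), a quick sanity check avoids a confusing "permission denied" later. A small sketch, using only the standard library; `ensure_executable` is a hypothetical helper name:

```python
import os
import stat

def ensure_executable(path):
    """Return True if path exists and is executable; add the user exec bit if missing."""
    if not os.path.exists(path):
        return False
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return os.access(path, os.X_OK)

# ensure_executable("koboldcpp")  # should be True after the chmod above
```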
# Step 2: Download MedGemma 1.5 4B (Q8_0) and Vision Projector
print("Downloading MedGemma 1.5 4B IT (Q8_0)...")
model_url = "https://huggingface.co/unsloth/medgemma-1.5-4b-it-GGUF/resolve/main/medgemma-1.5-4b-it-Q8_0.gguf"
!wget {model_url} -O medgemma-model.gguf
print("\nDownloading Vision Projector (mmproj)...")
vision_url = "https://huggingface.co/unsloth/medgemma-1.5-4b-it-GGUF/resolve/main/mmproj-BF16.gguf"
!wget {vision_url} -O medgemma-vision.gguf
print("\nStep 2 Complete: All model files downloaded.")
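Interrupted downloads are a common Colab failure mode, and a truncated file makes KoboldCPP error out at load time. Per the GGUF specification, valid files start with the 4-byte magic `b"GGUF"`, so a quick header check catches a download that saved an HTML error page instead of the model (`looks_like_gguf` is an illustrative helper, not part of any library):

```python
def looks_like_gguf(path):
    # Per the GGUF spec, files begin with the 4-byte magic b"GGUF".
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# looks_like_gguf("medgemma-model.gguf")   # expect True
# looks_like_gguf("medgemma-vision.gguf")  # mmproj files use the same GGUF container
```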
# Step 3: Launch KoboldCPP with Multimodal Support
# --usecuda: Processing on the T4 GPU
# --mmproj: Enables the vision projector for image analysis
# --remotetunnel: Generates the Cloudflare public URL to access the GUI
!./koboldcpp --model medgemma-model.gguf \
--mmproj medgemma-vision.gguf \
--usecuda \
--visionmaxres 2048 \
--remotetunnel
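Besides the browser GUI, the tunnel URL also exposes KoboldCpp's KoboldAI-compatible HTTP API, which is handy for scripting. A hedged sketch, assuming the common `/api/v1/generate` endpoint and payload fields (check the koboldcpp wiki for the full parameter list; the tunnel URL placeholder is yours to fill in):

```python
import json
import urllib.request

def build_generate_request(base_url, prompt, max_length=256):
    # Assumed payload shape for the KoboldAI-compatible API; see the koboldcpp wiki.
    url = base_url.rstrip("/") + "/api/v1/generate"
    payload = {"prompt": prompt, "max_length": max_length}
    return url, payload

def kobold_generate(base_url, prompt, max_length=256):
    url, payload = build_generate_request(base_url, prompt, max_length)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]

# kobold_generate("https://<your-tunnel>.trycloudflare.com",
#                 "What is a pneumothorax?")
```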
# notes:
# When "Your remote tunnel is ready, please connect to ..." appears, open that URL to use the app.
# If the page never loads, a firewall may be blocking the tunnel.
#
# *** Before chatting, configure these settings:
# Open the KoboldCpp Settings:
# 1) Set the instruct preset to "Gemma 2&3", and
# 2) Settings -> Tokens -> Max Tokens = 4096
# 3) Set the system prompt. I use "You are MedGemma, an expert radiologist. Always analyze images for subtle pathologies.",
## or "You are MedGemma, an expert medical analyst; you succinctly answer medical questions.",
## or a prompt relevant to your questions.
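To send an image for analysis via the API instead of the GUI, KoboldCpp's generate endpoint accepts base64-encoded images alongside the prompt. A sketch assuming an `"images"` list of base64 strings in the payload (field name taken from the KoboldAI-compatible API; verify against the koboldcpp wiki):

```python
import base64

def image_payload(prompt, image_path, max_length=512):
    # "images" as a list of base64 strings is the assumed multimodal field;
    # pair this payload with the /api/v1/generate endpoint.
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {"prompt": prompt, "images": [b64], "max_length": max_length}
```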