How to run MedGemma 1.5 in Google Colab
by FivePoints - opened
If this code stops working, check whether a newer koboldcpp release is available at https://github.com/LostRuins/koboldcpp/releases
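Rather than checking the releases page by hand, you can query GitHub's public REST API for the latest release tag. A minimal sketch (the `tag_name` field is part of GitHub's standard release JSON; the `extract_tag` helper is just for illustration):

```python
import json
import urllib.request

def extract_tag(release_json):
    # "tag_name" is a standard field in GitHub's release JSON.
    return release_json.get("tag_name", "")

def latest_release_tag(owner="LostRuins", repo="koboldcpp"):
    # GitHub's public "latest release" endpoint; no auth needed for light use.
    url = f"https://api.github.com/repos/{owner}/{repo}/releases/latest"
    with urllib.request.urlopen(url) as resp:
        return extract_tag(json.load(resp))

# latest_release_tag()  # e.g. "v1.107.3" or newer
```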
# https://colab.research.google.com/
# Step 0: Set the Google Colab runtime to "T4 GPU"
## Click "Runtime" in the Colab menu,
## click "Change runtime type",
## select "T4 GPU", and click "Save".
# Resources:
## https://research.google/blog/next-generation-medical-image-interpretation-with-medgemma-15-and-medical-speech-to-text-with-medasr/
## https://huggingface.co/unsloth/medgemma-1.5-4b-it-GGUF
## https://github.com/LostRuins/koboldcpp/releases
## https://github.com/LostRuins/koboldcpp/wiki
# Step 1: Check GPU and Download KoboldCPP
print("Checking GPU...")
!nvidia-smi
print("\nDownloading KoboldCPP Linux Binary...")
!wget https://github.com/LostRuins/koboldcpp/releases/download/v1.107.3/koboldcpp-linux-x64 -O koboldcpp
# Make the file executable
!chmod +x koboldcpp
print("\nStep 1 Complete: KoboldCPP engine is ready.")
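If `chmod` silently failed (or you re-run the notebook after a restart), a quick sanity check avoids a confusing "permission denied" later. A small sketch, using only the standard library; `ensure_executable` is a hypothetical helper name:

```python
import os
import stat

def ensure_executable(path):
    """Return True if path exists and is executable; add the user exec bit if missing."""
    if not os.path.exists(path):
        return False
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return os.access(path, os.X_OK)

# ensure_executable("koboldcpp")  # should be True after the chmod above
```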
# Step 2: Download MedGemma 1.5 4B (Q8_0) and Vision Projector
print("Downloading MedGemma 1.5 4B IT (Q8_0)...")
model_url = "https://huggingface.co/unsloth/medgemma-1.5-4b-it-GGUF/resolve/main/medgemma-1.5-4b-it-Q8_0.gguf"
!wget {model_url} -O medgemma-model.gguf
print("\nDownloading Vision Projector (mmproj)...")
vision_url = "https://huggingface.co/unsloth/medgemma-1.5-4b-it-GGUF/resolve/main/mmproj-BF16.gguf"
!wget {vision_url} -O medgemma-vision.gguf
print("\nStep 2 Complete: All model files downloaded.")
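Interrupted downloads are a common Colab failure mode, and a truncated file makes KoboldCPP error out at load time. Per the GGUF specification, valid files start with the 4-byte magic `b"GGUF"`, so a quick header check catches a download that saved an HTML error page instead of the model (`looks_like_gguf` is an illustrative helper, not part of any library):

```python
def looks_like_gguf(path):
    # Per the GGUF spec, files begin with the 4-byte magic b"GGUF".
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# looks_like_gguf("medgemma-model.gguf")   # expect True
# looks_like_gguf("medgemma-vision.gguf")  # mmproj files use the same GGUF container
```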
# Step 3: Launch KoboldCPP with Multimodal Support
# --usecuda: Processing on the T4 GPU
# --mmproj: Enables the vision projector for image analysis
# --remotetunnel: Generates the Cloudflare public URL to access the GUI
!./koboldcpp --model medgemma-model.gguf \
--mmproj medgemma-vision.gguf \
--usecuda \
--visionmaxres 2048 \
--remotetunnel
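Besides the browser GUI, the tunnel URL also exposes KoboldCpp's KoboldAI-compatible HTTP API, which is handy for scripting. A hedged sketch, assuming the common `/api/v1/generate` endpoint and payload fields (check the koboldcpp wiki for the full parameter list; the tunnel URL placeholder is yours to fill in):

```python
import json
import urllib.request

def build_generate_request(base_url, prompt, max_length=256):
    # Assumed payload shape for the KoboldAI-compatible API; see the koboldcpp wiki.
    url = base_url.rstrip("/") + "/api/v1/generate"
    payload = {"prompt": prompt, "max_length": max_length}
    return url, payload

def kobold_generate(base_url, prompt, max_length=256):
    url, payload = build_generate_request(base_url, prompt, max_length)
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]

# kobold_generate("https://<your-tunnel>.trycloudflare.com",
#                 "What is a pneumothorax?")
```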
# notes:
# When "Your remote tunnel is ready, please connect to ..." appears, open that URL to use the app.
# If the page never loads, a firewall may be blocking the tunnel.
#
# *** Before chatting, configure these settings:
# Open the KoboldCpp Settings:
# 1) Set the instruct preset to "Gemma 2&3", and
# 2) Settings -> Tokens -> Max Tokens = 4096
# 3) Set the system prompt. I use "You are MedGemma, an expert radiologist. Always analyze images for subtle pathologies.",
## or "You are MedGemma, an expert medical analyst; you succinctly answer medical questions.",
## or a prompt relevant to your questions.
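To send an image for analysis via the API instead of the GUI, KoboldCpp's generate endpoint accepts base64-encoded images alongside the prompt. A sketch assuming an `"images"` list of base64 strings in the payload (field name taken from the KoboldAI-compatible API; verify against the koboldcpp wiki):

```python
import base64

def image_payload(prompt, image_path, max_length=512):
    # "images" as a list of base64 strings is the assumed multimodal field;
    # pair this payload with the /api/v1/generate endpoint.
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {"prompt": prompt, "images": [b64], "max_length": max_length}
```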