# Medical Document Understanding (GGUF)

GGUF-quantised versions of `Sebukpor/medical-document-understanding-v2` for CPU inference via llama.cpp. Optimised for Hugging Face free-tier CPU Spaces (2 vCPU, 16 GB RAM).
## Available Quants

| File | Size | Use case |
|---|---|---|
| `model-Q4_K_M.gguf` | ~2.5 GB | **Recommended**: best quality/size for deployment |
| `model-Q8_0.gguf` | ~4.5 GB | Near-lossless, slower |
| `model-F16.gguf` | ~8.5 GB | Reference; too large for free tier |
## Quick Start (llama-cpp-python)

```python
import base64

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Qwen2VLChatHandler

# Vision encoder (see below)
chat_handler = Qwen2VLChatHandler(
    clip_model_path="mmproj-model-f16.gguf"
)

llm = Llama(
    model_path="model-Q4_K_M.gguf",
    chat_handler=chat_handler,
    n_ctx=2048,
    n_threads=2,  # HF free tier has 2 vCPU
    verbose=False,
)

# Encode the input image as a base64 data URL
with open("opd_form.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are an expert Medical Transcription AI. Extract all information into structured JSON.",
        },
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{img_b64}"}},
                {"type": "text", "text": "Extract all information from this medical OPD form into structured JSON."},
            ],
        },
    ],
    max_tokens=1024,
    temperature=0.0,
)

print(response["choices"][0]["message"]["content"])
```
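Even at `temperature=0.0`, models prompted for structured JSON sometimes wrap the output in Markdown code fences or add surrounding prose. A small, hypothetical helper like the one below (not part of this repo) can make downstream parsing more robust:

```python
import json
import re

def extract_json(text: str):
    """Pull the first JSON object out of a model reply.

    Strips Markdown code fences if present, then falls back to locating
    the outermost braces before parsing.
    """
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            return json.loads(text[start:end + 1])
        raise
```

For example, `extract_json(response["choices"][0]["message"]["content"])` would return a Python dict whether or not the model fenced its answer.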
## HF Space Deployment

See the `app.py` included in this repo for a ready-to-deploy Gradio app.
## Base Model

Fine-tuned from `Qwen/Qwen3.5-4B` on handwritten Indian medical OPD forms.