q8 to q4
#1
by adnanPBI - opened
Can you please provide a guide to quantize deepseek-ocr-q8_0.gguf down to deepseek-ocr-q4_0.gguf? I want to run it on my GTX 1050 GPU.
It's better to quantize from the original f16 GGUF rather than the q8_0, since re-quantizing an already-quantized model compounds rounding error:
build/bin/llama-quantize gguf_models/deepseek-ai/deepseek-ocr-f16.gguf gguf_models/deepseek-ai/deepseek-ocr-Q4_K_M.gguf Q4_K_M
build/bin/llama-mtmd-cli \
-m gguf_models/deepseek-ai/deepseek-ocr-Q4_K_M.gguf \
--mmproj gguf_models/deepseek-ai/mmproj-deepseek-ocr-f16.gguf \
--image tmp/mtmd_test_data/Deepseek-OCR-2510.18234v1_page1.png -p "Free OCR." \
--chat-template deepseek-ocr --temp 0 -n 8192
Thanks for your reply.
Can you also provide an IQ2 (importance-matrix) quant of DeepSeek-OCR?
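For reference, IQ2-class quants in llama.cpp are generally produced with an importance matrix (imatrix) computed from a calibration text file. A hedged sketch of the workflow, assuming the f16 GGUF paths used above and a hypothetical calibration file `calibration.txt`:

```shell
# 1. Compute an importance matrix from the f16 model over calibration text
#    (calibration.txt is an assumed placeholder; use any representative corpus)
build/bin/llama-imatrix \
  -m gguf_models/deepseek-ai/deepseek-ocr-f16.gguf \
  -f calibration.txt \
  -o gguf_models/deepseek-ai/deepseek-ocr.imatrix

# 2. Quantize to an IQ2 type using that imatrix
#    (IQ2_XS shown; IQ2_XXS / IQ2_S / IQ2_M are the other IQ2 variants)
build/bin/llama-quantize \
  --imatrix gguf_models/deepseek-ai/deepseek-ocr.imatrix \
  gguf_models/deepseek-ai/deepseek-ocr-f16.gguf \
  gguf_models/deepseek-ai/deepseek-ocr-IQ2_XS.gguf \
  IQ2_XS
```

Note that IQ2 quants trade accuracy for size more aggressively than Q4_K_M, so OCR quality may degrade noticeably; the mmproj file stays at f16 either way.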