Embedded images?

by coder543 - opened Feb 23

Feb 23

Sometimes the markdown output contains things like `![image](image_1.png)`, but I don't see any way to know what image that refers to. Is the model supposed to output coordinates that I can use to crop sections out of the input image? I believe that's how it is supposed to work, but I'm not seeing that with the GGUF version using llama-server.
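To make the problem concrete, here is a minimal sketch of how the unresolved placeholders could at least be detected in the output. It assumes they always use the standard markdown image syntax seen above; `find_image_refs` and `sample` are hypothetical names for illustration and are not part of the model or llama-server. It does not recover coordinates, since the output I'm seeing doesn't include any.

```python
import re

# Matches standard markdown image syntax: ![alt text](target)
IMAGE_REF = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def find_image_refs(markdown: str) -> list[tuple[str, str]]:
    """Return (alt_text, target) pairs for every image reference in the text."""
    return IMAGE_REF.findall(markdown)


# Hypothetical OCR output containing one placeholder (assumed shape).
sample = "Intro text\n\n![image](image_1.png)\n\nMore text"

for alt, target in find_image_refs(sample):
    print(f"unresolved image placeholder: alt={alt!r} target={target!r}")
```

This only flags which regions the model skipped; mapping a placeholder back to a crop of the input page would still need coordinates from somewhere.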

noctrex

Owner Feb 23

I use it mostly for strict OCR tasks, so unfortunately I don't know why it produces this output. Maybe this only works with their own harness and not with llama.

coder543 changed discussion status to closed Feb 23

coder543 changed discussion status to open Feb 23

coder543

Feb 23

To clarify, this is during the OCR task when the model sees something important that it can’t transcribe, like a chart or an image within a document.

If you don’t know how, that’s okay, I just wondered.

(I accidentally hit close… oops. I was intending to leave this open in case someone else knows.)
