Questions on OCR Output
Dear Datalab,
First, thank you for the amazing Chandra OCR 2 model; it is incredible. It is wonderful to have an open-source model that punches well above its weight and understands images like no other.
Second, I have a few questions and hope you can point us toward the best way to use your model. We're running several GGUF quantizations of Chandra in a workflow that converts medical lecture slides into faithful Markdown, and when it works it's incredible, especially on dense diagram pages. Our recurring issue is output consistency: useful extraction sometimes lands in `reasoning_content` instead of `content`, and on certain pages we get unstable formatting or runaway responses. We're testing prithivmlmods/chandra-ocr-2 (F32/F16) and mradermacher/chandra-ocr-2 (F16), and would really appreciate guidance on the best inference/template settings for reliable OCR-style output:

- prompt ordering for image + text,
- max-token targets,
- reasoning controls,
- any recommended stop strings or chat-template/Jinja adjustments,

so we can keep Chandra's visual intelligence while making production output deterministic.
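In case it helps to see what we're doing, here is a minimal sketch of the kind of request we build per page. It assumes an OpenAI-compatible endpoint (e.g. llama.cpp's llama-server); the stop strings, prompt wording, and parameter choices are our own guesses, not anything from your documentation, which is exactly why we're asking:

```python
import base64
import json


def build_ocr_request(image_bytes: bytes, model: str = "chandra-ocr-2") -> dict:
    """Build an OpenAI-compatible chat-completions payload for one slide image.

    Assumptions (please correct us): image part comes before the text
    instruction, greedy decoding for determinism, and a hard token cap
    to cut off runaway generations.
    """
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Image first, then the instruction -- the ordering we're unsure about.
                    {
                        "type": "image_url",
                        "image_url": {"url": f"data:image/png;base64,{b64}"},
                    },
                    {
                        "type": "text",
                        "text": "Transcribe this slide into faithful Markdown. "
                                "Output only the Markdown, with no commentary.",
                    },
                ],
            }
        ],
        "temperature": 0.0,   # deterministic decoding
        "top_p": 1.0,
        "seed": 42,
        "max_tokens": 4096,   # cap to stop runaway responses
        # Candidate stop strings -- guesses, since we don't know the template's
        # special tokens for this model.
        "stop": ["<|im_end|>"],
    }


payload = build_ocr_request(b"\x89PNG placeholder bytes")
print(json.dumps(payload)[:80])
```

We then POST this payload as JSON to the server's `/v1/chat/completions` endpoint; if there is a better prompt order, stop-string set, or reasoning toggle for Chandra, we'd love to adopt it.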
Any suggestions on how to steer the model more reliably would be much appreciated.
Thank you for your time,
Bently