I really enjoyed this write-up. It’s great to see OCR workflows becoming more practical with llama.cpp and GGML.
One thing that still feels missing in the ecosystem is a lightweight, glue-style tool that makes it easy to use these OCR models in a real workflow (batching, preprocessing, piping results, etc.).
I’ve been working on something along those lines:
https://github.com/haschka/ocr_tool
The goal is to keep everything local and simple, while making OCR pipelines easier to integrate into existing setups (especially when combined with llama.cpp-based models).
I'm curious whether others here have been building similar tooling or workflows. I would love to hear how people are stitching these components together in practice.