Hey there! I am planning to scan a few of my books, and I found a really good OCR model that I am running locally through LM Studio. While looking for a way to “link” the text that the model outputs in LM Studio to the original scanned page, I came across your software. The integration couldn’t have been easier! After several days of testing AI models and figuring out a good workflow, I am just one step away from making this happen. ^^ Unfortunately, the OCR function simply replaces the original image with plain text instead of creating a searchable PDF that preserves the scan. It would be awesome to have this function. After all, you already have the option to edit searchable PDFs, which means there is a way to access this second layer beneath the scan/image. If I simply missed something, please let me know!
I opened a PDF file that was originally created from a .jpg file, went to Edit PDF, then selected AI, and clicked on OCR.
The model then started running and performed the OCR task.
This results in plain text being output, which replaces the original PDF/scan.
So, I ran a test: I opened the same document after having used another program for the OCR part. This way I could, as expected, mark text in the PDF.
On top of that, after clicking “Edit Text” above the highlighted words in picture 4, the complete OCR text becomes visible and editable.
So my idea was that using the OCR function as described above would preserve the original PDF/scan and add the OCR text as an “underlying layer”, instead of producing plain text that replaces the original PDF/scan.
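To illustrate what I mean by an “underlying layer”: as far as I understand it, searchable PDFs usually keep the scan as an image and place the recognized words at the same positions using PDF text rendering mode 3 (`3 Tr`), which draws text that is neither filled nor stroked, so it stays invisible but selectable. Below is a rough Python sketch of that idea, not a description of how your software works. The `Word` class, the function names, the 300 dpi default, and the font resource name `/F1` are all assumptions of mine. It also assumes the OCR step returns a position for each word, which a model that only outputs plain text would not provide.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical OCR result: one word plus its box in image pixels."""
    text: str
    x: float       # left edge, measured from the left of the image
    y: float       # top edge, measured from the top of the image
    height: float  # height of the word box


def escape_pdf_string(s: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_stream(words, image_height_px: float, dpi: float = 300.0) -> str:
    """Build a PDF content-stream fragment that places each word invisibly.

    Pixels are converted to PDF points (1/72 inch), and the y axis is flipped
    because PDF coordinates start at the bottom-left corner of the page.
    The page must provide a font resource named /F1 (assumed here).
    """
    scale = 72.0 / dpi
    parts = ["BT", "3 Tr"]  # begin text object, text rendering mode 3 = invisible
    for w in words:
        size = w.height * scale
        x = w.x * scale
        y = (image_height_px - (w.y + w.height)) * scale  # bottom of the word box
        parts.append(f"/F1 {size:.2f} Tf")                # font and size
        parts.append(f"1 0 0 1 {x:.2f} {y:.2f} Tm")       # absolute text position
        parts.append(f"({escape_pdf_string(w.text)}) Tj") # show the word
    parts.append("ET")
    return "\n".join(parts)


if __name__ == "__main__":
    demo = [Word("Hello", 300, 300, 60), Word("world", 520, 300, 60)]
    print(invisible_text_stream(demo, image_height_px=3300))
```

The resulting fragment would be appended to the page content after the operator that draws the scanned image, so the scan stays visible and the text stays selectable. The sketch only handles plain ASCII words; other characters would need a suitable font and encoding.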
I hope the imgur link works. Please feel free to ask further questions, if needed.
Thanks for the screenshots — totally clear now!
Love the idea of keeping the original scan and just adding the OCR text as an invisible searchable layer.
Super useful! We’re discussing it right now and I’ll get back to you soon with an update.
Awesome suggestion, thank you!