-
Hi @PrasannaVpk From this video, we understand that we can parse the json result file in order to get the predicted words.
However, with a more complicated example (for example, a 2-column PDF document), parsing the json result file is not easy. If we do it as shown in the video, the predicted words do not come out in the right order (words are printed by line and not by column). Clearly, we need a script based on coordinates (xmin, ymin, xmax, ymax from the Could you provide this script? Thanks. Note: I guess this script is already coded in the method
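To make the request concrete, here is a rough sketch of what such a coordinate-based script could look like. It is not DocTR's own method. It assumes each exported page is a dict nested as blocks, then lines, then words, where every word has a `value` and a `geometry` of the form `((xmin, ymin), (xmax, ymax))` in relative coordinates. The helper names `collect_words` and `two_column_order`, the fixed column boundary `split_x` and the row tolerance `line_tol` are all hypothetical, and the heuristic only covers a clean 2-column layout.

```python
from typing import Any, Dict, List, Tuple

# (text, xmin, ymin, xmax, ymax), coordinates assumed relative to the page size
Word = Tuple[str, float, float, float, float]


def collect_words(page: Dict[str, Any]) -> List[Word]:
    """Flatten one exported page into a list of words with their boxes."""
    words: List[Word] = []
    for block in page.get("blocks", []):
        for line in block.get("lines", []):
            for word in line.get("words", []):
                (xmin, ymin), (xmax, ymax) = word["geometry"]
                words.append((word["value"], xmin, ymin, xmax, ymax))
    return words


def two_column_order(
    page: Dict[str, Any], split_x: float = 0.5, line_tol: float = 0.01
) -> List[str]:
    """Order words column by column, then top to bottom, then left to right.

    split_x is an assumed vertical boundary between the two columns;
    line_tol buckets words whose top edges are close into the same row.
    """

    def sort_key(word: Word) -> Tuple[int, int, float]:
        _, xmin, ymin, xmax, _ = word
        column = 0 if (xmin + xmax) / 2 < split_x else 1
        row = round(ymin / line_tol)
        return (column, row, xmin)

    return [word[0] for word in sorted(collect_words(page), key=sort_key)]


# Hand-written page standing in for one entry of the exported "pages" list.
sample_page = {
    "blocks": [{
        "lines": [{
            "words": [
                {"value": "Left1", "geometry": ((0.1, 0.1), (0.2, 0.12))},
                {"value": "Right1", "geometry": ((0.6, 0.1), (0.7, 0.12))},
                {"value": "Left2", "geometry": ((0.1, 0.2), (0.2, 0.22))},
                {"value": "Right2", "geometry": ((0.6, 0.2), (0.7, 0.22))},
            ]
        }]
    }]
}

print(" ".join(two_column_order(sample_page)))
```

A fixed boundary at the middle of the page is the simplest choice; real documents with uneven columns, headers spanning both columns, or more than two columns would need a smarter split.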
-
Hi @piegu 👋 Just to be clear, the video you mentioned was not created by any of the library authors, so we cannot guarantee that the behaviour will not change from what is shown in it 😅 For 2-column pages, it is expected behaviour for now!
-
Hi @fg-mindee Thanks for your answer. I'm just looking for a way to print the text found by DocTR. For example, if I want to print in my notebook the text from an image with Tesseract and OpenCV, I run the following 3 lines:
With DocTR, it looks like the text is in the json result file:
But how can I print the words in my notebook? What do you think?
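One possible way to do this is to walk the exported dictionary and join the word values. The sketch below assumes the export is nested as pages, then blocks, then lines, then words, with each word carrying a `value` key. The helper names `page_lines` and `export_to_text` are hypothetical, and `sample_export` is a hand-written stand-in for the real export.

```python
from typing import Any, Dict, List


def page_lines(page: Dict[str, Any]) -> List[str]:
    """Return one string per detected line of a single exported page."""
    lines: List[str] = []
    for block in page.get("blocks", []):
        for line in block.get("lines", []):
            words = [word["value"] for word in line.get("words", [])]
            if words:
                lines.append(" ".join(words))
    return lines


def export_to_text(export: Dict[str, Any]) -> str:
    """Join all pages into plain text, with a blank line between pages."""
    pages = ["\n".join(page_lines(page)) for page in export.get("pages", [])]
    return "\n\n".join(pages)


# Hand-written sample; only the keys used above are included.
sample_export = {
    "pages": [{
        "blocks": [{
            "lines": [
                {"words": [{"value": "Hello"}, {"value": "world"}]},
                {"words": [{"value": "Second"}, {"value": "line"}]},
            ]
        }]
    }]
}

print(export_to_text(sample_export))
```

The words come out in whatever order the export lists them, so this prints text line by line and does not fix the column ordering on multi-column pages.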
-
Thank you @fg-mindee. I did test the In order to test DocTR and Tesseract, I published a blog post and a notebook:
Question: is it possible with DocTR to use a recognition model for a language other than English (for example, Portuguese)?
-
Thanks @fg-mindee. I will check these topics.
-
Hello everyone! On my side, I am not getting the block or line segmentation, meaning every word ends up inside the same block and the same line. Is it just me, or is it because the feature is not yet implemented? I am wondering what your thinking is on how you will implement this.
-
Hi @piegu 👋 Would you mind marking the relevant message as the answer for this discussion, please? It will help potential future visitors to quickly identify the ins & outs of the topic :)
-
For anyone looking for a solution, as mentioned by @charlesmindee earlier, we integrated line aggregation in #537. This should make its way to a release this week, but for now, you will need to install the developer version to enjoy the benefits on the high-level API. It is enabled by default, so the basic usage snippet will work:

from doctr.io import DocumentFile
from doctr.models import ocr_predictor
model = ocr_predictor(pretrained=True)
doc = DocumentFile.from_pdf("path/to/your.pdf").as_images()
result = model(doc)
json_result = result.export()

Feel free to ask if you have any questions :)