Feature: Add option to display the number of the annotation #62

Open
mpkopec opened this issue Oct 28, 2022 · 3 comments

Comments

mpkopec commented Oct 28, 2022

Hello,
First off, I love the script! It produces beautiful output with context, etc. Great job!

For scientific reviews, you need to provide a rebuttal to each of the reviewers' notes, so it would be convenient to be able to respond to, e.g., 'page 3, note 12' or 'note 32'. I would suggest labelling each annotation not only with the page number, but also with the number of the note on that page and/or a global number (counted from the beginning of the document).
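To illustrate the labelling I have in mind, here is a minimal sketch. The `Note` class and `number_notes` function are hypothetical and not part of pdfannots; the sketch also assumes the notes are already in the desired output order.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Note:
    """Hypothetical stand-in for an extracted annotation."""
    page: int   # 1-based page number
    text: str   # annotation contents


def number_notes(notes):
    """Return (page, note_on_page, global_number, text) tuples.

    Assumes `notes` is already sorted in the desired output order,
    so all notes of one page are adjacent.
    """
    labelled = []
    global_no = 0
    for page, group in groupby(notes, key=lambda n: n.page):
        # Per-page counter restarts on every page; global counter does not.
        for on_page, note in enumerate(group, start=1):
            global_no += 1
            labelled.append((page, on_page, global_no, note.text))
    return labelled


notes = [Note(3, "typo"), Note(3, "unclear"), Note(4, "cite source")]
for page, on_page, global_no, text in number_notes(notes):
    print(f"page {page}, note {on_page} (#{global_no}): {text}")
```

A rebuttal could then refer to either the per-page label or the global number.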

What do you think?

Owner

0xabu commented Oct 28, 2022

Thanks for the feedback! It sounds like a reasonable idea. My main concern is that the "order" of notes on a page is unspecified in the PDF file. The order in which pdfannots outputs annotations is just based on an ugly pile of heuristics in pdfminer that try to infer the page reading order, and another pile of heuristics in pdfannots that try to determine the text "nearest" an annotation in order to sort them. So, even though "page 4, note 3" may seem logical to you, there are plenty of instances where it won't match what a human would consider the third note on the page (this happens most often on multi-column documents). Until now, these discrepancies didn't matter too much -- sometimes notes show up out of logical order, which is a bit annoying, but not the end of the world. With this feature, the bug would be much more glaring.
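As a toy illustration of the multi-column problem (this is not how pdfminer or pdfannots actually order things; the coordinates, the `column_split` value, and both functions are made up for the example), two plausible sort keys already disagree about which note is "note 2":

```python
# Hypothetical (x, y) positions of four notes on a two-column page,
# with y measured downward from the top of the page.
notes = {"A": (50, 100), "B": (50, 400), "C": (350, 150), "D": (350, 450)}


def top_to_bottom(notes):
    """Naive order: by vertical position only, ignoring columns."""
    return sorted(notes, key=lambda k: (notes[k][1], notes[k][0]))


def column_then_top(notes, column_split=300):
    """Column-aware order: left column first, then top to bottom.

    `column_split` is an assumed x coordinate separating the columns.
    """
    return sorted(notes, key=lambda k: (notes[k][0] >= column_split, notes[k][1]))


print(top_to_bottom(notes))    # ['A', 'C', 'B', 'D']
print(column_then_top(notes))  # ['A', 'B', 'C', 'D']
```

With numbered labels, the first ordering would call C "note 2" while a human reading column by column would expect B.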

Author

mpkopec commented Oct 30, 2022

Indeed, the ordering problem seems hard to solve. I recently ran into it while using Acrobat Reader's "Comment summary" feature, which labels the annotations on each page with numbers and then inserts, right after that page, a page summarizing each annotation. An example is in the attachments. The example is very simple and does not show the problem you mentioned, but in a more densely annotated PDF I recently worked with, it really showed: the order of the comments did not exactly match the reading order. Acrobat resolves this by labelling each annotation.

The question is: do you think it would be feasible to work on a feature that produces two output files, a PDF with the annotations labelled and the markdown comment summary?

TeXSample_annotated.pdf
TeXSample_annotated_summary.pdf

Author

mpkopec commented Oct 30, 2022

Additionally, I can code some Python, so if you could roughly point me to the part of the code that needs enhancing, I could probably help and treat this feature as a kind of weekend project. :)
