Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Less panics, add error handling, add tests, re-export lopdf, linting, readme #48

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

joepio
Copy link
Contributor

@joepio joepio commented Feb 11, 2023

Hi there! This PR turned out a little larger than I anticipated, but I think it bings some important improvements:

This PR also makes some other open PRs unnecessary: #25 #45 #44, also has overlap with #49

It still requires some work though. @jrmuizel could you help out to fix the last of the types?

Some open questions / to still be improved upon:

  • There are still quite a bit of panics! This PR doesn't fix them all. (maybe I'll come back later to do that though)
  • from_opt_obj has many panics still, in particular
  • I feel like splitting up lib.rs into a bunch of modules could help make the project a bit more easy to navigate.
  • Some functions are deeply nested. Perhaps some could be split up. I'm not entirely sure about what they do, though, so I didn't really touch that.
  • Many fields from structs are never used. Should these be removed? E.g. Type0Func, Type2Func, CalGray, CalRGB. The latter ones are public - not sure if these are used anywhere outside? Their fields are unused in the repo.

@joepio joepio changed the title Errors over panics Less panics, add error handling, add tests, re-export lopdf, linting Feb 12, 2023
@joepio joepio changed the title Less panics, add error handling, add tests, re-export lopdf, linting Less panics, add error handling, add tests, re-export lopdf, linting, readme Feb 12, 2023
@joepio
Copy link
Contributor Author

joepio commented Feb 19, 2023

@jrmuizel Could you take a look at my PR? Let me know if anything requires changing. Also, if you'd like help with maintenance of this repo, let me know.

@jrmuizel
Copy link
Owner

I'll try to take a look this week

@joepio
Copy link
Contributor Author

joepio commented Feb 27, 2023

@jrmuizel I've made some additional changes this weekend. Let me know if there's anything I can do to help get this merged.

@jrmuizel
Copy link
Owner

jrmuizel commented Mar 4, 2023

Thanks for your contribution. I finally had a chance to look at this and there's a lot here. I don't have a lot of time for this project so it would be easier for me if you broke it up into individual PRs that I can review and merge incrementally instead of one big one.

A couple of things:

I'm not a huge fan of the mass unwrap removal. Without a fix for rust-lang/rust#54144 it's a lot more painful to debug a propagated error instead of a panic. I wouldn't really recommend running this software in production but if you really want to perhaps you can use catch_unwind. If you pdfs that trigger particular panics I'm happy to fix those on a case by case basis.

Regarding testing, I don't really want to bloat the repo with large test PDFs. Perhaps a good option would be to do what pdf.js does and use urls in pdf.link files (https://github.com/mozilla/pdf.js/tree/master/test/pdfs).

I'd also prefer the changes in the clippy commit be broken out into individual fixes.

Sorry if these requests come across as a bit particular. If you feel differently about them or don't want to spend more time on these changes I totally understand you keeping your own fork. In the mean time I'll continue going through your changes trying to pull out the parts that look good to me.

@joepio
Copy link
Contributor Author

joepio commented Mar 5, 2023

@jrmuizel Thanks for the thorough response! And good to see you've already merged the readme change.

Without a fix for rust-lang/rust#54144 it's a lot more painful to debug a propagated error instead of a panic

Not sure if I understand this correctly, but I think there are some other approaches that could give you more context while debugging. For example, the anyhow crate provides stack traces within normal Results, using RUST_BACKTRACE=1 and the backtrace feature. Or you could create a custom error, and set a breakpoint in the code related to instantiating that custom error. I really think keeping the .unwrap is not the right call for a published library.

I'd also prefer the changes in the clippy commit be broken out into individual fixes.

Well, that's the first commit. You could cherry pick that one perhaps?

Regarding testing, I don't really want to bloat the repo with large test PDFs. Perhaps a good option would be to do what pdf.js does and use urls in pdf.link files (https://github.com/mozilla/pdf.js/tree/master/test/pdfs).

If we re-fetch all pdf files every test run, testing becomes thousands of times slower, which harms productivity. However, we could store them in a local directory that is in .gitignore.

If you feel differently about them or don't want to spend more time on these changes I totally understand you keeping your own fork

Well, it all boils down to the error handling. If you really want to keep the panics, I think I'll work on the fork.

@joepio joepio mentioned this pull request Mar 5, 2023
@0xpr03
Copy link

0xpr03 commented Mar 18, 2023

Without a fix for rust-lang/rust#54144

I'd recommend using something like thiserror, which has support for stabilized Backtraces, while allowing for specific error kinds. So you can just include a backtrace for every error. Otherwise you may use something like the SNAFU crate, which allows backtrace under specific flags, so you can restrict them to debug builds.

I wouldn't really recommend running this software in production but if you really want to perhaps you can use catch_unwind

I strongly recommend against using catch_unwind as some kind of try-catch method for general exceptions. Panics are supposed to be unfixable situations, a PDF parser panicking my complete application by reading some sort of weird PDF is definitely an unhappy surprise (and a major DoS attack path). Even for hobby projects I personally avoid any libraries that do this.

You're definitely free to do otherwise, just voicing my 2 cents.

@HarrisDePerceptron
Copy link

this is such a stand thing to do in a library these days. panics really make this unusable.
anyhow + thiserro + ? would be great.

@wdoppenberg
Copy link

Any word on this? I would love to have proper error propagation. Currently having to do catch_unwind and suppressing println statements for Unicode mismatch is not very idiomatic Rust I'd say.

@Christof23
Copy link

@joepio @HarrisDePerceptron @0xpr03 couldn't agree more on the need for stable libraries and on the use of anyhow + thiserror. As others have stated I'm also hesitant to incorporate this lib into production code if it will end up crashing somewhere down the line.

Is there any update @jrmuizel on whether this can be integrated soon?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

remove some println! to be more CLI / TUI friendly Panic at FirstChar
6 participants