PdfReader.Open fails with error "Token '60' was not expected #211

Open
mikethea1 opened this issue Nov 19, 2024 · 4 comments
Labels
enhancement (New feature or request), investigation (Under investigation)

Comments

@mikethea1

Reporting an Issue Here

Expected Behavior

The PDF opens without error, or with an exception that indicates that this is something PdfSharp does not support.

The PDF

Actual Behavior

PdfSharp.Pdf.IO.PdfReaderException: Token '60' was not expected.
   at PdfSharp.Internal.ParserDiagnostics.ThrowParserException(String message) in D:\THHO\Repos\PDFsharp\src\foundation\src\PDFsharp\src\PdfSharp\Internal\Diagnostics.cs:line 61
   at PdfSharp.Pdf.IO.Parser.ReadObjectInternal(PdfObject pdfObject, PdfObjectID objectID, Boolean includeReferences, Boolean fromObjectStream, SuppressExceptions suppressObjectOrderExceptions) in D:\THHO\Repos\PDFsharp\src\foundation\src\PDFsharp\src\PdfSharp\Pdf.IO\Parser.cs:line 337
   at PdfSharp.Pdf.IO.Parser.ReadIndirectObject(PdfReference pdfReference, SuppressExceptions suppressObjectOrderExceptions, Boolean withoutDecrypting) in D:\THHO\Repos\PDFsharp\src\foundation\src\PDFsharp\src\PdfSharp\Pdf.IO\Parser.cs:line 932
   at PdfSharp.Pdf.IO.Parser.ReadAllIndirectObjects() in D:\THHO\Repos\PDFsharp\src\foundation\src\PDFsharp\src\PdfSharp\Pdf.IO\Parser.cs:line 1020
   at PdfSharp.Pdf.IO.PdfReader.OpenFromStream(Stream stream, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider passwordProvider, PdfReaderOptions options) in D:\THHO\Repos\PDFsharp\src\foundation\src\PDFsharp\src\PdfSharp\Pdf.IO\PdfReader.cs:line 379
   at PdfSharp.Pdf.IO.PdfReader.OpenFromFile(String path, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider passwordProvider) in D:\THHO\Repos\PDFsharp\src\foundation\src\PDFsharp\src\PdfSharp\Pdf.IO\PdfReader.cs:line 251
   at PdfSharp.Pdf.IO.PdfReader.Open(String path, String password, PdfDocumentOpenMode openMode, PdfPasswordProvider passwordProvider, PdfReaderOptions options) in D:\THHO\Repos\PDFsharp\src\foundation\src\PDFsharp\src\PdfSharp\Pdf.IO\PdfReader.cs:line 189
   at PdfSharp.Pdf.IO.PdfReader.Open(String path, PdfDocumentOpenMode openMode, PdfReaderOptions options) in D:\THHO\Repos\PDFsharp\src\foundation\src\PDFsharp\src\PdfSharp\Pdf.IO\PdfReader.cs:line 166

Steps to Reproduce the Behavior

If there's interest in looking into this I can share the file privately via the mechanism described here: https://github.com/empira/PDFsharp.IssueSubmissionTemplate

Based on #207 it isn't clear to me what is considered a potential bug vs something PdfSharp deliberately doesn't support, so apologies in advance if this behavior is expected. Hopefully this can be a quick close-won't fix in that case and the issue can serve as documentation for others who encounter this.

@TH-Soft

TH-Soft commented Nov 19, 2024

The file is corrupted and this is not a bug in PDFsharp. In issue 207, the file is also corrupted, but PDFsharp has been updated to correct the wrong information in the PDF and read the file nevertheless.

Without the PDF file, we cannot investigate what's going on. Depending on what is wrong with the PDF, there may be a way to modify PDFsharp to read it anyway.

TH-Soft added the Cannot Reproduce (https://xkcd.com/583/) label on Nov 19, 2024
@mikethea1
Author

Thanks @TH-Soft, I hadn't realized that #207 resulted in an update!

I've emailed the PDF behind this issue, along with a second file that triggers another error I encountered, to the email address mentioned on https://github.com/empira/PDFsharp.IssueSubmissionTemplate. Hopefully that helps.

For my understanding, what is the best way to engage on these parsing issues in a helpful way?

I completely understand the perspective that failing to handle a corrupted file doesn't indicate a library bug. But given the absolute lawlessness of PDFs encountered out in the wild, having libraries that are as robust as the PDF viewers customers are used to (e.g. in Chrome) is certainly handy.

I'm happy to send over weird files I encounter to help the library improve, but at the same time I don't want to bother the maintainers with yet more instances of errors you've seen before and perhaps decided explicitly not to accommodate.

@ThomasHoevel
Member

> For my understanding, what is the best way to engage on these parsing issues in a helpful way?

Nothing we can do without the PDF file. Attach it on GitHub if not confidential, mail it if confidential.

I can replicate the "Token '60' was not expected." issue. There is nothing wrong with the "60", but before the line beginning with "60" there should be a line "endobj". So, the file is corrupted.
Maybe we can change PDFsharp to just ignore the missing "endobj" and read the file anyway.
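
As a rough illustration of the missing "endobj" described above, here is a small Python sketch that scans raw PDF bytes and lists objects whose "endobj" does not appear before the next "N G obj" header. It is not PDFsharp code and says nothing about how PDFsharp's parser works internally; the function name `find_unterminated_objects` and the sample bytes are made up for illustration. It is only a heuristic: binary stream data can contain byte sequences that look like object headers, so real files may produce false positives.

```python
import re

# Matches an indirect-object header such as b"60 0 obj" at the start of a line.
# The lookbehind accepts start-of-data, "\n", or "\r" as the preceding position.
OBJ_HEADER = re.compile(rb"(?<![^\r\n])[ \t]*(\d+)[ \t]+(\d+)[ \t]+obj\b")


def find_unterminated_objects(data: bytes) -> list[tuple[int, int]]:
    """Return (object number, generation) for each object that has no
    "endobj" keyword before the next object header (or the end of data)."""
    headers = list(OBJ_HEADER.finditer(data))
    missing = []
    for i, header in enumerate(headers):
        # The object's body runs up to the next header, or to the end of data.
        body_end = headers[i + 1].start() if i + 1 < len(headers) else len(data)
        body = data[header.end():body_end]
        if b"endobj" not in body:
            missing.append((int(header.group(1)), int(header.group(2))))
    return missing


# Hypothetical sample: object 59 is not closed before "60 0 obj" begins.
sample = (
    b"%PDF-1.4\n"
    b"59 0 obj\n<< /Type /Example >>\n"
    b"60 0 obj\n<< /Length 0 >>\nendobj\n"
)
print(find_unterminated_objects(sample))  # [(59, 0)]
```

In the sample, the object reported is the one before "60 0 obj", which matches the description above: the "60" token itself is fine, but it arrives where "endobj" was expected. A check like this could help decide whether a failing file shows this particular kind of corruption before sending it to the maintainers.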

The other file you sent opens with the current internal build of PDFsharp without error messages.

ThomasHoevel added the enhancement (New feature or request) and investigation (Under investigation) labels and removed the Cannot Reproduce (https://xkcd.com/583/) label on Nov 20, 2024
@mikethea1
Author

mikethea1 commented Nov 21, 2024

Thanks for investigating @ThomasHoevel !

> I can replicate the "Token '60' was not expected." issue. There is nothing wrong with the "60", but before the line beginning with "60" there should be a line "endobj". So, the file is corrupted.
> Maybe we can change PDFsharp to just ignore the missing "endobj" and read the file anyway.

This would be nice behavior to have.

> The other file you sent opens with the current internal build of PDFsharp without error messages.

Nice!
