Cannot read property 'compressed' of undefined #265

Open
mojoaxel opened this issue Aug 1, 2021 · 5 comments
Comments

@mojoaxel

mojoaxel commented Aug 1, 2021

We ran into a pdfjs (v2.4.5) problem: nbesli/pdf-merger-js#42

The following code snippet...

const doc = new pdf.Document()
const src = await fs.readFile(path.join(FIXTURES_DIR, 'issue-42.pdf'))
const ext = new pdf.ExternalDocument(src)
doc.addPagesOf(ext)
const fileBuffer = await doc.asBuffer()
await fs.writeFile(path.join(TMP_DIR, 'Testfile_issue-42.pdf'), fileBuffer)

...results in this error:

TypeError: Cannot read property 'compressed' of undefined

      at parseObject (node_modules/pdfjs/lib/object/reference.js:81:15)
      at PDFReference.get [as object] (node_modules/pdfjs/lib/object/reference.js:15:17)
      at Function.addObjectsRecursive (node_modules/pdfjs/lib/parser/parser.js:68:35)
      at Function.addObjectsRecursive (node_modules/pdfjs/lib/parser/parser.js:84:18)
      at Function.addObjectsRecursive (node_modules/pdfjs/lib/parser/parser.js:75:16)
      at ExternalDocument.write (node_modules/pdfjs/lib/external.js:62:14)

Please find the problematic PDF file attached:
issue-42.pdf

@rkusa
Owner

rkusa commented Aug 18, 2021

Thanks for the report! I looked into it, and the cause seems to be that pdfjs does not support hybrid-reference files. More specifically, support for the XRefStm property of the trailer is not yet implemented. While pdfjs successfully falls back to the normal xref table (instead of the xref stream), that table is missing the object with ID 46, which is therefore unknown and causes the error you've posted.

Possible solutions:

  • Implement support for XRefStm
  • Silently ignore missing objects (I am not sure if I'd like this solution though)

I don't have the time right now to implement it, but I'll keep it in the back of my mind.

@hobgoblina

Any suggested temporary fixes that we might be able to use to circumvent this error while the issue is waiting to be resolved?

@cah-andy-kim

Is there a way to check whether a PDF is a hybrid-reference file? I'm currently having this problem, and I'd like to skip the PDF merge for such files if they can be detected up front.
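A minimal sketch of such a check, assuming (as described above) that a hybrid-reference file is one whose classic trailer dictionary carries an XRefStm entry. The function name `isHybridReferencePdf` is hypothetical and not part of pdfjs. It is a plain byte-level heuristic, not a real PDF parser, so it may misjudge unusual files:

```javascript
// Heuristic, hypothetical helper (NOT a pdfjs API): reports whether any classic
// trailer dictionary in the file contains an /XRefStm entry.
function isHybridReferencePdf(buffer) {
  // latin1 maps every byte to exactly one character, so binary stream data
  // inside the PDF cannot break the decoding.
  const text = buffer.toString('latin1');

  // Capture the body of each "trailer << ... >> startxref" section.
  const trailerRe = /trailer\s*<<([\s\S]*?)>>\s*startxref/g;

  let match;
  while ((match = trailerRe.exec(text)) !== null) {
    // /XRefStm is followed by the byte offset of the cross-reference stream.
    if (/\/XRefStm\s+\d+/.test(match[1])) {
      return true;
    }
  }
  return false;
}
```

It could be called on the buffer read from disk before constructing `pdf.ExternalDocument`, so that affected files are skipped or handled separately instead of failing during the merge.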

@shu512

shu512 commented Nov 23, 2021

Hi everyone!
I have a small workaround, but it will not suit everyone, because it requires node-pdftk:

import pdf from 'pdfjs';
import fs from 'fs';
import pdftk from 'node-pdftk';

const src = await pdftk.input('issue-42.pdf').output(); // re-save the file with pdftk first
const doc = new pdf.Document();
const ext = new pdf.ExternalDocument(src);
doc.addPagesOf(ext);
const fileBuffer = await doc.asBuffer();
fs.writeFileSync('Testfile_issue-42.pdf', fileBuffer);

It looks like node-pdftk extracts a plain xref table from the xref stream (which means the file gets larger), so pdfjs can work with it.
Testfile_issue-42.pdf looks the same after running the code above, except that the links are now just plain text.

@sjd2021

sjd2021 commented Jul 19, 2022

Running into this now.. Any actual fixes?
