Rect value for some PDF form fields doesn't contain floating point numbers but rather strings like #Obj#32_0 #738

Open
prescriptionlifeline opened this issue Oct 2, 2024 · 0 comments

  • PHP Version: 8.1.2-1ubuntu2.18
  • PDFParser Version: 2.11.0

Description:

The attached PDF contains three form fields. For one of them, the Rect key contains four floating point numbers, presumably the coordinates of that form field on the page. For the other two, the Rect key contains strings instead, such as #Obj#32_0 to #Obj#35_0. I don't know what these strings mean, and if there is a way to get coordinates from them, it's unclear to me what that method would be.

PDF input

test.pdf

Expected output

Array
(
    [P] => Array
        (
            [Type] => Page
            [Rotate] => 0
        )

    [T] => incomeName1
    [Rect] => Array
        (
            [0] => 205.35
            [1] => 717.601
            [2] => 540.298
            [3] => 740.74
        )

    [F] => 4
    [Type] => Annot
    [Subtype] => Widget
    [DA] => /Helv 12 Tf 0 g
    [MK] => Array
        (
        )

    [FT] => Tx
)

Actual output

Array
(
    [P] => Array
        (
            [Type] => Page
            [Rotate] => 0
        )

    [T] => PatientsIncomeSS
    [Rect] => Array
        (
            [0] => #Obj#32_0
            [1] => #Obj#33_0
            [2] => #Obj#34_0
            [3] => #Obj#35_0
        )

    [F] => 4
    [Type] => Annot
    [Subtype] => Widget
    [DA] => /Helv 12 Tf 0 g
    [MK] => Array
        (
        )

    [FT] => Tx
)

Code

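<?php
// Load Composer's autoloader so the PDFParser classes are available.
require 'vendor/autoload.php';

$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile('test.pdf');

// Dump the details of every object in the document, including the form field widgets.
$objects = $pdf->getObjects();
foreach ($objects as $obj) {
    print_r($obj->getDetails());
}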
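
For reference, here is a minimal sketch of how the placeholder strings might be handled. It assumes that a value like #Obj#32_0 stands for an indirect PDF object, with 32 as the object number and 0 as the generation number. That reading is a guess based on how PDF indirect references look, not something the PDFParser documentation confirms. The helpers parseObjPlaceholder() and resolveRect() are hypothetical and not part of PDFParser, and the lookup table holds made-up stand-in values rather than anything read from test.pdf.

```php
<?php
// Hypothetical helper, not part of PDFParser: splits a placeholder such as
// "#Obj#32_0" into an object number and a generation number. Treating the two
// numbers this way is an assumption based on PDF indirect references.
function parseObjPlaceholder(string $value): ?array
{
    if (preg_match('/^#Obj#(\d+)_(\d+)$/', $value, $m) !== 1) {
        return null; // not a placeholder, e.g. a plain number like "205.35"
    }
    return ['number' => (int) $m[1], 'generation' => (int) $m[2]];
}

// Hypothetical helper: converts each Rect entry to a float where possible and
// hands placeholders to a caller-supplied lookup keyed by "number_generation".
// Entries that cannot be resolved to a number become null.
function resolveRect(array $rect, callable $lookup): array
{
    $out = [];
    foreach ($rect as $i => $value) {
        $ref = parseObjPlaceholder((string) $value);
        if ($ref === null) {
            $out[$i] = is_numeric($value) ? (float) $value : null;
            continue;
        }
        $resolved = $lookup($ref['number'] . '_' . $ref['generation']);
        $out[$i] = is_numeric($resolved) ? (float) $resolved : null;
    }
    return $out;
}

// Stand-in lookup table; these coordinate values are invented for illustration.
$fake = ['32_0' => '205.35', '33_0' => '690.1', '34_0' => '540.298', '35_0' => '713.2'];
$rect = ['#Obj#32_0', '#Obj#33_0', '#Obj#34_0', '#Obj#35_0'];
print_r(resolveRect($rect, fn (string $id) => $fake[$id] ?? null));
```

With the real library, the lookup could perhaps call $pdf->getObjectById($id) and read the returned object's content, but I have not verified that this yields the number for these objects, so treat that part as an assumption.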
@k00ni k00ni added the bug label Oct 2, 2024