
fix: pdf conversion for long lines #17

Closed
wants to merge 12 commits into from

Conversation

ublefo
Collaborator

@ublefo ublefo commented Mar 22, 2024

This fixes the PDF processing failure that occurs when submitted code files contain very long lines, using cut and fold from coreutils to truncate and wrap them. It also introduces https://github.com/ruby/shellwords for proper shell escaping, and pdf-reader for extracting text from PDF files.
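
For illustration, here is a minimal sketch of how such a pipeline could be assembled with escaped paths. The helper name `wrap_command` and its parameters are hypothetical and not taken from this PR. Only `Shellwords.join` and `Shellwords.escape` are real library calls.

```ruby
require 'shellwords'

# Hypothetical helper (not from this PR): builds a shell pipeline that
# truncates each line with cut and then wraps it with fold. Every
# user-controlled path goes through Shellwords so that spaces and shell
# metacharacters in file names cannot break or inject into the command.
def wrap_command(input_path, output_path, hard_limit: 1000, soft_limit: 160)
  cut  = Shellwords.join(['cut', '-c', "1-#{hard_limit}", input_path])
  fold = Shellwords.join(['fold', '-w', soft_limit.to_s])
  "#{cut} | #{fold} > #{Shellwords.escape(output_path)}"
end
```

A file name such as `my file; rm -rf.c` then reaches the shell as a single escaped word rather than as a second command.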

Two limits are configured: a hard limit of 1000 characters and a soft limit of 160 characters. Lines longer than 1000 characters are truncated first, and any line still longer than 160 characters is then wrapped.
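
To make the interaction of the two limits concrete, here is a pure-Ruby model of the arithmetic. It is only a sketch: the PR delegates this work to cut and fold, and the method name `wrap_line` is an assumption made for illustration.

```ruby
# Hypothetical pure-Ruby model of the two limits (the PR itself uses cut
# and fold). Takes one line without its trailing newline and returns the
# pieces it would be rendered as.
def wrap_line(line, hard_limit: 1000, soft_limit: 160)
  truncated = line[0, hard_limit]                   # hard limit: drop the rest
  pieces = truncated.scan(/.{1,#{soft_limit}}/)     # soft limit: wrap
  pieces.empty? ? [''] : pieces                     # keep empty lines as-is
end
```

Under these limits a single line can never produce more than 7 rendered lines, since 1000 = 6 × 160 + 40.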

The student-submitted files are never modified. We write the processed results into temporary files, and the LaTeX template uses those for rendering instead. The temp files are cleaned up after rendering. If the processed file differs from the original, a warning is added to the PDF document to indicate that the rendered file differs from the original submission.
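
The temp-file handling described above could look roughly like the following. This is a sketch under assumptions, not the PR's implementation: `with_rendering_copy` is a hypothetical name, and the processed text is passed in by the caller.

```ruby
require 'tempfile'

# Hypothetical helper (not from this PR): writes the processed text to a
# temporary file, yields its path together with a flag saying whether it
# differs from the original submission, and removes the temporary file
# when the block returns. The original file is only ever read.
def with_rendering_copy(original_path, processed_text)
  modified = processed_text != File.read(original_path)
  Tempfile.create(['render', File.extname(original_path)]) do |tmp|
    tmp.write(processed_text)
    tmp.flush
    yield tmp.path, modified
  end
end
```

The `modified` flag is what a template could use to decide whether to print the warning. Because `Tempfile.create` is given a block, cleanup also happens if rendering raises.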

A unit test and a test file have been added to test this specific feature.

Supersedes #8

Generated result: [image]

@ublefo ublefo force-pushed the pdf-long-lines branch 2 times, most recently from 27f120b to c13c1f6 March 27, 2024 17:28
@maddernd

Can we update this PR so it only changes files related to this fix?

@ublefo ublefo marked this pull request as draft April 16, 2024 00:48
@ublefo ublefo force-pushed the pdf-long-lines branch 3 times, most recently from f406f6c to e776598 April 19, 2024 11:38
@ublefo ublefo marked this pull request as ready for review April 20, 2024 03:03
@ublefo ublefo requested a review from macite April 21, 2024 22:53
@ublefo
Collaborator Author

ublefo commented Apr 21, 2024

This is ready for review now

ublefo added 12 commits April 28, 2024 00:30
PDF processing will fail with "! Dimension too large." if a line of text
is way too long. Implement a simple processing helper method to call
fold (from coreutils) to fold long lines for all code files.

We shouldn't modify student submissions. Instead, we use the
temp file when rendering them to PDF. Cleanup will be performed
after rendering is complete.

Run fold on provided files and compare the output with diff. If the file
doesn't contain any lines that are over the configured threshold, it
will be identical to the original. In this case we replace the temp file
with a symlink for easy identification in the template.

Redirect stdout to /dev/null since we don't need the diff output.

Set a hard limit of 1000 characters, and truncate everything in the same
line after the limit is reached. Otherwise we could get PDF files with
hundreds of pages, which are completely unreadable.
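
The fold, diff, and symlink steps from the commit messages above can be sketched as follows. This is an illustration under assumptions rather than the code from these commits: `fold_or_link` is a hypothetical name, and it assumes `fold` and `diff` are available on the PATH. It uses the array form of `system`, which bypasses the shell entirely.

```ruby
# Hypothetical helper (not the PR's code): folds source_path into
# dest_path. If folding changed nothing, dest_path is replaced with a
# symlink to the original so a template can tell the two cases apart.
# Returns true when the folded copy differs from the original.
def fold_or_link(source_path, dest_path, width: 160)
  # Array-form system calls do not go through a shell, so no escaping needed.
  system('fold', '-w', width.to_s, source_path, out: [dest_path, 'w']) or
    raise "fold failed for #{source_path}"

  # diff -q exits 0 only when the files are identical; its output is
  # discarded by redirecting stdout to the null device.
  identical = system('diff', '-q', source_path, dest_path, out: File::NULL)
  return true unless identical

  File.delete(dest_path)
  File.symlink(File.expand_path(source_path), dest_path)
  false
end
```

A caller can then use `File.symlink?(dest_path)` to check whether the rendered copy is the untouched original. Note that this sketch treats a diff error the same as a difference, which errs on the side of showing the warning.
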
@ublefo
Collaborator Author

ublefo commented Apr 27, 2024

Closed in favor of doubtfire-lms#439

@ublefo ublefo closed this Apr 27, 2024