fix: pdf conversion for long lines #17

ublefo · 2024-03-22T16:13:59Z

This fixes PDF processing for submitted code files that contain very long lines, using cut and fold from coreutils to truncate and wrap them. It also introduces https://github.com/ruby/shellwords for proper shell escaping, and pdf-reader for extracting text from PDF files.
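As a rough illustration of why the shell escaping matters, here is a minimal sketch of how such a command line could be assembled. The helper name wrap_command, the exact cut/fold flags, and the output redirection are assumptions for illustration; the actual code in this PR is not shown in this thread.

```ruby
require 'shellwords'

# Hypothetical helper (not from the PR): builds, but does not run, a shell
# pipeline that truncates each line at hard_limit characters with cut and
# then wraps the result at soft_limit columns with fold.
def wrap_command(input_path, output_path, hard_limit: 1000, soft_limit: 160)
  # Escape both paths so spaces, quotes and other shell metacharacters in
  # student-provided file names cannot break or alter the command.
  src = Shellwords.escape(input_path)
  dst = Shellwords.escape(output_path)
  "cut -c 1-#{Integer(hard_limit)} #{src} | fold -w #{Integer(soft_limit)} > #{dst}"
end

puts wrap_command("submissions/my file's code.rb", '/tmp/out.rb')
```

Without the escaping, a file name containing a space or a quote would be split or misparsed by the shell before cut ever sees it.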

Two limits are configured: a hard limit of 1000 characters and a soft limit of 160 characters. Any line over 1000 characters is truncated first, and line breaks are then applied to any line still longer than 160 characters.
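To make the interaction of the two limits concrete, here is a pure-Ruby sketch of the same truncate-then-wrap behaviour. It is not the implementation in this PR (which shells out to cut and fold); the method name limit_lines and the constants are hypothetical, and it counts characters rather than display columns.

```ruby
HARD_LIMIT = 1000 # characters kept per original line; the rest is dropped
SOFT_LIMIT = 160  # maximum width of each rendered line

# Hypothetical helper: returns [processed_text, modified], where modified
# tells the caller whether any line had to be truncated or wrapped.
def limit_lines(text, hard: HARD_LIMIT, soft: SOFT_LIMIT)
  wrapped = text.each_line(chomp: true).flat_map do |line|
    truncated = line[0, hard]                 # hard limit: cut the tail off
    # soft limit: split what remains into chunks of at most `soft` chars,
    # keeping empty lines as they are
    truncated.empty? ? [''] : truncated.scan(/.{1,#{soft}}/)
  end
  processed = wrapped.join("\n")
  processed += "\n" if text.end_with?("\n")   # preserve the trailing newline
  [processed, processed != text]
end

processed, modified = limit_lines("short\n" + ('x' * 2500) + "\n")
puts "lines: #{processed.lines.length}, modified: #{modified}"
```

With the default limits, a 2500-character line is first cut to 1000 characters and then split into six lines of 160 characters plus one of 40, so the worst case per original line is seven rendered lines.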

The student-submitted files are never modified. We write the processed results into temporary files, and the LaTeX template uses those for rendering instead. The temp files are cleaned up after rendering. If a file had to be changed for rendering, a warning is added to the PDF document to indicate that the rendered file differs from the original submission.

A unit test and a test file have been added to test this specific feature.

Supersedes #8

Generated result:

maddernd · 2024-03-28T02:03:53Z

Can we update this PR so that it only changes files related to this fix?

ublefo · 2024-04-21T22:53:28Z

This is ready for review now.

PDF processing will fail with "! Dimension too large." if a line of text is way too long. Implement a simple processing helper method to call fold (from coreutils) to fold long lines for all code files.

We shouldn't modify student submissions. Instead, we use the temp file when rendering them to PDF. Cleanup is performed after rendering is complete.

Run fold on the provided files and compare the output with the original using diff. If a file doesn't contain any lines over the configured threshold, the output will be identical to the original. In this case we replace the temp file with a symlink to the original, so the template can easily identify unmodified files.
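The compare-and-symlink step could look roughly like the sketch below. It uses FileUtils.compare_file as a stand-in for the diff call described above, and the helper name stage_for_render and the :unchanged / :modified return values are hypothetical.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: writes the processed text to temp_path. If the result
# is byte-identical to the original, the temp file is replaced by a symlink
# to the original so later steps can tell "unchanged" from "modified".
def stage_for_render(original_path, processed_text, temp_path)
  File.write(temp_path, processed_text)
  if FileUtils.compare_file(original_path, temp_path)
    File.delete(temp_path)
    File.symlink(File.expand_path(original_path), temp_path)
  end
  File.symlink?(temp_path) ? :unchanged : :modified
end

Dir.mktmpdir do |dir|
  original = File.join(dir, 'main.rb')
  File.write(original, "puts 1\n")
  puts stage_for_render(original, "puts 1\n", File.join(dir, 'main.rb.render'))
end
```

A template helper can then check File.symlink? on the temp path to decide whether the "file was modified" notice needs to be rendered.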

Redirect stdout to /dev/null, since we don't need the diff output.

Set a hard limit of 1000 characters, and truncate everything on the same line after the limit is reached. Otherwise we could get PDF files with hundreds of pages, which would be completely unreadable.

ublefo · 2024-04-27T14:34:30Z

Closed in favor of doubtfire-lms#439

ublefo force-pushed the pdf-long-lines branch 2 times, most recently from 27f120b to c13c1f6 Compare March 27, 2024 17:28

ublefo force-pushed the development branch from 80163e4 to 18136c2 Compare April 15, 2024 09:00

ublefo marked this pull request as draft April 16, 2024 00:48

ublefo force-pushed the pdf-long-lines branch 3 times, most recently from f406f6c to e776598 Compare April 19, 2024 11:38

ublefo marked this pull request as ready for review April 20, 2024 03:03

ublefo requested a review from macite April 21, 2024 22:53

ublefo added 12 commits April 28, 2024 00:30

chore: add shellwords gem for proper shell path escape

5bf5857

fix: process code files to remove long lines

d82c3f8

refactor: write line-wrapped code files into temp files

15edb68

fix: silence diff

a538888

fix: change default column width limit to 160 characters

3f134b5

enhance: add a notice in the pdf template if file is modified

a802053

enhance: truncate super long lines in code files rendered in pdf

3e6bd55

refactor: update temp file names

b12b2f9

quality: add unit test for code submissions with long lines

cf0200f

quality: ensure long line notice is not included when not applicable

5ad8376

fix: update comment to reflect temp filename changes

bb930e4

ublefo force-pushed the pdf-long-lines branch from d6df870 to bb930e4 Compare April 27, 2024 14:32

ublefo closed this Apr 27, 2024
