ebook fix: remove stray markup from the text (#172)
Before this commit, chapter 23 of the ebook had some spurious markup
that found its way into the readable text:

    And Harry brought out the original parchment with the hypotheses,
    and began scribbling.

    plus .5minus 1

    Observation:

    Wizardry isn’t as powerful now as it was when Hogwarts was founded.

    plus .5minus 1

You can see the source of this markup in the comment immediately above
the line changed in this commit:

    # \vskip 1\baselineskip plus .5\textheight minus 1\baselineskip

The problem is this regex:

    "\\vskip .*?\\baselineskip"

In the example above, it matches `"\vskip 1\baselineskip"`, stopping at
the first `"\baselineskip"` instead of the one at the end of the line.
This leaves the errant bit of markup in the text:

    plus .5\textheight minus 1\baselineskip

Because this remainder does not start with a backslash, it ends up
inserted into the content.
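
The minimal match can be reproduced in isolation. The sketch below
applies the original pattern to the line quoted in the comment; the
variable names are illustrative and are not taken from `step_3.py`.

```python
import re

# The LaTeX line quoted in the comment above the changed line.
line = r"\vskip 1\baselineskip plus .5\textheight minus 1\baselineskip"

# Original pattern: `.*?` is lazy, so it stops at the FIRST \baselineskip.
lazy = re.compile(r"\\vskip .*?\\baselineskip")

matched = lazy.search(line).group(0)
leftover = lazy.sub("", line)

print(repr(matched))   # the part that gets removed
print(repr(leftover))  # the part that stays behind in the text
```

Here `matched` covers only `\vskip 1\baselineskip`, and `leftover` still
holds the `plus ... minus ...` tail.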

The fix is simple: remove the `'?'`, turning the `".*"` into a greedy
match instead of a minimal match. This matches through the last
`"\baselineskip"` on the line, completely removing this bit of markup
from the text as intended.
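
A minimal sketch of the fixed pattern, run on a made-up three-line
sample rather than the real chapter source. It relies on the fact that
`.` does not match a newline unless `re.DOTALL` is set, so the greedy
match cannot run past the end of the `\vskip` line.

```python
import re

# Hypothetical sample: one \vskip line between two lines of ordinary text.
text = (
    "and began scribbling.\n"
    r"\vskip 1\baselineskip plus .5\textheight minus 1\baselineskip" "\n"
    "Observation:\n"
)

# Fixed pattern: greedy `.*` extends to the LAST \baselineskip on the line.
greedy = re.compile(r"\\vskip .*\\baselineskip")

cleaned = greedy.sub("", text)
print(repr(cleaned))
```

The trade-off is that a greedy match would also swallow any ordinary
text sitting between two `\baselineskip` tokens on a single line. This
sketch assumes no such line occurs in the source.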
jeremyschlatter authored Aug 27, 2024
1 parent 9ecaeba commit 9c0b22c
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion scripts/ebook/step_3.py
@@ -51,7 +51,7 @@
 cont = re.sub(r"\\clearpage(\{\}|)\n?", "", cont)

 # \vskip 1\baselineskip plus .5\textheight minus 1\baselineskip
-cont = re.sub(r"\\vskip .*?\\baselineskip", "", cont)
+cont = re.sub(r"\\vskip .*\\baselineskip", "", cont)

 # remove \settowidth{\versewidth}... \begin{verse}[\versewidth]
 cont = re.sub(
