backport fix for #2430 remove null character enclosed by XML tag when sanitizing text
mojavelinux committed Jun 23, 2023
1 parent f1a2cda commit f851709
Showing 3 changed files with 9 additions and 4 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.adoc
@@ -9,6 +9,7 @@ For a detailed view of what has changed, refer to the {url-repo}/commits/main[co

Bug Fixes::

+* remove null character enclosed in XML tag when sanitizing text; fixes invisible text in outline when heading contains index term (#2430)
* alias `File.exists?` to `File.exist?` when loading RGhost optimizer to patch incompatibility when using Ruby 3.2

Build / Infrastructure::
4 changes: 2 additions & 2 deletions lib/asciidoctor/pdf/sanitizer.rb
@@ -19,11 +19,11 @@ module Sanitizer
'nbsp' => ' ',
'quot' => '"',
}).default = '?'
-SanitizeXMLRx = /<[^>]+>/
+SanitizeXMLRx = /<[^>]+>\0?/
CharRefRx = /&(?:amp;)?(?:([a-z][a-z]+\d{0,2})|#(?:(\d\d\d{0,4})|x(\h\h\h{0,3})));/
UnescapedAmpersandRx = /&(?!(?:[a-z][a-z]+\d{0,2}|#(?:\d\d\d{0,4}|x\h\h\h{0,3}));)/

-# Strip leading, trailing and repeating whitespace, remove XML tags and
+# Strip leading, trailing and repeating whitespace, remove XML tags along with an enclosed null character, and
# resolve all entities in the specified string.
#
# FIXME: move to a module so we can mix it in elsewhere
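
To illustrate the effect of the pattern change above, here is a minimal sketch in plain Ruby. The two patterns are copied from the diff; the sample string is a hypothetical stand-in for a heading that contains an index term (the exact markup the converter emits is an assumption here), and the variable names are made up for the example.

```ruby
# Patterns copied from the diff; the local variable names are illustrative only.
old_rx = /<[^>]+>/
new_rx = /<[^>]+>\0?/

# Hypothetical input: an anchor tag that encloses a single null character,
# followed by the visible heading text. The real markup may differ.
sample = %(<a id="idx">\0</a>Wetland Birds)

# The old pattern removes the tags but leaves the null character behind.
old_result = sample.gsub old_rx, ''

# The new pattern also consumes a null character that directly follows a tag.
new_result = sample.gsub new_rx, ''

puts old_result.inspect
puts new_result.inspect
```

With the old pattern the null character survives at the start of the string, which is consistent with the invisible outline text described in the changelog entry. With the new pattern only the heading text remains.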
8 changes: 6 additions & 2 deletions spec/outline_spec.rb
@@ -749,14 +749,18 @@
pdf = to_pdf <<~'EOS'
= _Document_ *Title*
:doctype: book
+:sectnums:
== _First_ *Chapter*
+== ((Wetland Birds))
EOS

outline = extract_outline pdf
-(expect outline).to have_size 2
+(expect outline).to have_size 3
(expect outline[0][:title]).to eql 'Document Title'
-(expect outline[1][:title]).to eql 'First Chapter'
+(expect outline[1][:title]).to eql 'Chapter 1. First Chapter'
+(expect outline[2][:title]).to eql 'Chapter 2. Wetland Birds'
end

it 'should decode character references in entries' do