Empty italic tag in clues causes rest of clue to be italicized #171

Open
jpd236 opened this issue Nov 7, 2021 · 3 comments
Comments

@jpd236
Contributor

jpd236 commented Nov 7, 2021

The following XML in a JPZ clue:

<span>Part of clue<i/>more of the clue</span>

causes "more of the clue" to be italicized even though it is not enclosed in the italic tag. The same seems to happen with a regular empty tag instead of a self-closing one, and also when the tag contains only whitespace; AFAICT the tag has to contain at least one non-blank character for the parsing to work correctly.

@jpd236
Contributor Author

jpd236 commented Jan 2, 2022

  • I can no longer reproduce with a regular empty tag, only a self-closing tag, at least with ipuz (was testing with jpz before)
  • It still reproduces if I use wxHtmlWinParser in HtmlClueListBox::CacheItem to parse the HTML, which I think is the component responsible for parsing the HTML before rendering it
  • AFAICT, <i/> is invalid HTML in that only certain tags are permitted to be self-closing. It's also not really the kind of thing you'd generally expect to see, though I did observe it once (not sure whether it was in the original source data or if I introduced it when converting from another format to JPZ). But the failure mode here of just ignoring the closing tag doesn't seem great.

Probably not a huge priority in the grand scheme of things, but I guess the next step here would be to try to reproduce this with a smaller sample app and pass the report along to wxWidgets.

EDIT: I originally posted this without escaping the <i/> above, and, funnily enough, the rest of the comment showed up in italics! Maybe this is actually how HTML parsers are supposed to handle this...
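For context on the EDIT above: the HTML Living Standard only treats a fixed set of "void" elements as self-closing, and its parsing rules ignore the trailing slash on any other element, so <i/> is read as an opening <i>. The following is a minimal sketch, not part of XWord, that flags such tags in a clue string. The function name invalid_self_closing_tags and the regex are illustrative assumptions, and a regex is only a rough check, not a real HTML tokenizer.

```python
import re

# Void elements per the HTML Living Standard; only these have no end tag.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
})

# Matches tags written with a trailing slash, e.g. <i/> or <br />.
SELF_CLOSING = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^<>]*/>")


def invalid_self_closing_tags(markup: str) -> list[str]:
    """Return the names of self-closing tags that HTML does not treat as void.

    Hypothetical helper for illustration only.
    """
    return [
        match.group(1).lower()
        for match in SELF_CLOSING.finditer(markup)
        if match.group(1).lower() not in VOID_ELEMENTS
    ]


if __name__ == "__main__":
    print(invalid_self_closing_tags("<span>Part of clue<i/>more of the clue</span>"))
```

Any tag this reports would be read by a conforming HTML parser as an unclosed opening tag, which would explain the rest of the text turning italic.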

@mrichards42
Owner

Hmm . . . looking through what I think is the jpz schema, it seems like clue text is actually XML, not a string of html? In which case <i/> would in fact be a self-closing tag :) It looks like the spec allows <i> <b> <span> <sub> and <sup> children in clue text.

So . . . maybe this needs to be handled in the jpz parser? We could convert self-closing tags to the equivalent empty tag, or perhaps just remove them entirely since that should render the same way.
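The two options mentioned above can be sketched as follows. This is a rough illustration in Python using the standard library's xml.etree.ElementTree, not XWord's actual jpz parser (which is C++); the function names expand_self_closing and drop_empty_elements are hypothetical. It assumes the clue is a single well-formed XML element.

```python
import xml.etree.ElementTree as ET


def expand_self_closing(clue_xml: str) -> str:
    """Option 1: rewrite <i/> as the equivalent explicit pair <i></i>."""
    root = ET.fromstring(clue_xml)
    # short_empty_elements=False makes the serializer emit <tag></tag>.
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


def drop_empty_elements(clue_xml: str) -> str:
    """Option 2: remove elements that have no text and no children."""
    root = ET.fromstring(clue_xml)
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if len(child) == 0 and child.text is None:
                # Keep the text that followed the removed tag by attaching it
                # to the previous sibling, or to the parent if there is none.
                tail = child.tail or ""
                if previous is None:
                    parent.text = (parent.text or "") + tail
                else:
                    previous.tail = (previous.tail or "") + tail
                parent.remove(child)
            else:
                previous = child
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


if __name__ == "__main__":
    clue = "<span>Part of clue<i/>more of the clue</span>"
    print(expand_self_closing(clue))
    print(drop_empty_elements(clue))
```

Only truly empty elements are dropped here; a tag containing whitespace is left alone, since whitespace inside a tag can still be meaningful to the rendered clue.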

@jpd236
Contributor Author

jpd236 commented Mar 6, 2022

I realized that I filed this as part of investigating the clue mentioned in jpd236/kotwords#24, and indeed that specific clue is still a working repro case where the italic tag is non-empty (and thus not self-closing). So it does seem like there's more to this.

Attached a sample JPZ where the clue for 1-Across is:

<span>First across clue with</span><i> </i><span>italicized space</span>

This renders correctly in Crossword Solver, but in XWord, the space is omitted, and "italicized space" is in italics.

test.zip
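To make the expected rendering concrete, here is a small sketch that flattens the clue markup into (text, is_italic) runs when it is read as XML. It is an illustration in Python, not XWord code; the styled_runs name and the wrapping <clue> root element are assumptions made so that the sibling elements parse as one document.

```python
import xml.etree.ElementTree as ET


def styled_runs(fragment: str) -> list[tuple[str, bool]]:
    """Flatten clue markup into (text, is_italic) runs in document order."""
    # Hypothetical <clue> wrapper so multiple top-level elements parse together.
    root = ET.fromstring(f"<clue>{fragment}</clue>")
    runs: list[tuple[str, bool]] = []

    def walk(elem: ET.Element, italic: bool) -> None:
        inside = italic or elem.tag == "i"
        if elem.text:
            runs.append((elem.text, inside))
        for child in elem:
            walk(child, inside)
            # Tail text comes after the child's end tag, so it takes the
            # style of the enclosing element, not the child's.
            if child.tail:
                runs.append((child.tail, inside))

    walk(root, False)
    return runs


if __name__ == "__main__":
    clue = "<span>First across clue with</span><i> </i><span>italicized space</span>"
    for text, italic in styled_runs(clue):
        print(repr(text), "italic" if italic else "plain")
```

Under this reading, the space is the only italic run and "italicized space" is plain, which matches the Crossword Solver rendering described above.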
