Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support the <slot> tag #848

Merged
merged 3 commits into from
Nov 21, 2020
Merged

Support the <slot> tag #848

merged 3 commits into from
Nov 21, 2020

Conversation

lhchavez
Copy link
Contributor

was added to the WHATWG HTML5 Living Standard on April 20, 2016.

<slot> was added to the WHATWG HTML5 Living Standard on April 20, 2016.
@geoffmcl
Copy link
Contributor

@lhchavez thanks for the Feature Request...

I pulled your fork, built a 5.7.28.I848 version... created some sample html... it seems great on some things... but puzzled by a few things...

I had added <style> element, in the template, and found that I needed --fix-style-tags no to the config...

Then still got a warning - moved <style> tag to <head>! fix-style-tags: no to avoid. - hmmm, I said I understood.. why a continued nag...

Then from the code -

  <div>
    <span slot='title-1'>Title A</span> 
    <span slot='title-2'>Title B...
   ...

got a warning - <span> proprietary attribute "slot" - but maybe this requires an addition to the span tag....

On another sample -

<p><slot name="my-text">My default text</slot></p>

Tidy outputs -

<p></p>
<slot name="my-text">My default text</slot>
<p></p>

Just thinking aloud, should slot have an inline bit, too,... but maybe the code is wrong in other ways... the nu html checker reports - Error: Element slot not allowed as child of element p in this context. - but maybe this is a different issue... still deciding...

With those niggles discussed, solved, added... looks like a good PR... thanks...

To continue testing, could you supply some, hopefully small sample html to test, explore, etc... thanks

lhchavez added a commit to lhchavez/tidy-html5-tests that referenced this pull request Nov 29, 2019
This change contains the testcase needed for
htacg/tidy-html5#848
Given that the <slot> tag by itself is not too useful, this commit also
introduces support for the global slot attribute.
@lhchavez
Copy link
Contributor Author

Done, added one testcase that has some snippets pulled from https://developer.mozilla.org/en-US/docs/Web/HTML/Element/slot and should now pass cleanly.

@lhchavez
Copy link
Contributor Author

the one thing i could not find an easy way to do is to allow both <template> and <slot> to include any arbitrary HTML content (since they both have a transparent content model). that's what causing the move of the <style> to <head>.

I'm using ParseInline for <slot> in this case, but that's not the correct choice, since the other example in MDN (<slot name="attributes"><p>None</p></slot>) does not work. ParseBlock does some other weird things since the code does not expect it to be called from within an inline context.

@geoffmcl
Copy link
Contributor

@lhchavez thank you for the additional pushes to this PR, but not sure I agree with all, yet...

But pulled and built tidy-rc848-1 for continued testing...

And thanks for the case-848.html test sample, from MDN docs...

There, in PR-32, suggest a small enhancement, to remove some nu nags -

diff --git a/cases/testbase/case-848.html b/cases/testbase/case-848.html
index d74c3ca..2ff2bf9 100644
--- a/cases/testbase/case-848.html
+++ b/cases/testbase/case-848.html
@@ -1,7 +1,8 @@
 <!DOCTYPE html>
-<html>
+<html lang="en">
 <head>
-  <title>test</title>
+<meta charset="utf-8">
+<title>add 'slot' tag</title>
 </head>
 <body>
   <p><slot name="my-text">My default text</slot></p>

But these are not particularly important...

Question on <p><slot name="my-text">My default text</slot></p>?

I do want to understand why nu issues the ERROR: Element slot not allowed as child of element p in this context. ... for this case-848.html source... while libtidy thinks it is ok... who is right, wrong, maybe,.. here...

I have a test file in_848.html, which satisfies nu, and tidy-rc848-1, which contains code like <th><slot name='title-1'></slot></th>... just trying to understand...

Yes, strongly note you have switched <slot> from CM_BLOCK/ParseBlock to CM_INLINE/ParseInline... but still have some doubts about that... others have both bits... and use ParseBlock... and other combinations... what is most correct for <slot>... do not know yet...

And that raises the question about <slot name="my-text"><p>My default text</p></slot>, which I think would be valid in a template... need to try more examples/samples... help on those appreciated...

Using definitions by MDN, or even by W3C, for that matter, does little to understand the problem in the libtidy context... which started before such definitions came into style... but maybe they help describe the tag/attr... and thus help where to look...

I started to look back in the issues related to template, and found some - #199, #282, #310, ... and I am sure there are others... presently trying to wade through those...

Feel the PR has advanced, but still some definition questions, trial, tests, ...

Look forward to further feedback, comment, patches, PR, etc on this... thanks...

@lhchavez
Copy link
Contributor Author

lhchavez commented Dec 1, 2019

I do want to understand why nu issues the ERROR: Element slot not allowed as child of element p in this context.

This might be a bug in nu, since it says that

Contexts in which element slot may be used:

Where phrasing content is expected.

Content model for element p:

Phrasing content.

Which sounds like it's contradicting itself.

And that raises the question about <slot name="my-text"><p>My default text</p></slot>, which I think would be valid in a template...

yes, that should be valid in a template, but there is no ready-to-use solution to parse arbitrary HTML that should be allowed in <template>. Both ParseBlock and ParseInline are wrong in this case, but ParseBlock triggers fewer edge cases after manual inspection. I can definitely create a ParseTransparent that just calls ParseBlock and has a TODO comment explaining that it's a subpar solution that needs more work, linking to an issue about it.

Re: CM_INLINE vs CM_BLOCK/CM_INLINE|CM_BLOCK|CM_MIXED, I think that for <slot>, CM_INLINE is the better choice, since <slot> is only allowed in places where phrasing content is expected (per the error from nu above and MDN). Tidy's model does not match exactly, but the the vast majority of the tags in the Phrasing content category are marked as only CM_INLINE in Tidy.

@geoffmcl
Copy link
Contributor

geoffmcl commented Dec 2, 2019

@lhchavez thank you for the feedback...

It seems we are 90% there, only differing on the slot def_tags definition...

You suggest CM_INLINE in 'libtidycan be **equated** toPhrasing content` in MDN...no, no, no...

I can not find slot in their list... lost... but that is not the point... that is MDN...

Presently libtidy has NO concept of Phrasing content... maybe it could be added... somehow... if it can be defined...

Both ParseBlock and ParseInline are wrong in this case

You further suggest a ParseTransparent, based on ParseBlock, with some TODO notes, but at the moment the PR uses ParseInline... an important choice...

Is what you suggest based on performance on edge cases... that smacks of trial and error... all for that... yeah ;=))

What happens

What happens if the flag is say, CM_INLINE|CM_BLOCK[|CM_MIXED], and the parser ParseBlock is used...

Or some other combinations...

How do they perform in the main, AND edge cases... not sure what that is...

Started an analysis of the current tag_defs table... to see if that would yield some categorization of the entries... especially. inline&|block with parser... but that is SLOW going...

RE: template... no ready-to-use solution to parse arbitrary HTML ...

Here, I tried to point to some earlier - turn off tidy - discussion... in general, it seems the consensis is - This HTML Tidy - what do you want it to do?

There are many other text manipulation tools, to avoid libTidy seeing this funny html code, or mess... to tidy...

Parsing by libTidy is OFF, well, mostly OFF, when, parsing tags like <script>...

Is it suggested that the <template> tag be treated as such... refs for that...

But that seems a different, and NEW issue... must refer to the previous discussions... thanks...

This might be a bug in nu, since it says that

Maybe, but maybe we do not understand, fully, those cute definitions of Phrasing content, et al... this time in nu... in a libTidy context...

More importantly, how that can that be simulated in libTidy... if it is required...

But back to this great PR -

I would like to see some more testing of other table options... before deciding which is the best... will continue that... options/example/results testing appreciated...

Look forward to further feedback, comment, patches, PR, etc on this... thanks...

@geoffmcl
Copy link
Contributor

@lhchavez I have now conducted lots of further test, and concluded slot should use ParseBlock, in libTidy terms, with the bit flags of inline|block|mixed, and thus suggest the following additional patch to this PR -

diff --git a/src/tags.c b/src/tags.c
index 9f72a70..760c37b 100644
--- a/src/tags.c
+++ b/src/tags.c
@@ -323,7 +323,7 @@ static Dict tag_defs[] =
   { TidyTag_OUTPUT,      "output",       VERS_ELEM_OUTPUT,      &TY_(W3CAttrsFor_OUTPUT)[0],      (CM_INLINE),                   TY_(ParseInline),    NULL           },
   { TidyTag_PROGRESS,    "progress",     VERS_ELEM_PROGRESS,    &TY_(W3CAttrsFor_PROGRESS)[0],    (CM_INLINE),                   TY_(ParseInline),    NULL           },
   { TidyTag_SECTION,     "section",      VERS_ELEM_SECTION,     &TY_(W3CAttrsFor_SECTION)[0],     (CM_BLOCK),                    TY_(ParseBlock),     NULL           },
-  { TidyTag_SLOT,        "slot",         VERS_ELEM_SLOT,        &TY_(W3CAttrsFor_SLOT)[0],        (CM_INLINE),                   TY_(ParseInline),    NULL           },
+  { TidyTag_SLOT,        "slot",         VERS_ELEM_SLOT,        &TY_(W3CAttrsFor_SLOT)[0],        (CM_INLINE|CM_BLOCK|CM_MIXED), TY_(ParseBlock),     NULL           },
   { TidyTag_SOURCE,      "source",       VERS_ELEM_SOURCE,      &TY_(W3CAttrsFor_SOURCE)[0],      (CM_BLOCK|CM_INLINE|CM_EMPTY), TY_(ParseBlock),     NULL           },
   { TidyTag_SUMMARY,     "summary",      VERS_ELEM_SUMMARY,     &TY_(W3CAttrsFor_SUMMARY)[0],     (CM_BLOCK),                    TY_(ParseInline),    NULL           },
   { TidyTag_TEMPLATE,    "template",     VERS_ELEM_TEMPLATE,    &TY_(W3CAttrsFor_TEMPLATE)[0],    (CM_BLOCK),                    TY_(ParseBlock),     NULL           },

At first I tried to reason it out, analyzing the current def_tags definition table, using a tidy-tags.pl script, which outputs a html table version...

See the temptable.htm using your slot branch, and temptable2.htm using my slot2 branch... WARNING These are only in my web tmp folder, and can be deleted/overwritten at any time... sorry...

But they do not seem to answer the question...

I then constructed more samples for testing... in addition to in_848.html, already published to my repo, added 848-1, 848-11, 848-12, 848-2, 848-3, and eventually 848-4... will try to get around to publishing these... but...

I built more versions of tidy... if we consider the table is made using 3 bit flags, inline (I), block (B), mixed (M), and 2 parsers, inline (I), and block, (B)... so built versions -

  • tidy-I-I.exe - your slot branch version - 1 inline flag, inline parser
  • tidy-IBM-B.exe - my slot2 branch - 3 flags, block parser
  • then others - B-B, block/block, IB-B, add inline, IBM-I... etc

Then used each version of tidy, on each of my samples... the best run is using my IBM-B version... your I-I fails, creates bad, incompatible, html output, on several samples...

Perhaps the most important is 848-4, which is based on the working, online sample, at MDN/web-components ... with the addition of a config -

// issue 848-4
// new-blocklevel-tags: element-details
custom-tags: blocklevel
fix-style-tags: no

It also re-open the addition of the fix-style-tags option, but that can be a separate issue...

So I am now convinced the above patch needs to be applied to this PR... then feel it is good to go... at least after maybe a few tweaks, time, more tests...

Look forward to further feedback, comment, patches, PR, etc on this... thanks...

This is still suboptimal since ParseBlock will make it so that <slot>
always expects "Flow content", whereas the spec says that it should
have a Transparent content model.

In practice, it should allow all the cases that the spec allows for, but
it will also allow some cases that the spec does not allow. Notably, if
a <slot> tag is found in a Phrasing content (an inline context in
libtidy lingo), it will wrongly let Flow content (block tags in libtidy
lingo), whereas it shouldn't. But all in all, it's a good compromise.
@lhchavez
Copy link
Contributor Author

You suggest CM_INLINE in libtidy can be equated toPhrasing content in MDN...no, no, no...

I can not find slot in their list... lost... but that is not the point... that is MDN...

welp, that was my bad. i should have linked the canonical W3C definition of Phrasing content, which does contain the <slot> tag. MDN is a great resource, but it can go out of sync every now and then.

And I'm sorry if I sounded like I was suggesting that CM_INLINE was equal to the Phrasing content concept. I realize they are slightly different, but it's the closest that is available at the moment. At least, it's very close conceptually IIUC.

Presently libtidy has NO concept of Phrasing content... maybe it could be added... somehow... if it can be defined...

IIUC the definifion is "if it is contained in the canonical Phrasing content tag list".

Both ParseBlock and ParseInline are wrong in this case

You further suggest a ParseTransparent, based on ParseBlock, with some TODO notes, but at the moment the PR uses ParseInline... an important choice...

Errr, sorry, I got a bit confused with the ParseBlock bit, since I was thinking about <template>. For <slot>, the W3C spec says that it has a Transparent content model, which means that it inherits the content model of the container tag.

Here's where it gets tricky. <slot> can only be found in places where Phrasing content is allowed, but since Phrasing content is itself Flow content, this is allowed by the spec:

<section>  <!-- this is Sectioning content, that has a Flow content model -->
  <slot>...</slot> <!-- this now has a Flow content model, so ParseBlock is more correct -->
  <p>  <!-- this is Flow content, that has a Phrasing content model -->
    <slot>...</slot> <!-- this now has a Phrasing content model, so ParseInline is more correct -->
  </p>
</section>

Is what you suggest based on performance on edge cases... that smacks of trial and error... all for that... yeah ;=))

yeah :/ as I mentioned, both options are suboptimal and break some cases, as outlined above.

What happens

What happens if the flag is say, CM_INLINE|CM_BLOCK[|CM_MIXED], and the parser ParseBlock is used...

Orginally I made the decision of using ParseInline on a less exhaustive list of examples, but it looks like ParseBlock should let all the cases that the spec allows at the expense of also allowing things that the spec does not allow. But I guess it's an okay tradeoff.

RE: template... no ready-to-use solution to parse arbitrary HTML ...

Here, I tried to point to some earlier - turn off tidy - discussion... in general, it seems the consensis is - This HTML Tidy - what do you want it to do?

What I'm most interested in is the automatic rewriting feature of tidy, which we would lose by turning it off :(

@geoffmcl
Copy link
Contributor

geoffmcl commented Jan 2, 2020

@lhchavez thank you for the additional commit 99b6c660 ... this looks good...

Have not yet had a chance to update, rebuild, and test... etc... will try to get to that soonest... but the PR looks good...

I can see you are still trying to tie together the two sets of logic, W3C/MDM - Flow/Phrasing vs libTidy - ParseBlock/ParseInline... and do not quite agree with all the comments, in your sample... but, no problem...

It is important to have a reasonable sample set... and more welcome... actual use cases...

So along with the other 7, in my repo, in_848.html in_848-1.html in_848-11.html in_848-12.html in_848-2.html in_848-3.html in_848-4.html, added yours, as in_848-5.html... tks...

But am not sure what extra it adds... it is passed by both tidy-I-I.exe and the tidy-IBM-B.exe... so does not sway the argument either way... ParseBlock vs ParseInline, in tidy terms...

So, perhaps it is agreed, as it stands at the moment, if merged, libTidy can be too liberal with its acceptance of the tag slot... but maybe that could be left to a new issue, being raised later...

There is also the pending problem with the style tag, the fix-style-tags option, in in_848-4.html, but as stated, that can also be a separate issue...

It seems downright wrong to move style to the head, in this instance... aside from also causing bad rendering...

I don't exactly understand to turn off the automatic rewriting feature of tidy... you will have to explain this more... if you want to persue this...

So after I get around to a bit more testing, I think this is a good addition, to libTidy... thanks...

Look forward to further feedback, comment, patches, PR, etc on this... thanks...

@geoffmcl
Copy link
Contributor

@lhchavez sorry for the long delay on this... I have subsequently changed PCs, and am in the process of setting up the testing of this again...

At least one quick change in the PR, is where we add the TidyAttr_SLOT enum, in tidyenum.h! The list looks alphabetic, but read #851... The SO version of the library would need to be bumped almost everytime such a new option is added...

As further suggested there, I think the wording in OPTIONS.md, ATTRIBUTES.md, and probably others, needs to be reverted to state something like a new option must be added just before the last... ... maybe @balthisar could help with this?

Meantime, I will try to get back to testing slot... it is an important addition to libtidy...

Look forward to further feedback, comments, other patches, PR, etc, on this... thanks...

@geoffmcl
Copy link
Contributor

@lhchavez have build your forks slot branch, with the following patch, in line with #851...

diff --git a/include/tidyenum.h b/include/tidyenum.h
index 543a82e..8fe5f52 100644
--- a/include/tidyenum.h
+++ b/include/tidyenum.h
@@ -988,13 +988,13 @@ typedef enum
   TidyTag_OUTPUT,        /**< OUTPUT */
   TidyTag_PROGRESS,      /**< PROGRESS */
   TidyTag_SECTION,       /**< SECTION */
-  TidyTag_SLOT,          /**< SLOT */
   TidyTag_SOURCE,        /**< SOURCE */
   TidyTag_SUMMARY,       /**< SUMMARY */
   TidyTag_TEMPLATE,      /**< TEMPLATE */
   TidyTag_TIME,          /**< TIME */
   TidyTag_TRACK,         /**< TRACK */
   TidyTag_VIDEO,         /**< VIDEO */
+  TidyTag_SLOT,          /**< SLOT */
 
   N_TIDY_TAGS            /**< Must be last */
 } TidyTagId;
@@ -1144,7 +1144,6 @@ typedef enum
   TidyAttr_SHOWGRIDX,              /**< SHOWGRIDX= */
   TidyAttr_SHOWGRIDY,              /**< SHOWGRIDY= */
   TidyAttr_SIZE,                   /**< SIZE= */
-  TidyAttr_SLOT,                   /**< SLOT= */
   TidyAttr_SPAN,                   /**< SPAN= */
   TidyAttr_SRC,                    /**< SRC= */
   TidyAttr_SRCSET,                 /**< SRCSET= (HTML5) */
@@ -1344,7 +1343,8 @@ typedef enum
   TidyAttr_AS,                     /**< AS= */
    
   TidyAttr_XMLNSXLINK,             /**< svg xmls:xlink="url" */
-   
+  TidyAttr_SLOT,                   /**< SLOT= */
+
   N_TIDY_ATTRIBS                   /**< Must be last */
 } TidyAttrId;

If you agree, appreciate it being added to this PR... thanks...

I built your slot branch, and used it to run your forked regression tests, pr-848 branch, and it all ran fine, passing the new case-848.html...

And of course that case fails using the current 5.7.29, as expected... thanks...

Are there any other tests I need to run? I will look around for some slot usage examples...

So, if the above enum change, in the PR, is agreed, then it seems we are ready to go...

Look forward to further feedback, comments, other patches, PR, etc, on this... thanks...

@geoffmcl geoffmcl merged commit e51cd17 into htacg:next Nov 21, 2020
geoffmcl added a commit that referenced this pull request Nov 21, 2020
geoffmcl added a commit that referenced this pull request Nov 21, 2020
@geoffmcl
Copy link
Contributor

@lhchavez eventually got to it... now merged... thanks...

geoffmcl added a commit that referenced this pull request Nov 24, 2020
Error made when merging LOADING Is. #879, PR #902

And merging SLOT PR #848

Also added a BIG warning over attribute_defs table to try to avoid this
in future.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants