Clamp heading level to 6 when outputting HTML #176

Open: wants to merge 1 commit into base: main

GarrettAlbright

Given the following input:

####### Hello!

The Lua code will gladly produce:

<section id="Hello">
<h7>Hello!</h7>
</section>

The JavaScript code does as well. <h8> and so on are also possible, with no maximum that I can find in either the code or the standard.

Only <h1> through <h6> are valid HTML. It may be useful for Djot to support heading levels above 6 when it is used to produce documents in formats that allow them, but to reduce ambiguity in the spec I'd suggest picking some maximum, and I think 6 is a reasonable one. That said, all this PR does is tweak html.lua so that when a heading level higher than 6 is encountered, an <h6> is output instead.
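
To illustrate the clamping idea, here is a minimal sketch in TypeScript rather than Lua. It is not the actual change to html.lua: the names `renderClampedHeading` and `MAX_HTML_HEADING_LEVEL` are hypothetical, the `<section>` wrapper is left out, and the heading text is assumed to be already escaped.

```typescript
// Hypothetical sketch of clamping heading levels in an HTML renderer.
// Not the actual html.lua patch; all names here are illustrative.
const MAX_HTML_HEADING_LEVEL = 6;

function renderClampedHeading(level: number, text: string): string {
  // HTML defines only <h1> through <h6>, so cap the level at 6
  // (and at 1 from below, as a defensive measure).
  const clamped = Math.min(Math.max(level, 1), MAX_HTML_HEADING_LEVEL);
  // `text` is assumed to be already-escaped HTML.
  return `<h${clamped}>${text}</h${clamped}>`;
}

console.log(renderClampedHeading(7, "Hello!")); // prints <h6>Hello!</h6>
```

With this behavior a level-7 heading becomes indistinguishable from a level-6 one in the output.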

Unfortunately I will not be submitting a PR for djot.js as I do not have a setup for transpiling and testing TypeScript, but I suspect a patch for that repo will be as trivial as this one.

@vassudanagunta
Contributor

vassudanagunta commented Dec 28, 2022

I don't think silently treating it as <h6> is a good idea, because the intent might have been that the heading be a subheading of the prior H6, not a sibling, and this incorrect "correction" would likely go unnoticed if the doc is long (as is likely in this case).

If djot doesn't support warning messages, it would be better to output the text as-is (i.e. ####### Hello!), which is very noticeable. Even leaving it as <h7> would be better, as it renders as a plain block of text that would appear within the previous H6 heading, and would be reported by any HTML validation as bad HTML. But I think outputting ####### Hello! or <p>####### Hello!</p> would be best.

@jgm
Owner

jgm commented Dec 29, 2022

This raises the question whether we should put the check at the parser level, so that seven #s just don't form a heading node, instead of dealing with it in the renderer. It could be that other formats allow > 6 levels of headings, but I'm not sure that's a reason by itself...
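
A rough sketch of what a parser-level check could look like, in TypeScript. This is not djot's actual parser: the `Block` type, the regular expression, and `parseLine` are assumptions made for illustration, and real block parsing involves far more than a single line.

```typescript
// Hypothetical sketch: enforcing the heading limit in the parser.
// Not djot's real parser; the types and regex are assumptions.
type Block =
  | { tag: "heading"; level: number; text: string }
  | { tag: "para"; text: string };

const MAX_LEVEL = 6;

function parseLine(line: string): Block {
  // One or more '#' characters, then whitespace, then the heading text.
  const m = /^(#+)\s+(.*)$/.exec(line);
  const hashes = m?.[1] ?? "";
  const text = m?.[2] ?? "";
  if (hashes.length >= 1 && hashes.length <= MAX_LEVEL) {
    return { tag: "heading", level: hashes.length, text };
  }
  // No marker, or more than six '#'s: the line stays ordinary text,
  // so the renderer never sees a heading node above level 6.
  return { tag: "para", text: line };
}

console.log(parseLine("## Hi"));
console.log(parseLine("####### Hello!"));
```

Under this approach seven #s fall through to a paragraph whose text still contains the literal # characters, which makes the problem visible in the rendered output.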

@clarfonthey

clarfonthey commented Dec 29, 2022

I personally would imagine that anything using more than six levels of headers is probably generating those headers outside of a language like djot and therefore djot should be fine limiting itself in the way HTML has, since most people will be limited by HTML otherwise.

So, whether it's reasonable to have that many headers wouldn't have to be addressed: just whether it's reasonable for someone using djot to have that many, and I think the answer is probably no.

Like maybe some physical books might be doing that, but will you be writing your entire book in djot? Probably not. If it's a website, something like that will be split into multiple pages.

@jgm
Owner

jgm commented Dec 29, 2022

Pandoc will gladly treat ######### hello as a heading, but it will render in HTML as <p class="heading">.

@bpj

bpj commented Dec 29, 2022

My immediate thought when I saw this was "hey, djot isn't meant to be HTML-centric!" IOW I think the parser should probably produce heading nodes at any level since someone's output format might support that. I'm not saying that it's likely but HTML's cutoff at six is nonetheless arbitrary; some formats cut off at a lower level (e.g. Perl Pod at four) and I think that in principle the door should be left open for higher levels.

The/an HTML renderer is another matter: it should do something when encountering a heading higher than 6. I think I'm most in favor of something like <p class="high-heading heading-7"> on the assumption that the author may have intended the heading and may want to style it, but if it was unintended a CSS rule like .high-heading { background-color: red; } will make it easy to spot. Another possibility is to render sections > 6 as possibly nested definition lists, again with classes like <dl class="heading-level-7"> and <dt class="heading heading-7"> to allow styling both of intentional cases and for detection of unintentional cases. Of course programmatic HTML validation might also detect these tag-class combinations, and a renderer might have an option like --high-heading=p|dl|warn|error.
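
As a sketch only, a renderer with such an option might look like the TypeScript below. Nothing here is an existing djot API: `HighHeadingPolicy` and `renderHighHeading` are hypothetical names, the `dl` variant is left out for brevity, and the heading text is assumed to be already escaped.

```typescript
// Hypothetical sketch of a configurable policy for headings above level 6.
// Policy names mirror the suggested --high-heading option ("dl" omitted).
type HighHeadingPolicy = "p" | "warn" | "error";

function renderHighHeading(
  level: number,
  text: string,
  policy: HighHeadingPolicy,
  warnings: string[] = [],
): string {
  if (level <= 6) {
    return `<h${level}>${text}</h${level}>`;
  }
  if (policy === "error") {
    throw new Error(`heading level ${level} exceeds 6`);
  }
  if (policy === "warn") {
    // Record a warning but still produce output.
    warnings.push(`heading level ${level} exceeds 6: ${text}`);
  }
  // Styled paragraph: easy to target in CSS, whether intended or not.
  return `<p class="high-heading heading-${level}">${text}</p>`;
}

console.log(renderHighHeading(7, "Hello!", "p"));
// prints <p class="high-heading heading-7">Hello!</p>
```

The caller passes in the `warnings` array, so collecting messages needs no global state.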

@uvtc
Contributor

uvtc commented Dec 29, 2022

I like the idea of djot producing <p class="heading-level-7">hey</p>. If the writer isn't seeing the output they expected from ####### hey, they can always check the html source output.

@marrus-sh

WAI-ARIA provides an aria-level attribute for indicating heading levels (among other things): https://www.w3.org/TR/wai-aria/#aria-level

<div role="heading" aria-level="7"> would be semantic and not too hard to match with CSS.
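
A minimal TypeScript sketch of that fallback, assuming a hypothetical `renderAriaHeading` helper (not part of djot or djot.js) and already-escaped heading text:

```typescript
// Hypothetical sketch: use native headings up to level 6 and an
// ARIA heading role beyond that. Not part of any djot implementation.
function renderAriaHeading(level: number, text: string): string {
  if (level <= 6) {
    return `<h${level}>${text}</h${level}>`;
  }
  // role="heading" with aria-level conveys the level to assistive
  // technology even though no <h7> element exists.
  return `<div role="heading" aria-level="${level}">${text}</div>`;
}

console.log(renderAriaHeading(7, "Hello!"));
// prints <div role="heading" aria-level="7">Hello!</div>
```

An attribute selector such as `[role="heading"][aria-level="7"]` would then match these elements in CSS.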


7 participants