Grammar parser does not correctly recognize pug rule #82

Closed
alexr00 opened this issue Jan 25, 2019 · 4 comments · Fixed by #85
Labels: bug (Issue identified by VS Code Team member as probable bug)

Comments

@alexr00
Member

alexr00 commented Jan 25, 2019

Simple example:

li:	custom-link.has-text-primary(to="/")

Grammar is here: https://github.com/davidrios/pug-tmbundle

The li is correctly categorized as entity.name.tag.pug. custom-link is incorrectly categorized as text.pug. If you copy the same grammar into a different editor and use its TextMate engine, custom-link is correctly categorized as entity.name.tag.pug.

In the grammar, this is the rule that is not getting applied correctly:

		"tag_name": {
			"begin": "([#!]\\{(?=.*?\\}))|(\\w(([\\w:-]+[\\w-])|([\\w-]*)))",
			"end": "(\\G(?<!\\5[^\\w-]))|\\}|$",
			"name": "meta.tag.other entity.name.tag.pug",
			"patterns": [
				{
					"begin": "\\G(?<=\\{)",
					"end": "(?=\\})",
					"name": "meta.tag.other entity.name.tag.pug",
					"patterns": [
						{
							"match": "{",
							"name": "invalid.illegal.tag.pug"
						},
						{
							"include": "source.js"
						}
					]
				}
			]
		},

Originally found in microsoft/vscode#65983

@msftrncs
Contributor

As far as I can tell, this rule is working exactly as written.

The rule captures the li and then immediately ends, which appears to be correct because the inner rules do not apply in this case. It does not consume the :.
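A quick way to check the claim that the begin pattern stops before the colon is to run it outside the editor. The sketch below uses Python's `re` module rather than Oniguruma, so it is only an approximation of what the TextMate engine does; the begin pattern happens to use no Oniguruma-only syntax, which is why this works at all. The variable names are made up for illustration.

```python
import re

# 'begin' pattern of the tag_name rule, unescaped from the JSON string.
# Assumption: Python's re behaves like Oniguruma for this particular pattern.
TAG_NAME_BEGIN = re.compile(r"([#!]\{(?=.*?\}))|(\w(([\w:-]+[\w-])|([\w-]*)))")

line = 'li:\tcustom-link.has-text-primary(to="/")'

match = TAG_NAME_BEGIN.match(line)
matched_text = match.group(0)      # text consumed by the begin pattern
next_char = line[match.end()]      # first character left unconsumed

print(repr(matched_text), match.end(), repr(next_char))
```

The match covers only `li` and ends at character position 2, leaving the `:` for whichever rule is tried next.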

The way I read the rules, there is no way this can match, because nothing consumes the colon. If the rule at line 333 (quoted below) didn't match, the syntax would loop around and the next part would be rematched through the rule above ('tag_name').

		"begin": "\\G(?=(#[^\\{\\w-])|[^\\w.#])",
		"end": "$",
		"comment": "Line starting with characters incompatible with tag name/id/class is standalone text.",

Oh, bingo! I have been looking for this exact case. After the begin rule of 'tag_name' matches, the anchor is set at character position 2. This anchor position is not lost when the end rule matches, but the previous rule's anchor position was character 0 (start of string). I have hypothesized for a while that TextMate pushes the anchor position when it pushes a rule and pops it when it pops a rule. In that case, the \G from the rule at line 333 would not match, because the original anchor position of 0 would have been restored and would not equal the current character position of 2. Instead, the rule at line 549, 'complete_tag', would match, which seems to be the expected behavior.
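The two hypotheses can be compared with a toy model. This is not how vscode-textmate or TextMate is implemented; it only emulates `\G` as "current position equals anchor position" (Python's `re` has no `\G`) and applies the lookahead from the rule at line 333. The function and variable names are hypothetical.

```python
import re

LINE = 'li:\tcustom-link.has-text-primary(to="/")'

# Lookahead part of the "standalone text" rule quoted above, without the
# leading anchor, which is emulated separately below.
TEXT_RULE_LOOKAHEAD = re.compile(r"(?=(#[^\{\w-])|[^\w.#])")


def text_rule_matches(line, pos, anchor):
    """Emulate an anchored lookahead: succeed only when pos is the anchor."""
    return pos == anchor and TEXT_RULE_LOOKAHEAD.match(line, pos) is not None


# State after 'tag_name' has matched "li" and its end rule has fired.
pos = 2
anchor_before_push = 0  # anchor in effect when 'tag_name' was pushed
anchor_after_begin = 2  # anchor set by the begin match of 'tag_name'

# Hypothesis A: the anchor set by the begin match is kept after the pop.
kept = text_rule_matches(LINE, pos, anchor_after_begin)
# Hypothesis B: the previous anchor is restored when the rule is popped.
restored = text_rule_matches(LINE, pos, anchor_before_push)

print("anchor kept:", kept, "| anchor restored:", restored)
```

Under hypothesis A the text rule fires on the colon, so the rest of the line becomes text.pug, which is the behavior reported here. Under hypothesis B it does not fire, leaving room for 'complete_tag' to match.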

Can someone confirm this with TextMate? (I don't have a Mac) I had previously mentioned this in #49, @infininight, any input?

@msftrncs
Contributor

msftrncs commented Jan 26, 2019

This isn't the only issue displayed in microsoft/vscode#65983; the other appears to be a different problem.

In that case:

p.
  This is a very long and boring paragraph that spans multiple lines.
  Suddenly there is a #[strong strongly worded phrase] that cannot be
  #[em ignored].

The strong and em tags bleed their scope into the rest of the block.

Trying to find this, I noticed that two rules at one point both have the same match text: a rule at 848, and the rule included at 852, which appears at 764 (inline_pug_text). This issue has been brought up before:

			"begin": "",
			"end": "(?=\\])",

It's an empty begin rule! I hypothesize that TextMate allows these and VS Code TextMate does not, and that it instead causes some issues: at this point I see 6 more rules in the debug output, but 'inline_pug_text' should have been the last (seen below as '- 1017'). The lines below it are actually its 'patterns'. I have run into this before when I commented out a 'begin' block as well.

@@scanNext 30,{30}: | strongly worded phrase] that cannot be\n|
  scanning for
   - -1: (\])
   - 1008: (?<!\\)(#\[)
   - 980: ((?:mixin\s+)|\+)([\w-]+)
   - 1013: (?<!\])(?=[\w.#])|(:\s*)
   - 977: (-|(([a-zA-Z0-9_]+)\s+=))
   - 1004: (!?\=)\s*
   - 1015: \[
   - 1017: \[
   - 1008: (?<!\\)(#\[)
   - 1018: (?=<[^>]*>)
   - 1020: (&)([a-zA-Z0-9]+|#[0-9]+|#x[0-9a-fA-F]+)(;)
   - 1021: [<>&]
   - 965: (?<!\\)[#!]\{(?=.*?\})
   - 967: (?<!\\)[#!]\{(?=[^}]*$)
matched: 31 / 31
  token: | |
      * text.pug
      * text.block.pug
      * inline.pug
  pushing BeginEndRule#1013 @ pug.tmLanguage.json:813 - (?<!\])(?=[\w.#])|(:\s*)

I think VS Code TextMate needs to support the empty but present 'begin' rule, and should also support a non-existent 'end' rule on a present 'begin' rule, as that has been brought up before as well.

EDIT: It appears it already allows non-existent 'end' rules: I saw 'undefined' for the end pattern while working on a syntax where I had forgotten the 'end'.
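For anyone wanting to check a grammar for these two cases before loading it into an engine, a small scan over the JSON is enough. The following is a minimal sketch, assuming the grammar has been loaded into a Python dict (for example with `json.load`). The `SAMPLE` data and its rule names are made up for illustration and are not taken from the pug grammar.

```python
def find_suspect_rules(node, path="grammar"):
    """Yield (path, problem) for rules with an empty 'begin' or no 'end'."""
    if isinstance(node, dict):
        if "begin" in node:
            if node["begin"] == "":
                yield path, "empty begin"
            # begin/while rules legitimately have no 'end'.
            if "end" not in node and "while" not in node:
                yield path, "begin without end"
        for key, value in node.items():
            yield from find_suspect_rules(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_suspect_rules(item, f"{path}[{index}]")


# Hypothetical grammar fragment covering both cases plus one ordinary rule.
SAMPLE = {
    "repository": {
        "empty_begin_rule": {"begin": "", "end": "(?=\\])", "patterns": []},
        "missing_end_rule": {"begin": "\\["},
        "ordinary_rule": {"begin": "\\w+", "end": "$"},
    }
}

findings = list(find_suspect_rules(SAMPLE))
for rule_path, problem in findings:
    print(f"{rule_path}: {problem}")
```

This only flags candidates; whether a given engine accepts or mishandles them still has to be confirmed against that engine.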

@neilsoult

Is anyone else still seeing the incorrect formatting? I'm on Version 1.34.0 (1.34.0) and this doesn't appear to be fixed for me.

@msftrncs
Contributor

msftrncs commented May 30, 2019 via email
