[Suggestion] Anchors for grammar rules? #602

DoctorBracewell · 2022-06-08T17:59:15Z

Regular Expressions have the anchor characters ^ and $, which allow you to anchor the match at a certain position.

I'm currently running into an issue that could be solved with these, described below:

I want to use an arbitrary matching rule (let's say it's ASCII_ALPHA_LOWER+), so there can be any non-zero number of consecutive lowercase letters, but I want the result to not be equal to a specific string (say, "abc").
In regex, this can be achieved by combining a negative lookahead with the anchor characters: ^(?!abc$)[a-z]+. Because the $ anchor limits the negative lookahead to the exact string abc, a longer string like abcd is still matched.

This is important when, for example, defining "identifiers" in languages. You may want to define an identifier as a sequence of any characters that isn't a string from a set of reserved keywords.

Please correct me if I'm wrong and point out any potential methods of doing this with pest's grammar syntax, but currently I don't think this is possible because pest does not have a way of matching the "end of a rule", so I think these anchor characters could be a good addition.

CAD97 (Maintainer) · 2022-06-08T18:42:10Z

That regex doesn't do what you want it to do: https://regex101.com/r/W2UjRZ/1

The way to accomplish "identifier which is not a keyword" in pest is to use pest's lookahead functionality.

list = { (ident | kw)* }

WHITESPACE = { PATTERN_WHITE_SPACE }

kw = @{ "abc" ~ !ASCII_ALPHA }
ident = @{ !kw ~ ASCII_ALPHA+ }

input:

ab abc abcd

output:

- list
  - ident: "ab"
  - WHITESPACE: " "
  - kw: "abc"
  - WHITESPACE: " "
  - ident: "abcd"

Here kw and ident are mutually exclusive rules, so it doesn't matter which order you check them in.
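
To make the lookahead logic concrete, here is a hand-written Rust sketch of what the two rules check at a given input position. It does not use pest and is not how pest implements them; the names alpha_run, match_kw and match_ident are hypothetical, and the sketch only handles ASCII letters.

```rust
/// Length of the run of ASCII letters at the start of `input`.
fn alpha_run(input: &str) -> usize {
    input.bytes().take_while(|b| b.is_ascii_alphabetic()).count()
}

/// Mirrors `kw = @{ "abc" ~ !ASCII_ALPHA }`: the literal "abc",
/// not followed by another ASCII letter.
fn match_kw(input: &str) -> Option<&str> {
    let rest = input.strip_prefix("abc")?;
    if alpha_run(rest) > 0 {
        return None; // "abc" is only the prefix of a longer word
    }
    Some(&input[..3])
}

/// Mirrors `ident = @{ !kw ~ ASCII_ALPHA+ }`: fail if `kw` matches here,
/// otherwise take one or more ASCII letters.
fn match_ident(input: &str) -> Option<&str> {
    if match_kw(input).is_some() {
        return None;
    }
    match alpha_run(input) {
        0 => None,
        n => Some(&input[..n]),
    }
}

fn main() {
    for word in ["ab", "abc", "abcd"] {
        println!("{word}: kw={:?} ident={:?}", match_kw(word), match_ident(word));
    }
}
```

For each of the three words exactly one of the two functions succeeds, which is why the order of the alternatives in list does not matter.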

An alternative that I personally prefer is to keep the guard that stops a keyword from matching the prefix of a longer identifier, but to let the ident rule match reserved words, and then lint or error on the use of reserved words later.
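
A rough sketch of that alternative, in plain Rust rather than pest: the grammar accepts any run of letters as an identifier, and a separate pass rejects reserved words afterwards. RESERVED and check_ident are hypothetical names, and the keyword list contains only the "abc" example from above.

```rust
/// Hypothetical reserved-word list; only "abc" comes from the example above.
const RESERVED: &[&str] = &["abc"];

/// Run after parsing: the grammar already accepted `name` as an identifier,
/// so this only decides whether using it is an error.
fn check_ident(name: &str) -> Result<&str, String> {
    if RESERVED.iter().any(|kw| *kw == name) {
        Err(format!("`{name}` is a reserved word and cannot be used as an identifier"))
    } else {
        Ok(name)
    }
}

fn main() {
    for name in ["ab", "abc", "abcd"] {
        match check_ident(name) {
            Ok(id) => println!("ok: {id}"),
            Err(msg) => println!("error: {msg}"),
        }
    }
}
```

Doing the check in a later pass means the error can name the offending word directly, instead of surfacing as a generic parse failure.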
