Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add operator node to AST and parser #1499

Merged
merged 2 commits into from
Jul 15, 2024
Merged

Add operator node to AST and parser #1499

merged 2 commits into from
Jul 15, 2024

Conversation

alancai98
Copy link
Member

@alancai98 alancai98 commented Jul 2, 2024

Relevant Issues

Description

  • Adds an operator node to AST
  • Adds additional parsing rules for special character sequences that can become operators

Other Information

  • Updated Unreleased Section in CHANGELOG: [NO]

    • v1 branch
  • Any backward-incompatible changes? [YES]

    • Yes but on v1 branch
  • Any new external dependencies? [NO]

  • Do your changes comply with the Contributing Guidelines
    and Code Style Guidelines? [YES]

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@alancai98 alancai98 self-assigned this Jul 2, 2024
Copy link

github-actions bot commented Jul 2, 2024

Conformance comparison report-Cross Engine

Base (eval) legacy +/-
% Passing 87.12% 92.60% 5.48%
✅ Passing 5101 5421 320
❌ Failing 754 433 -321
🔶 Ignored 0 0 0
Total Tests 5855 5854 -1
Number passing in both: 4898

Number failing in both: 230

Number passing in eval engine but fail in legacy engine: 203

Number failing in eval engine but pass in legacy engine: 524
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️
The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.
524 test(s) were failing in eval but now pass in legacy. Before merging, confirm they are intended to pass.
The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Conformance comparison report-Cross Commit-EVAL

Base (35271b1) 9401231 +/-
% Passing 87.12% 87.12% 0.00%
✅ Passing 5101 5101 0
❌ Failing 754 754 0
🔶 Ignored 0 0 0
Total Tests 5855 5855 0
Number passing in both: 5101

Number failing in both: 754

Number passing in Base (35271b1) but now fail: 1

Number failing in Base (35271b1) but now pass: 1
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️. The following test(s) were previously passing but now fail:

Click here to see
  • Example 6 — Value Coercion, compileOption: LEGACY
The following test(s) were previously failing but now pass. Before merging, confirm they are intended to pass:
Click here to see
  • Example 6 — Value Coercion, compileOption: LEGACY

Conformance comparison report-Cross Commit-LEGACY

Base (35271b1) 9401231 +/-
% Passing 92.60% 92.60% 0.00%
✅ Passing 5421 5421 0
❌ Failing 433 433 0
🔶 Ignored 0 0 0
Total Tests 5854 5854 0
Number passing in both: 5421

Number failing in both: 433

Number passing in Base (35271b1) but now fail: 0

Number failing in Base (35271b1) but now pass: 0

@alancai98 alancai98 force-pushed the operator-node-ast branch from c2846a1 to df1752b Compare July 2, 2024 23:29
@alancai98 alancai98 force-pushed the operator-node-ast branch from df1752b to 25408e4 Compare July 4, 2024 00:18
@alancai98 alancai98 changed the title [wip] Add operator node to AST and parser Add operator node to AST and parser Jul 4, 2024
@alancai98 alancai98 linked an issue Jul 4, 2024 that may be closed by this pull request
@alancai98 alancai98 marked this pull request as ready for review July 4, 2024 00:20
exprNot(expr)
}

private fun checkForInvalidTokens(op: ParserRuleContext) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added additional check for whitespace + comments between operator tokens. More context in this issue comment #1498 (comment)

@alancai98 alancai98 requested a review from RCHowell July 9, 2024 18:39
Comment on lines 622 to 627
otherOp
: OPERATOR
| AMPERSAND
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove AMPERSAND and just have op=OPERATOR on L630?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current ANTLR lexing and parsing logic is a bit tricky due to how we currently lex certain special characters that are also used by GPML match expressions. Currently, a single & gets lexed as AMPERSAND, which can have different meanings depending on its usage

  1. as an operator (e.g. 1 & 2)
  2. as part of a GPML match expression (e.g. MATCH (x:Label&OtherLabel)).

So this additional branch in the parsing of otherOp allows us to also define a single & as an operator.

I think we need to define the GPML patterns within the MATCH clause as a separate lexing mode (more on lexer modes). This will allow us to distinguish between & lexed as an operator node in normal expression contexts versus & lexed as part of a GPMLMATCH expression. For now, I'll cut an issue (#1512) and link in the lexer + parser.

Comment on lines 357 to 382
OPERATOR
// may not end with + or -
: OpBasic+ OpBasicEnd
// must include at least one of OpSpecial to end w/ anything
| (OpBasic | OpSpecial)* OpSpecial (OpBasic | OpSpecial)*
;

fragment OpBasic
: [+*=] // TODO support `<` and `>`?
// comments are not matched
| '-' {_input.LA(1) != '-'}?
| '/' {_input.LA(1) != '*'}?
;

fragment OpBasicEnd
: [*/=] // TODO support `<` and `>`?
;
fragment OpSpecial
: [~@#%^?] // TODO support backtick (`)?
// graph patterns are not matched
| '|' {_input.LA(1) != '!'}?
| '!' {_input.LA(1) != '%'}?
| '&' {_input.LA(1) != '%'}?
;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my initial read, this looks pretty good. It covers

  • Don't allow +, - at end because of number signs
  • Don't allow -- or /* because comments
  • Don't allow GPML patterns

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that's the idea for these ANTLR semantic predicates. we can always iterate in the future to add more restrictions on operators.

Comment on lines +158 to +160
"+" -> "pos"
"-" -> "neg"
else -> error("unsupported unary op $symbol")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about not !?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of ! as a SQL unary operator

@alancai98 alancai98 force-pushed the operator-node-ast branch from 25408e4 to 5ad1973 Compare July 15, 2024 20:06
@alancai98 alancai98 requested a review from RCHowell July 15, 2024 20:07
@alancai98 alancai98 merged commit ff03b0a into v1 Jul 15, 2024
14 checks passed
@alancai98 alancai98 deleted the operator-node-ast branch July 15, 2024 22:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[v1] Support operator node within AST
2 participants