[v1] Fix unary minus with numeric literal parsing; add data exception for unary neg overflow #1718

Merged: alancai98 merged 4 commits into main from change-unary-minus-parsing on Jan 17, 2025

Conversation

alancai98
Member

Relevant Issues

Description

  • Fixes parsing of a unary minus followed by a numeric literal: the pair is now parsed as a single signed int literal rather than a unary minus applied to an unsigned int literal
    • Previously, it was impossible to parse -2147483648 as an INT4
  • Adds a data exception for unary negation overflow
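
To illustrate why the sign has to be part of the literal, here is a minimal Java sketch. It is not the parser code from this PR; `LiteralTyping` and `narrowestType` are hypothetical names, and the INT/BIGINT labels simply stand for 32-bit and 64-bit integers.

```java
class LiteralTyping {
    // Hypothetical helper: returns the narrowest integer type that can hold the literal text.
    static String narrowestType(String literal) {
        try {
            Integer.parseInt(literal); // succeeds only if the value fits in 32 bits
            return "INT";
        } catch (NumberFormatException e) {
            Long.parseLong(literal); // throws NumberFormatException if 64 bits are not enough either
            return "BIGINT";
        }
    }

    public static void main(String[] args) {
        // The unsigned digits alone exceed Integer.MAX_VALUE (2147483647).
        System.out.println(narrowestType("2147483648"));  // prints BIGINT
        // With the sign attached, the value is exactly Integer.MIN_VALUE.
        System.out.println(narrowestType("-2147483648")); // prints INT
    }
}
```

If the literal is typed before the minus is applied, 2147483648 can never be an INT, so -2147483648 ends up as a negated BIGINT.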

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@alancai98 alancai98 requested a review from johnedquinn January 16, 2025 00:28
@alancai98 alancai98 self-assigned this Jan 16, 2025

github-actions bot commented Jan 16, 2025

CROSS-ENGINE-REPORT ❌

              BASE (LEGACY-V0.14.8)   TARGET (EVAL-D60B65C)   +/-
% Passing     89.67%                  95.10%                  5.43% ✅
Passing       5287                    5607                    320 ✅
Failing       609                     51                      -558 ✅
Ignored       0                       238                     238 🔶
Total Tests   5896                    5896                    0 ✅

Testing Details

  • Base Commit: v0.14.8
  • Base Engine: LEGACY
  • Target Commit: d60b65c
  • Target Engine: EVAL

Result Details

  • ❌ REGRESSION DETECTED. See Now Failing/Ignored Tests. ❌
  • Passing in both: 2638
  • Failing in both: 17
  • Ignored in both: 0
  • PASSING in BASE but now FAILING in TARGET: 7
  • PASSING in BASE but now IGNORED in TARGET: 109
  • FAILING in BASE but now PASSING in TARGET: 180
  • IGNORED in BASE but now PASSING in TARGET: 0

Now FAILING Tests ❌

The following 7 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

  1. repeatingDecimal, compileOption: PERMISSIVE
  2. repeatingDecimalHigherPrecision, compileOption: PERMISSIVE
  3. subtractionOutOfAllowedPrecision, compileOption: PERMISSIVE
  4. inPredicateWithTableConstructor, compileOption: PERMISSIVE
  5. notInPredicateWithTableConstructor, compileOption: PERMISSIVE
  6. More than one character given for ESCAPE, compileOption: PERMISSIVE
  7. substring invalid quantity, compileOption: PERMISSIVE

Now IGNORED Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

180 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-D60B65C). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Comment on lines +107 to +108
),
// Make sure we parse Integer.MIN_VALUE as an INT rather than BIGINT
Member

Can this work also have a test for the following: --${Integer.MIN_VALUE}. This should fail.

Member

From my reading of the PR, this previously would fail and this might make it "pass" (though that would be wrong). If my assumption is correct, a different approach would be to update our ANTLR file instead.

Member Author

Can this work also have a test for the following: --${Integer.MIN_VALUE}. This should fail.

I assume you mean something like - -2147483648? --${Integer.MIN_VALUE} would be ---2147483648 which is parsed as a comment. Added.
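
As a rough illustration of the comment issue, the sketch below treats `--` as the start of a line comment, which is how the query text above gets swallowed. `LineComment` and `stripLineComment` are hypothetical names, and the helper deliberately ignores `--` inside quoted strings; it is not how the actual lexer is implemented.

```java
class LineComment {
    // Returns the part of a single-line query that precedes a "--" line comment.
    // Simplification (assumption): "--" inside quoted strings is not handled.
    static String stripLineComment(String query) {
        int start = query.indexOf("--");
        return start < 0 ? query : query.substring(0, start);
    }

    public static void main(String[] args) {
        // "--" followed by "-2147483648": the whole input is a comment.
        System.out.println("[" + stripLineComment("---2147483648") + "]");  // prints []
        // A space between the minus signs keeps them as two separate tokens.
        System.out.println("[" + stripLineComment("- -2147483648") + "]");  // prints [- -2147483648]
    }
}
```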

Member Author

From my reading of the PR, this previously would fail and this might make it "pass" (though that would be wrong). If my assumption is correct, a different approach would be to update our ANTLR file instead.

The query - -2147483648 would actually pass previously and return 2147483648. It was parsed as two nested unary minuses applied to an unsigned INT_NUM literal 2147483648 (i.e. the same thing as -(-(2147483648))). In the plan, this became two nested unary negs over a big int value of 2147483648, which evaluates to 2147483648.

With this PR's change, - -2147483648 now gives the data exception. It is parsed as a single unary minus applied to a signed INT_NUM literal -2147483648 (i.e. the same thing as -(-2147483648)). In the plan, this becomes a unary neg over an int value of -2147483648, which gives an error when evaluated.
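
The two behaviors can be sketched in plain Java. This is only an analogy, assuming 32-bit `int` stands in for INT and 64-bit `long` for big int, with `Math.negateExact` playing the role of the overflow-checked unary neg; `UnaryNegSketch`, `evalBefore`, and `evalAfter` are hypothetical names, not engine code.

```java
class UnaryNegSketch {
    // Before this PR: -(-(2147483648)) over a 64-bit literal.
    static long evalBefore() {
        long literal = 2147483648L;
        return Math.negateExact(Math.negateExact(literal));
    }

    // After this PR: -(-2147483648) over a 32-bit literal.
    static int evalAfter() {
        int literal = Integer.MIN_VALUE; // -2147483648
        return Math.negateExact(literal); // throws ArithmeticException: the result does not fit in an int
    }

    public static void main(String[] args) {
        System.out.println(evalBefore()); // prints 2147483648
        try {
            System.out.println(evalAfter());
        } catch (ArithmeticException e) {
            // Stand-in for the data exception described above.
            System.out.println("overflow on unary negation");
        }
    }
}
```

Without an overflow check, plain `-Integer.MIN_VALUE` in Java wraps around to `Integer.MIN_VALUE` again, which is the kind of silent result a data exception avoids.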


For the other comment about the ANTLR rule, I played around with this a while back with the literal modeling and couldn't seem to get all the cases to parse correctly. For now, modifying it in the ANTLR visitor seems good enough, but I added a TODO to look into it more in the future.

Member

Apologies. - - - 2147483648

Member Author

@alancai98 alancai98 Jan 16, 2025

- - - 2147483648

Error behavior is the same as - - 2147483648

  • both would fail in this PR
  • previously, they would succeed. - - - 2147483648 => - 2147483648 and - - 2147483648 => 2147483648

I can add another test for regressions though

Member

- - - 2147483648 = -(-(-2147483648)), which should fail. With the original commit, I believe this wouldn't fail. With your latest commit, I think it will fail 👍. Though, a test would give me confidence.

Member Author

Yeah, good catch. The original commit had a bug: the - sign should be added to a literal numeric string only if no prior - was already added.

Added the triple - test in 7df4329 (#1718).
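
A minimal sketch of that rule, under stated assumptions: the input is modeled as a count of leading - tokens plus an unsigned digit string, only the innermost - is folded into the literal, and every remaining - stays a checked unary negation. `MinusFolding` and `evalInt` are hypothetical names; the sketch covers only INT-sized values and is not the visitor code from this PR.

```java
class MinusFolding {
    // Hypothetical model of a chain of '-' tokens followed by an unsigned digit string.
    static int evalInt(int minusCount, String digits) {
        boolean fold = minusCount > 0;
        // Fold at most one '-' into the literal text before typing it as an int.
        int value = Integer.parseInt(fold ? "-" + digits : digits);
        int remaining = fold ? minusCount - 1 : 0;
        for (int i = 0; i < remaining; i++) {
            value = Math.negateExact(value); // throws ArithmeticException on overflow
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(evalInt(1, "2147483648")); // prints -2147483648
        for (int minuses = 2; minuses <= 3; minuses++) {
            try {
                System.out.println(evalInt(minuses, "2147483648"));
            } catch (ArithmeticException e) {
                System.out.println(minuses + " minus signs: overflow");
            }
        }
    }
}
```

In this model both - - 2147483648 and - - - 2147483648 fail on the first remaining negation, matching the behavior described in the thread, while a single minus yields Integer.MIN_VALUE.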

@alancai98 alancai98 requested a review from johnedquinn January 16, 2025 22:30
johnedquinn
johnedquinn previously approved these changes Jan 16, 2025
Member

@johnedquinn johnedquinn left a comment

Nice work 🥇

@alancai98 alancai98 merged commit aa1589c into main Jan 17, 2025
14 checks passed
@alancai98 alancai98 deleted the change-unary-minus-parsing branch January 17, 2025 19:42