Adds support for aggregations #1367

Merged
merged 6 commits into from
Feb 26, 2024

Member

@johnedquinn johnedquinn commented Feb 9, 2024

Relevant Issues

  • None

Description

  • Adds support for aggregations, reusing much of the existing functionality from the PlannerPipeline.
  • Adds support for COLL_AGGs by building on the aggregations above. We take advantage of argument coercion and expect only BAGs as the first parameter. Because we don't support parameterized types yet, these functions return ANY, so the dynamic case is the only case.
  • Takes into account the direct use of grouping expressions in the projection list.
  • Fixes the NormalizeGroupBy normalization pass.
  • Note that the SPI can now handle qualified aggregations. At the AST-to-Plan level, however, we limit aggregations to the SQL:1999 ones (MAX, MIN, etc.). We can change this whenever we want.
  • Fixes dynamic dispatch as well.
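
As a rough illustration of how a grouped aggregation and its COLL_ counterpart can share one accumulator, here is a minimal Kotlin sketch. Every name in it (Accumulator, SumAccumulator, groupedSum, collSum) is hypothetical and chosen for this example; it is not the partiql-lang-kotlin API, and it models values as nullable Long rather than PartiQL values.

```kotlin
// Hypothetical sketch; none of these names come from partiql-lang-kotlin.

// An accumulator consumes values one at a time and yields a result at the end.
interface Accumulator<T, R> {
    fun next(value: T)
    fun result(): R
}

class SumAccumulator : Accumulator<Long?, Long?> {
    private var sum: Long? = null

    override fun next(value: Long?) {
        if (value == null) return // SQL aggregates skip NULL inputs
        sum = (sum ?: 0L) + value
    }

    // Stays null when no non-null value was seen (SUM over empty input is NULL).
    override fun result(): Long? = sum
}

// GROUP BY-style aggregation: one accumulator per distinct key.
fun <K> groupedSum(rows: List<Pair<K, Long?>>): Map<K, Long?> {
    val accumulators = LinkedHashMap<K, SumAccumulator>()
    for ((key, value) in rows) {
        accumulators.getOrPut(key) { SumAccumulator() }.next(value)
    }
    return accumulators.mapValues { it.value.result() }
}

// COLL_SUM-style function: the same accumulator applied to a single bag.
fun collSum(bag: Collection<Long?>): Long? {
    val acc = SumAccumulator()
    bag.forEach(acc::next)
    return acc.result()
}

fun main() {
    println(groupedSum(listOf("x" to 1L, "y" to 10L, "x" to 2L, "y" to null))) // {x=3, y=10}
    println(collSum(listOf(1L, 2L, 3L))) // 6
}
```

The point of the sketch is only that the collection variant needs no separate aggregation logic: it feeds the elements of its single BAG argument through the same accumulator a grouped aggregation would use.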

Unresolved Questions

  • I have a question about the return types for SUM/AVG. I kept the behavior that was already in place, but SQL:1999 says that, for SUM/AVG specifically, we can choose the precision and scale (with some constraints):

If SUM or AVG is specified, then:
a) DT shall be a numeric type or an interval type.
b) If SUM is specified and DT is exact numeric with scale S, then the declared type of the result is exact numeric with implementation-defined precision and scale S.
c) If AVG is specified and DT is exact numeric, then the declared type of the result is exact numeric with implementation-defined precision not less than the precision of DT and implementation-defined scale not less than the scale of DT.
d) If DT is approximate numeric, then the declared type of the result is approximate numeric with implementation-defined precision not less than the precision of DT.
e) If DT is interval, then the declared type of the result is interval with the same precision as DT.
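
To make the rules above concrete, here is a small Kotlin sketch of how a declared result type could be derived from DT. The type model (NumericType, Exact, Approximate, Interval) is hypothetical and is not the PartiQL type system, and the constants MAX_PRECISION and MIN_AVG_SCALE are assumed stand-ins for the choices the standard leaves implementation-defined.

```kotlin
// Hypothetical type model for illustration; rule (a) is encoded by the type itself.
sealed interface NumericType
data class Exact(val precision: Int, val scale: Int) : NumericType
data class Approximate(val precision: Int) : NumericType
data class Interval(val precision: Int) : NumericType

// Assumptions: the standard leaves these implementation-defined.
const val MAX_PRECISION = 38
const val MIN_AVG_SCALE = 6

fun sumResultType(dt: NumericType): NumericType = when (dt) {
    // Rule (b): scale S is preserved; the precision is our (assumed) choice.
    is Exact -> Exact(maxOf(dt.precision, MAX_PRECISION), dt.scale)
    // Rule (d): approximate stays approximate, precision not less than DT's.
    is Approximate -> dt
    // Rule (e): interval keeps the same precision.
    is Interval -> dt
}

fun avgResultType(dt: NumericType): NumericType = when (dt) {
    // Rule (c): precision and scale are each not less than DT's.
    is Exact -> Exact(maxOf(dt.precision, MAX_PRECISION), maxOf(dt.scale, MIN_AVG_SCALE))
    is Approximate -> dt
    is Interval -> dt
}

fun main() {
    println(sumResultType(Exact(10, 2))) // Exact(precision=38, scale=2)
    println(avgResultType(Exact(10, 2))) // Exact(precision=38, scale=6)
}
```

Any other pair of constants that respects the "not less than" constraints would satisfy the quoted rules equally well; which values to pick is the open question here.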

Other Information

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.


github-actions bot commented Feb 9, 2024

Conformance comparison report-Cross Engine

               Base (eval)   legacy   +/-
% Passing      77.52%        92.47%   14.95%
✅ Passing     4510          5380     870
❌ Failing     1308          438      -870
🔶 Ignored     0             0        0
Total Tests    5818          5818     0
Number passing in both: 4341

Number failing in both: 269

Number passing in eval engine but fail in legacy engine: 169

Number failing in eval engine but pass in legacy engine: 1039
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️
The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.
1039 test(s) were failing in eval but now pass in legacy. Before merging, confirm they are intended to pass.

Conformance comparison report-Cross Commit-EVAL

               Base (f1aeb6f)   87c2530   +/-
% Passing      59.42%           77.52%    18.10%
✅ Passing     3457             4510      1053
❌ Failing     2361             1308      -1053
🔶 Ignored     0                0         0
Total Tests    5818             5818      0
Number passing in both: 3455

Number failing in both: 1306

Number passing in Base (f1aeb6f) but now fail: 2

Number failing in Base (f1aeb6f) but now pass: 1055
⁉️ CONFORMANCE REPORT REGRESSION DETECTED ⁉️. The following test(s) were previously passing but now fail:

  • PG_JOIN_04, compileOption: PERMISSIVE
  • PG_JOIN_04, compileOption: LEGACY
1055 test(s) were previously failing but now pass. Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Conformance comparison report-Cross Commit-LEGACY

               Base (f1aeb6f)   87c2530   +/-
% Passing      92.47%           92.47%    0.00%
✅ Passing     5380             5380      0
❌ Failing     438              438       0
🔶 Ignored     0                0         0
Total Tests    5818             5818      0
Number passing in both: 5380

Number failing in both: 438

Number passing in Base (f1aeb6f) but now fail: 0

Number failing in Base (f1aeb6f) but now pass: 0

@johnedquinn johnedquinn force-pushed the partiql-eval-agg branch 2 times, most recently from 5ee9c0b to 3a4a363 Compare February 12, 2024 19:56

codecov-commenter commented Feb 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (partiql-eval@f1aeb6f).

Additional details and impacted files
@@               Coverage Diff               @@
##             partiql-eval    #1367   +/-   ##
===============================================
  Coverage                ?   50.32%           
  Complexity              ?     1045           
===============================================
  Files                   ?      165           
  Lines                   ?    13129           
  Branches                ?     2452           
===============================================
  Hits                    ?     6607           
  Misses                  ?     5862           
  Partials                ?      660           
Flag Coverage Δ
CLI 13.77% <ø> (?)
EXAMPLES 80.28% <ø> (?)
LANG 54.71% <ø> (?)

Flags with carried forward coverage won't be shown.


@johnedquinn johnedquinn changed the base branch from partiql-eval to partiql-eval-correlated February 15, 2024 21:26
Base automatically changed from partiql-eval-correlated to partiql-eval February 19, 2024 22:32
@johnedquinn johnedquinn changed the base branch from partiql-eval to main February 21, 2024 19:02
@johnedquinn johnedquinn changed the base branch from main to partiql-eval February 21, 2024 19:02
@johnedquinn johnedquinn force-pushed the partiql-eval-agg branch 2 times, most recently from f945f87 to 45315e4 Compare February 21, 2024 19:06
Adds support for COLL_AGGs
return userInputPath.steps.size + actualAbsolutePath.size - pathSentToConnector.steps.size
}

@OptIn(FnExperimental::class, PartiQLValueExperimental::class)
Member Author

From here till EOF, this is copied from PathResolverAgg.

Member Author

Essentially this whole file comes from org.partiql.lang.eval.physical.operators.Accumulator.kt

@johnedquinn johnedquinn changed the title [DRAFT] Adds support for aggregations Adds support for aggregations Feb 21, 2024
@johnedquinn johnedquinn marked this pull request as ready for review February 21, 2024 23:09
Contributor

@yliuuuu yliuuuu left a comment

Consider the query:

SELECT * FROM <<{'a': 1, 'b': 2}, {'a': 2, 'b': 3}>> AS tbl GROUP BY a

With the current implementation, this PR throws an evaluation error.

The suspected cause is the NormalizeSelect pass, which rewrites the query to:

SELECT VALUE TUPLEUNION(
    CASE WHEN tbl IS STRUCT THEN tbl ELSE {'_1': tbl} END
)
FROM tbl GROUP BY a
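
One way to see why the rewritten query could fail is a toy Kotlin model of variable scoping. It assumes the PartiQL rule that GROUP BY without GROUP AS leaves only the grouping keys in scope; the thread does not confirm that this is the actual cause in this PR. The names Bindings, groupByKey, and resolve are hypothetical and are not part of the evaluator.

```kotlin
// Toy model: a binding tuple maps variable names to values. Hypothetical, not the PartiQL evaluator.
typealias Bindings = Map<String, Any?>

// GROUP BY <variable>.<key>: emit one binding tuple per distinct key value,
// exposing only the grouping key (assumed rule: no GROUP AS, so the FROM variable is dropped).
fun groupByKey(rows: List<Bindings>, variable: String, key: String): List<Bindings> =
    rows.map { (it.getValue(variable) as Map<*, *>)[key] }
        .distinct()
        .map { mapOf(key to it) }

// Variable lookup that fails loudly when the name is not in scope.
fun resolve(env: Bindings, name: String): Any? =
    if (name in env) env[name] else error("unresolved variable: $name")

fun main() {
    val input = listOf(
        mapOf("tbl" to mapOf("a" to 1, "b" to 2)),
        mapOf("tbl" to mapOf("a" to 2, "b" to 3))
    )
    val grouped = groupByKey(input, "tbl", "a")
    println(grouped) // [{a=1}, {a=2}]
    // The rewritten projection references tbl, which is no longer bound after grouping.
    println(runCatching { resolve(grouped.first(), "tbl") }.isFailure) // true
}
```

Under that assumption, the projection produced by the SELECT * rewrite refers to a variable that the grouping step has already removed from scope, which would surface as an error at evaluation time.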

@johnedquinn johnedquinn requested a review from yliuuuu February 26, 2024 21:47
Contributor

@yliuuuu yliuuuu left a comment

LGTM.

@johnedquinn johnedquinn merged commit a8cda41 into partiql-eval Feb 26, 2024
10 checks passed
@johnedquinn johnedquinn deleted the partiql-eval-agg branch February 26, 2024 22:29
3 participants