Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: unify aggregate and window functions in window handling #170

Merged
merged 8 commits into from
Aug 30, 2023

Conversation

vbarua
Copy link
Member

@vbarua vbarua commented Aug 28, 2023

feat: allow window function operands beyond just column references
fix: invocation was not set when building WindowFunction proto

BREAKING CHANGES:

  • Window and WindowFunction have been merged into WindowFunctionInvocation
  • WindowFunctionInvocation is now an inner class of expression
  • WindowFunctionInvocation now implements Expression.
  • Expression visitor now visit WindowFunctionInvocation
  • WindowFunctionConvert constructor requires fewer arguments
  • windowCreator has additional fields
  • POJO WindowBound offset must now be a long, better reflecting spec

public abstract static class WindowFunction {
public abstract WindowFunctionInvocation getFunction();

public abstract Optional<Expression> getPreMeasureFilter();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preMeasureFilter field was not used anywhere, and the WindowFunction message cannot actually encode a pre-measure filter (for now).

I opted to remove it entirely. Support can be added in once the spec supports it.

@@ -45,7 +45,7 @@ public BoundedKind boundedKind() {

public abstract Direction direction();

public abstract Expression offset();
public abstract long offset();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the spec, offset must be int64.

The POJO layer should reflect this restriction.

import org.immutables.value.Value;

@Value.Immutable
public abstract class WindowFunctionInvocation {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WindowFunctionInvocation is a / implements Expression, much like ScalarFunctionInvocation.

Moved this to an inner class of Expression for consistency.


List<WindowFunctionVariant> allWindowFunctionVariants =
Stream.concat(windowFunctionVariants, windowAggFunctionVariants)
.collect(Collectors.toList());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec is being loosed to allow usage of aggregate functions as window function.
substrait-io/substrait#540

substrait-java already supported this behaviour, but the implementation is simpler/cleaner if aggregate functions are added as valid window functions here.

@vbarua vbarua force-pushed the vbarua/agg-fns-in-windows branch 2 times, most recently from 03a9e26 to 3c733be Compare August 28, 2023 20:48
public abstract Aggregate.Measure aggregateFunction();

@Nullable
public abstract WindowFunction windowFunction();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having separate handling for window calls with window functions, and window calls with aggregate functions, this PR updates the extension handling to treat all aggregate functions as valid window functions and a such removes the need for this distinction.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rex.*;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding * imports

public Expression.Window convert(
RelNode input,
Type.Struct inputType,
public Optional<Expression.WindowFunctionInvocation> convert(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public Optional<Expression.WindowFunctionInvocation> convert

This output types brings convert inline with both ScalarFunctionConverter and AggregateFunctionConverter.

* </ul>
*/
@Beta
protected void assertFullRoundTrip(String sqlQuery, List<String> createStatements)
Copy link
Member Author

@vbarua vbarua Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertProtoPlanRoundrip first converts from

SQL -> Calcite -> POJO -> Proto

and then starting from the proto does

Proto -> Pojo -> Proto

but this means that if the initial Proto -> Pojo translation is missing fields, the test doesn't detect them.

This assert helped identify some issues when processing window functions. My intent is:

  1. make this assert strong by checking more than just Pojo -> Proto -> Pojo
  2. replace all calls of assertProtoPlanRoundrip with

Marked as BETA for now as 1 is not done yet. I thought about doing it in this PR, which would requires adding support for mapping window functions from substrait into Calcite, but this is already big enough as is.

null,
RelCollations.EMPTY,
over.getType(),
sqlAggFunction.getName());
Copy link
Member Author

@vbarua vbarua Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usage of this API meant that calls like ntile(4) were not supported, as 4 would be encoded as a RexLiteral which is not a RexSlot. Switching away from this handling enables those queries, and in fact I was able to enable the test for the ntile query in WindowFunctionTest.

Copy link
Member

@EpsilonPrime EpsilonPrime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good set of changes. I especially like how bounds and operands are much cleaner (as opposed to invoking a potentially unrelated builder in place).

@vbarua vbarua marked this pull request as ready for review August 28, 2023 21:19
@vbarua vbarua merged commit afa1213 into main Aug 30, 2023
8 checks passed
@vbarua vbarua deleted the vbarua/agg-fns-in-windows branch August 30, 2023 18:42
vbarua added a commit that referenced this pull request Sep 6, 2023
feat: allow window function operands beyond just column references
fix: invocation was not set when building WindowFunction proto message

BREAKING CHANGE:
* Window and WindowFunction have been merged into WindowFunctionInvocation
* WindowFunctionInvocation is now an inner class of expression
* WindowFunctionInvocation now implements Expression.
* Expression visitor now visits WindowFunctionInvocation
* WindowFunctionConvert constructor requires fewer arguments
* windowCreator has additional fields
* POJO WindowBound offset must now be a long, better reflecting spec
ajegou pushed a commit to ajegou/substrait-java that referenced this pull request Mar 29, 2024
…bstrait-io#170)

feat: allow window function operands beyond just column references
fix: invocation was not set when building WindowFunction proto message

BREAKING CHANGES:
* Window and WindowFunction have been merged into WindowFunctionInvocation
* WindowFunctionInvocation is now an inner class of expression
* WindowFunctionInvocation now implements Expression.
* Expression visitor now visits WindowFunctionInvocation
* WindowFunctionConvert constructor requires fewer arguments
* windowCreator has additional fields
* POJO WindowBound offset must now be a long, better reflecting spec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants