-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: Optimize IsNotNullExpr #11586
perf: Optimize IsNotNullExpr #11586
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @andygrove -- I think it would be good to add basic test coverage for Union
but otherwise this looks great to me
}; | ||
compute::not(&is_null).map_err(Into::into) | ||
} else { | ||
compute::is_not_null(array.as_ref()).map_err(Into::into) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This goes faster because it calls a single kernel (compute::is_not_null
) rather than 2 (is_null
and not
)?
Could we add some basic tests for union? Perhaps following the model in #11321 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This goes faster because it calls a single kernel (
compute::is_not_null
) rather than 2 (is_null
andnot
)?
Yes, exactly. It avoids creating an interim vector that is then discarded.
Could we add some basic tests for union? Perhaps following the model in #11321 ?
We do already have at least one test for IS NOT NULL
for union, that was added in #11321.
There is no functional change for union in this PR. The code in compute_is_not_null
for union is copied from the compute_is_null
method, and adds a call to not
, so it is doing the same thing as before but the flow changed a little.
Union is the only case that this PR does not optimize for, because I didn't want to mess with the temporary workaround that is in place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb I pushed a change to simplify the PR and remove the duplicated union code. Let me know if that makes things clearer (or not).
@@ -117,6 +117,21 @@ pub(crate) fn compute_is_null(array: ArrayRef) -> Result<BooleanArray> { | |||
} | |||
} | |||
|
|||
/// workaround <https://github.com/apache/arrow-rs/issues/6017>, | |||
/// this can be replaced with a direct call to `arrow::compute::is_not_null` once it's fixed. | |||
pub(crate) fn compute_is_not_null(array: ArrayRef) -> Result<BooleanArray> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code looks good to me -- I think it also ends up supporting UnionArray
in IS NOT NULL
cc @samuelcolvin who added this code in #11321
/// this can be replaced with a direct call to `arrow::compute::is_not_null` once it's fixed. | ||
pub(crate) fn compute_is_not_null(array: ArrayRef) -> Result<BooleanArray> { | ||
if array.as_any().is::<UnionArray>() { | ||
compute::not(&compute_is_null(array)?).map_err(Into::into) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not optimize the union handling - this still calls is_null
and not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is lgtm, thanks @andygrove
For UNION all we can do a followup PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great to me -- thanks everyone!
Which issue does this PR close?
N/A
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?