[MINOR]: Rename get_arrayref_at_indices to take_arrays #12654
-    arrays: &[ArrayRef],
-    indices: &PrimitiveArray<UInt32Type>,
-) -> Result<Vec<ArrayRef>> {
+pub fn take_arrays(arrays: &[ArrayRef], indices: &dyn Array) -> Result<Vec<ArrayRef>> {
Just wondering how this differs from arrow::compute::take. It looks like the same operation, but for a 2-dim collection of arrays instead of a single array.
Should we move this method to the arrow-rs kernels?
Exactly, this util just abstracts away the same outer iteration around compute::take. If the community thinks this is beneficial, we can move it to arrow-rs. I think this pattern is common enough to justify that.
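The "same outer iteration" mentioned above can be illustrated without Arrow. The block below is only a plain-Rust model of the pattern, not the DataFusion or arrow-rs code: `Column`, `take_column`, and `take_columns` are hypothetical names, and a column is assumed to be a `Vec<Option<i64>>` rather than an `ArrayRef`.

```rust
/// Hypothetical stand-in for an Arrow array: a nullable i64 column.
type Column = Vec<Option<i64>>;

/// Model of a single-column `take`: gather the values found at `indices`.
/// Returns an error if any index is out of bounds.
fn take_column(column: &Column, indices: &[u32]) -> Result<Column, String> {
    indices
        .iter()
        .map(|&i| {
            column.get(i as usize).copied().ok_or_else(|| {
                format!("index {} out of bounds for length {}", i, column.len())
            })
        })
        .collect()
}

/// Model of the multi-column helper: the same gather applied to every column
/// with one shared set of indices, stopping at the first error.
fn take_columns(columns: &[Column], indices: &[u32]) -> Result<Vec<Column>, String> {
    columns.iter().map(|c| take_column(c, indices)).collect()
}
```

The helper adds nothing beyond the map over columns and the collection into a single `Result`, which is why it reads as a thin layer on top of the single-array operation.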
I think it makes sense to move it to arrow, and maybe add a note in the docs for RecordBatch pointing people at it.
It is common enough that it would be nice not to have to write it in each downstream crate.
Thanks @akurmustafa
I feel we can move it to arrow-rs; to me it is a natural API evolution on top of take, which is used by downstream projects.
Another thing that came to mind: we probably want to optimize take, as it is likely a bottleneck for SMJ, as we found in apache/datafusion-comet#901 (comment)
Figuring out how to make |
Thanks @akurmustafa and @comphead
By the way, I opened a PR to move this util to |
Which issue does this PR close?
Closes #.
Rationale for this change
What changes are included in this PR?
As in the suggestion, this PR renames the util function get_arrayref_at_indices to take_arrays. Also, the type of the indices parameter is changed from &PrimitiveArray<UInt32Type> to &dyn Array to make it similar to compute::take's API.
Are these changes tested?
Are there any user-facing changes?