Remove struct UDF, and use named_struct everywhere #9839

Closed
alamb opened this issue Mar 28, 2024 · 2 comments · Fixed by #9897
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@alamb
Contributor

alamb commented Mar 28, 2024

Is your feature request related to a problem or challenge?

This is a follow on to #9743 where @gstvg added a great named_struct function to construct StructArrays ❤️

As part of that PR, @yyy1000 noted that the existing code in the struct udf is now never called: #9743 (comment)

Describe the solution you'd like

  1. Make the invoke() function return a "not yet implemented" error: https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/functions/src/core/struct.rs#L90-L92

  2. Implement the simplify API to rewrite calls to struct() to a call to named_struct

https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/expr/src/udf.rs#L372-L378

  3. Update the SQL planner to call struct rather than building up the c0, c1, etc. field names and calling named_struct
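
As a rough illustration of the rewrite described in step 2, here is a minimal sketch. It assumes a toy `Expr` enum and a hypothetical `simplify_struct` helper; neither is DataFusion's real type or its simplify API. It only shows the shape of the transformation: `struct(a, b)` becomes `named_struct('c0', a, 'c1', b)`, using the c0, c1, ... field names mentioned above.

```rust
/// Toy expression tree for illustration only; this is NOT DataFusion's `Expr`.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A reference to a column by name.
    Column(String),
    /// A string literal (used here for struct field names).
    Literal(String),
    /// A call to a named scalar function.
    Call { name: String, args: Vec<Expr> },
}

/// Hypothetical helper: rewrites every `struct(a, b, ...)` call into
/// `named_struct('c0', a, 'c1', b, ...)`, recursing into nested calls.
fn simplify_struct(expr: Expr) -> Expr {
    match expr {
        Expr::Call { name, args } if name == "struct" => {
            // Each positional argument gets a generated field name: c0, c1, ...
            let mut new_args = Vec::with_capacity(args.len() * 2);
            for (i, arg) in args.into_iter().enumerate() {
                new_args.push(Expr::Literal(format!("c{}", i)));
                new_args.push(simplify_struct(arg));
            }
            Expr::Call {
                name: "named_struct".to_string(),
                args: new_args,
            }
        }
        // Other calls keep their name; only their arguments are rewritten.
        Expr::Call { name, args } => Expr::Call {
            name,
            args: args.into_iter().map(simplify_struct).collect(),
        },
        // Columns and literals are returned unchanged.
        other => other,
    }
}
```

With a rewrite like this in place, the planner can emit plain struct calls and leave the field naming to a single code path.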

Describe alternatives you've considered

We could also just remove the struct UDF entirely, though in that case it is important to keep the struct expr_fn function for backwards compatibility:

https://github.com/apache/arrow-datafusion/blob/ce3d446be5f6a11664e100fc47940e6ecb5418d3/datafusion/functions/src/core/mod.rs#L44

I think it could be implemented as its own function like

Additional context

No response

@alamb alamb added the enhancement New feature or request label Mar 28, 2024
@alamb alamb changed the title from "Remove struct UDF" to "Remove struct UDF, and use named_struct everywhere" Mar 28, 2024
@alamb alamb added the good first issue Good for newcomers label Mar 30, 2024
@alamb
Contributor Author

alamb commented Mar 30, 2024

I think this is a good first issue as it is well specified, and there are patterns to follow

@alamb alamb self-assigned this Apr 1, 2024
@alamb
Contributor Author

alamb commented Apr 1, 2024

This ended up causing issues as the column name is different (e.g. see #9891). I am going to make a quick fix
