-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use struct
instead of named_struct
when there are no aliases
#9897
Conversation
@@ -2288,39 +2288,39 @@ select struct(time,load1,load2,host) from t1; | |||
|
|||
# can have an aggregate function with an inner coalesce | |||
query TR | |||
select t2.info['c3'] as host, sum(coalesce(t2.info)['c1']) from (select struct(time,load1,load2,host) as info from t1) t2 where t2.info['c3'] IS NOT NULL group by t2.info['c3'] order by host; | |||
select t2."struct(t1.time,t1.load1,t1.load2,t1.host)"['c3'] as host, sum(coalesce(t2."struct(t1.time,t1.load1,t1.load2,t1.host)")['c1']) from (select struct(time,load1,load2,host) from t1) t2 where t2."struct(t1.time,t1.load1,t1.load2,t1.host)"['c3'] IS NOT NULL group by t2."struct(t1.time,t1.load1,t1.load2,t1.host)"['c3'] order by host; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reverts the change from #9894
.iter() | ||
.any(|value| matches!(value, SQLExpr::Named { .. })) | ||
{ | ||
self.create_named_struct(values, input_schema, planner_context) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the values in create_named_struct
will be SQLExpr::Named
in this PR, so maybe we can also make a change here?
https://github.com/apache/arrow-datafusion/blob/d8d521ac8b90002fa0ba1f91456051a9775ae193/datafusion/sql/src/expr/mod.rs#L610-L631
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, ignore this. One case is only some values are Named so no change is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @alamb 1 improvement I can see here is we over iterating through values.
I mean 1 iteration is do identify named/not_named and when creating struct/named_struct its another one. prob it can be optimized
I agree it does two iterations over the value. I think the current approach basically requires two iterations because if we are doing So in other words, I would like to spend my optimization budget on bigger things initially if that is ok (specifically #9637) |
Thanks again for the reviews 🙏 |
…che#9897) * Revert "use alias (apache#9894)" This reverts commit 9487ca0. * Use `struct` instead of `named_struct` when there are no aliases * Update docs * fmt
Which issue does this PR close?
Closes #9839
Rationale for this change
The names of the columns created by callling
struct
changed in #9743Before #9743
After #9743
(note how the column name is different)
This change caused #9891 as well as removed test coverage for
struct
What changes are included in this PR?
struct
when there are no named fieldsAre these changes tested?
Yes, by existing tests
Are there any user-facing changes?
The column names are now the same as they were prior to #9894