-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support unparsing implicit lateral UNNEST
plan to SQL text
#13824
Conversation
@phillipleblanc or @sgrebnov I wonder if you have interest/time to help review this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, I have a couple of questions
datafusion/sql/src/unparser/plan.rs
Outdated
/// Try to match the pattern `Expr::Alias(Expr::Column("__unnest_placeholder(outer_ref(...)))")` | ||
/// | ||
/// `outer_ref` is the display result of [Expr::OuterReferenceColumn] | ||
fn is_unnest_placeholder_with_outer_ref(expr: &Expr) -> (bool, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we define an enum that we return here? That is clearer to read than two booleans and we can limit it to reference only the possible states (i.e. it doesn't look like (false, true)
can happen, but its something a caller would need to handle from the type system)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enum
sounds like a good idea. I will try it. Thanks 👍
datafusion/sql/src/unparser/plan.rs
Outdated
if let Expr::Alias(Alias { expr, .. }) = expr { | ||
if let Expr::Column(Column { name, .. }) = expr.as_ref() { | ||
return name.starts_with(UNNEST_PLACEHOLDER); | ||
if let Some(prefix) = name.strip_prefix(UNNEST_PLACEHOLDER) { | ||
return (true, prefix.starts_with("(outer_ref(")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a huge fan of this string matching. At least the UNNEST_PLACEHOLDER is a shared const, but the outer_ref
could potentially change and we would miss it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel very strongly about this since it does seem unlikely to change though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a huge fan of this string matching.
I agree that string matching isn't a good way but I think it's the only way to recognize if the outer reference column is exits in the unnest plan currently. 🤔
At least the UNNEST_PLACEHOLDER is a shared const, but the
outer_ref
could potentially change and we would miss it here.
Maybe we can use something like UNNEST_PLACEHOLDER
to create a global constant for outer_ref
. At least then, we won't miss it when someone needs to change it.
datafusion/datafusion/expr/src/expr.rs
Line 2546 in ede665b
Expr::OuterReferenceColumn(_, c) => write!(f, "outer_ref({c})"), |
return (plan, None); | ||
} | ||
if let Some(Expr::Alias(alias)) = projection.expr.first() { | ||
if alias.expr.schema_name().to_string().starts_with("UNNEST(") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not an expression of type Unnest
? I don't think we need to do this string allocation here just to check that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's not an Expr::Unnest
. The full plan is
/// SubqueryAlias: u
/// Subquery:
/// Projection: UNNEST(outer_ref(t1.c1)) AS c1
/// Projection: __unnest_placeholder(outer_ref(t1.c1),depth=1) AS UNNEST(outer_ref(t1.c1))
/// Unnest: lists[__unnest_placeholder(outer_ref(t1.c1))|depth=1] structs[]
/// Projection: outer_ref(t1.c1) AS __unnest_placeholder(outer_ref(t1.c1))
/// EmptyRelation
It's a column pointing to an alias of its child. The alias is built from an expression schema_name
.
datafusion/datafusion/sql/src/utils.rs
Line 400 in ede665b
let post_unnest_expr = col(post_unnest_name.clone()).alias(alias_name); |
I think there is a similar issue for #13824 (comment).
Maybe we can have a constant for the display result of UNNEST
.
Looks good 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @goldmedal! This looks good to me, I like the enum return - makes it much clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @goldmedal and @sgrebnov and @phillipleblanc for the reviews
I plan to merge tomorrow unless someone beats me to it |
Thanks again @goldmedal @phillipleblanc and @sgrebnov |
Thanks @alamb @phillipleblanc @sgrebnov 👍 |
Which issue does this PR close?
Closes #13793
Rationale for this change
Support to unparse the implicit lateral unnest plan. Given a SQL
It can be planned to the logical plan and unparsed back to the SQL
If
with_unnest_as_table_factor
is enabled, it will be unparsed toWhat changes are included in this PR?
Are these changes tested?
yes
Are there any user-facing changes?
new unparsing behavior