Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DataFrame parse_sql_expr does not handle aliases #12518

Open
milenkovicm opened this issue Sep 18, 2024 · 12 comments
Open

DataFrame parse_sql_expr does not handle aliases #12518

milenkovicm opened this issue Sep 18, 2024 · 12 comments
Labels
bug Something isn't working

Comments

@milenkovicm
Copy link
Contributor

Describe the bug

df.parse_sql_expr does not handle expression aliases, for expressions like

df.parse_sql_expr("col0 as new_col0")

it will return col(col0`) expression

To Reproduce

statement

df.parse_sql_expr("SUM(int_col) as sum_int_col")

from

"| double_col | sum(?table?.int_col) |",

can be used to demonstrate this issue

as return column has sum(?table?.int_col) as the name

Expected behavior

it is expected to have expression alias handled

Additional context

No response

@milenkovicm milenkovicm added the bug Something isn't working label Sep 18, 2024
@Eason0729
Copy link
Contributor

I spent some time on tracing code but found unable to fix it.
Maybe we need to change some code in sqlparser, correct me if I am wrong.

Detail To be specific, I run following with debugger attached
let df = ctx.sql("SELECT 1 as col0").await?;
let expr = df.parse_sql_expr("1 + 3 * col0 as new_col0")?;

And found this line, whose value return from sqlparser to be following.

It doesn't contain new_col0.

BinaryOp { left: Value(Number("1", false)), op: Plus, right: BinaryOp { left: Value(Number("3", false)), op: Multiply, right: Identifier(Ident { value: "col0", quote_style: None }) } }

@Eason0729
Copy link
Contributor

@milenkovicm
Copy link
Contributor Author

I think you're right @Eason0729 I did not have much time to investigate but initial investigation lead to sqlparser

@Eason0729
Copy link
Contributor

There is a private method parse_expr_with_alias in sqlparser, which returns ExprWithAlias.

I would propose changing the return type from sqlparser::ast::Expr on call path to sqlparser::ast::ExprWithAlias, use enum datafusion_expr::expr::Expr::Alias in return type.

Writing a example change might be better if more description in needed

This could cause other method to change(including some public methods on SessionState).

If above is okay, I would begin working on it.

@milenkovicm
Copy link
Contributor Author

parse_sql_expr returns expression, alias is an expression, why would you need to change public method in SessionState?

@Eason0729
Copy link
Contributor

Mainly because it could be done without changing sqlparser.

I think I should start writing code right away, as it would be better to communicate with code.

I would do it in my fork and let you know when I am ready, then I might learn how to do it without changing public method in SessionState.

@Eason0729
Copy link
Contributor

I just finish it in my fork.

It's not meant to be a pull request, just to showcase what I would like to change.

Here is the public API I change: diff

Also, I change sqlparser of my fork to make Parser::parse_expr_with_alias public: Eason0729/sqlparser-rs@c6360b1

@milenkovicm
Copy link
Contributor Author

It does look good, @Eason0729.
maybe @alamb and @comphead can give you better advice than me when they get some time

@Eason0729
Copy link
Contributor

Okay, then I will remove thing that doesn't belong to pull request and submit pull request.

@alamb
Copy link
Contributor

alamb commented Sep 26, 2024

PR in sqlparser: apache/datafusion-sqlparser-rs#1444

@Eason0729
Copy link
Contributor

Let's wait for sqlparser-rs/sqlparser-rs#1444, then I will submit a PR for datafusion.

@alamb
Copy link
Contributor

alamb commented Sep 28, 2024

Thank you @Eason0729 -- we'll try and release a sqlparser release shortly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants