Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve coerce API so it does not need DFSchema #10331

Merged
merged 1 commit into from
May 2, 2024

Conversation

alamb
Copy link
Contributor

@alamb alamb commented May 1, 2024

Which issue does this PR close?

Closes #3793

Rationale for this change

  1. Permit making an API for Error "entered unreachable code: NamedStructField should be rewritten in OperatorToFunction" after upgrade to 37 #10181 that takes a &DFSchema rather than an owned
  2. Avoid a copy and thus make the API marginally faster in some cases.

What changes are included in this PR?

Change coerce API from

    pub fn coerce(&self, expr: Expr, schema: DFSchemaRef) -> Result<Expr> {

to

    pub fn coerce(&self, expr: Expr, schema: &DFSchema) -> Result<Expr> {

Are these changes tested?

By existing CI

Are there any user-facing changes?

The API to call coerce is slightly changed. Note I am in the process of adding a proper fully baked API in #10181

@alamb alamb added the api change Changes the API exposed to users of the crate label May 1, 2024
@alamb alamb marked this pull request as ready for review May 1, 2024 17:23
@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate labels May 1, 2024
@@ -258,7 +258,7 @@ pub fn physical_expr(schema: &Schema, expr: Expr) -> Result<Arc<dyn PhysicalExpr
ExprSimplifier::new(SimplifyContext::new(&props).with_schema(df_schema.clone()));

// apply type coercion here to ensure types match
let expr = simplifier.coerce(expr, df_schema.clone())?;
let expr = simplifier.coerce(expr, &df_schema)?;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shows the effect of this API change on users (add a &)

// it manually.
// https://github.com/apache/datafusion/issues/3793
pub fn coerce(&self, expr: Expr, schema: DFSchemaRef) -> Result<Expr> {
pub fn coerce(&self, expr: Expr, schema: &DFSchema) -> Result<Expr> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual change (I don't think making it take SimplifyInfo is practical given how much coerce does)

Copy link
Contributor

@jayzhan211 jayzhan211 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@alamb
Copy link
Contributor Author

alamb commented May 2, 2024

Thank you for the review @jayzhan211

@alamb alamb merged commit fa31c78 into apache:main May 2, 2024
24 checks passed
@alamb alamb deleted the alamb/cleanup_simplify_api branch May 2, 2024 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate core Core DataFusion crate optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve coerce API so it doesn't need DF Schema
2 participants