Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Change mean/median of Date columns to return a Datetime instead of Date #13492

Closed
wants to merge 12 commits into from

Conversation

mcrumiller
Copy link
Contributor

@mcrumiller mcrumiller commented Jan 6, 2024

Resolves #13598.

Ref #10232

Part 2 of a series of datetime fixes: implement mean/median for date and return a datetime.

This treats dates and datetimes as analagous to integers/floats: the mean of ints is a float, and now the mean of dates is a datetime:

mean(1, 2) = 1.5
mean(2023-01-01, 2023-01-02) = 2023-01-01 12:00.

Changes:

  • return datetime object from mean/median
  • mean/median in expr/lit context has pl.Datetime dtype
  • .dt.mean() is no longer required for datetime namespace (i.e. has same functionality as s.mean/median).
  • group_by(...).mean() now returns pl.Datetime.

Upcoming related PRs:

Part 3 - mean/median for pl.Duration
Part 4 - mean/median for pl.Time
Part 5 - deprecate mean/median for datetime namespace.

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars labels Jan 6, 2024
@mcrumiller
Copy link
Contributor Author

@MarcoGorelli mind adding this one to your to-do so I can continue on with the others? This is probably the worst one since I had to update the streaming code a bit to make the date->datetime conversion work in streaming for mean/median.

@stinodego
Copy link
Member

I definitely support changing mean and median to return a datetime when appropriate. Is there an issue for this?

@mcrumiller
Copy link
Contributor Author

@stinodego it was originally issue #10232 with PR #12902 but it was requested to split into multiple PRs. I can make a separate issue for each PR, including this one.

@stinodego
Copy link
Member

Thanks - no need to make separate issues I think.

@mcrumiller
Copy link
Contributor Author

@stinodego oops just made one. It's a bit clearer on the issue at least.

@mcrumiller
Copy link
Contributor Author

@MarcoGorelli and @reswqa I think I've addressed your issues. Thanks for the review.

@mcrumiller
Copy link
Contributor Author

@MarcoGorelli and @reswqa I don't want this to get too stale, as I have a few other PRs in the pipeline that are dependent on this one. Do you have any other suggested changes? Thank you!

@reswqa
Copy link
Collaborator

reswqa commented Jan 17, 2024

Do you have any other suggested changes?

No, I don't. It looks good from my side. But as I am not familiar with temporal thing, I leave the final check before merge to others.

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good to me

The parts I don't want to formally approve myself are:

  • crates/polars-pipe/src/executors/sinks/group_by/aggregates/mean.rs
  • crates/polars-plan/src/logical_plan/aexpr/schema.rs

but it looks like Weijie is happy with them

@mcrumiller
Copy link
Contributor Author

Hi @stinodego or @ritchie46 can we merge this so I can proceed with the next temporal updates?

@mcrumiller
Copy link
Contributor Author

@ritchie46 I hate to harp on this, but it's been sitting in the queue for a while and it's fairly straightforward. Any issues with this PR? I'd like to continue on the other temporal types.

@mcrumiller
Copy link
Contributor Author

@stinodego I got a streaming error on here, is this related to the markers again?

@mcrumiller mcrumiller marked this pull request as draft February 13, 2024 19:19
@stinodego stinodego changed the title feat(rust, python): return datetime for date mean & median feat(rust, python): Change mean/median of Date columns to return a Datetime instead of Date Feb 15, 2024
@stinodego stinodego changed the title feat(rust, python): Change mean/median of Date columns to return a Datetime instead of Date feat: Change mean/median of Date columns to return a Datetime instead of Date Feb 15, 2024
@mcrumiller mcrumiller closed this Apr 16, 2024
@mcrumiller mcrumiller deleted the date-mean branch June 6, 2024 20:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

mean and median for pl.Date should return a datetime
4 participants