Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: single-element .dt.time() and .dt.date() should always preserve sortedness #13808

Merged
merged 4 commits into from
Jan 23, 2024

Conversation

MarcoGorelli
Copy link
Collaborator

@MarcoGorelli MarcoGorelli commented Jan 18, 2024

closes #13806

@MarcoGorelli MarcoGorelli changed the title this fixes the issue at lesat fix: single-element .dt.time() and .dt.date() should always preserve sortedness Jan 18, 2024
@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars labels Jan 18, 2024
Comment on lines 84 to 85
TimeUnit::Nanoseconds => Ok((self.0.as_ref() % NS_IN_DAY)
.cast(&Int64)
.unwrap()
TimeUnit::Nanoseconds => Ok(self
.apply_values(|v| v % NS_IN_DAY)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shortens the code - but also ends up hitting

if self.length <= 1 {
self.set_sorted_flag(IsSorted::Ascending)
}

@orlp
Copy link
Collaborator

orlp commented Jan 18, 2024

Honestly? I'd say add an if statement to the set_sorted_flag function itself.

pub fn set_sorted_flag(&mut self, sorted: IsSorted) {
    if self.len() <= 1 {
        self.bit_settings.insert(Settings::SORTED_ASC);
        self.bit_settings.insert(Settings::SORTED_DSC);
        return;
    }
    
    // ...
}

Then we can benefit from this in many more places.

@MarcoGorelli
Copy link
Collaborator Author

thanks - even simpler, could be to just do the early return in is_sorted_flag?

assert pl.DataFrame({"x": [None]})["x"].flags["SORTED_ASC"] is False
assert pl.DataFrame({"x": [None] * 2})["x"].flags["SORTED_ASC"] is False
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a single element would now always be sorted, so I'm adding an extra one to try to preserve the spirit of the test

@@ -189,6 +189,9 @@ impl Series {
}

pub fn is_sorted_flag(&self) -> IsSorted {
if self.len() <= 1 {
return IsSorted::Ascending;
Copy link
Member

@stinodego stinodego Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it is sorted both Ascending and Descending at the same time 🤔 it would be some refactoring to get that done though...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stinodego Yes I've brought up this limitation of the current sorted flag to Ritchie before. You can also run into this for larger arrays, [0, 0, 0, 0, 0] is sorted both ascending and descending.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we replace this from compute_len to here?

Copy link
Collaborator Author

@MarcoGorelli MarcoGorelli Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expressions don't necessarily go down that path, e.g. the original implementation of casting Datetime to Time didn't (see #13812), so this just seemed simpler

@ritchie46 ritchie46 merged commit 9cf6af0 into pola-rs:main Jan 23, 2024
22 checks passed
r-brink pushed a commit to r-brink/polars that referenced this pull request Jan 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Bug fix python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

single-element .dt.time() and .dt.date() should always preserve sortedness
4 participants