fix(python): Allow pl.col(pl.Enum) for selecting all Enum columns #13891
5759d39 to 782dbcc
py-polars/src/conversion/mod.rs (Outdated)
        ))
    },
    "Enum" => DataType::Categorical(
        Some(Arc::new(RevMapping::build_enum(Utf8ViewArray::new_empty(
This should be None, not empty.
True, I would like to rework the type a bit to have a 'Non-Initialized Enum' state, but preferably outside this PR, so this is OK for now.
Enum does not accept None yet; I think this would be better in a separate PR.
The problem with the equality check is here. We need to distinguish Enum from Categorical. Right now, if you do
@c-peters @ritchie46
#[cfg(feature = "dtype-categorical")]
(Categorical(rev_l, _), Categorical(rev_r, _)) => {
    // A missing rev_map is treated as "not an Enum".
    let is_l_enum = rev_l.as_ref().map_or(false, |x| x.is_enum());
    let is_r_enum = rev_r.as_ref().map_or(false, |x| x.is_enum());
    is_l_enum == is_r_enum
},
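For readers who do not write Rust, the rule in the match arm above can be sketched in Python. This is a toy model only: `RevMap`, `Categorical`, and `same_kind` are hypothetical names invented for illustration, not Polars types or APIs. It assumes, as the `map_or(false, ...)` calls suggest, that a missing reverse mapping counts as "not an Enum".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RevMap:
    """Hypothetical stand-in for a reverse mapping; only the enum flag matters here."""

    is_enum: bool


@dataclass(frozen=True)
class Categorical:
    """Toy dtype: a categorical whose reverse mapping may be absent."""

    rev_map: Optional[RevMap] = None


def same_kind(left: Categorical, right: Categorical) -> bool:
    """Return True when both sides are Enums or both sides are plain Categoricals."""
    # A missing rev_map is treated as "not an Enum", like map_or(false, ...) above.
    is_l_enum = left.rev_map is not None and left.rev_map.is_enum
    is_r_enum = right.rev_map is not None and right.rev_map.is_enum
    return is_l_enum == is_r_enum


enum_dtype = Categorical(RevMap(is_enum=True))
plain_dtype = Categorical(RevMap(is_enum=False))
unset_dtype = Categorical()  # no reverse mapping at all
```

Under this rule an Enum never compares equal to a plain Categorical, while a Categorical with no reverse mapping falls on the plain Categorical side. The categories themselves are not compared, which is part of why the check is described as a stopgap.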
Yes, this is not ideal. I'm working on making Enums an actual datatype so as to avoid this cumbersome rev_map check.
@collinprince,
… works properly with empty enum
d65ddaf to 373b7ee
Should be good now @c-peters
@@ -195,14 +195,6 @@ def test_extend_to_an_enum() -> None:
    assert s.null_count() == 1


def test_series_init_uninstantiated_enum() -> None:
This test should still be valid, right? We do not allow creating a series with an empty Enum type; the None is just a placeholder for all Enums.
@@ -402,3 +394,27 @@ def test_enum_cast_from_other_integer_dtype_oob() -> None:
        pl.ComputeError, match="conversion from `u64` to `u32` failed in column"
    ):
        series.cast(enum_dtype)


def test_enum_creating_col_expr() -> None:
If I am not mistaken, this already runs on main without any of the other changes.
Because we should be able to convert the Python class object to the Rust datatype without the need for None in the constructor.
This is superseded by #14628. We do not allow empty
Update extract to create an empty pl.Enum so that column expressions can be extracted for the pl.Enum datatype, e.g. pl.col(pl.Enum).

Also update the Enum constructor to allow/default to None for the categories param. This mirrors the logic that is used in extract for pl.Enum and operates as a convenient shorthand for the currently supported logic of passing in an empty series.

Fixes #13269