Incorrect sorted flag propagation when casting from numeric to string type #19424

Open
2 tasks done
wence- opened this issue Oct 24, 2024 · 0 comments
Labels: accepted, bug, P-high, python

Comments

Collaborator

wence- commented Oct 24, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

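import polars as pl

df = pl.DataFrame({"a": [10, 1, 2]})

df.select(
    # Sort numerically, cast to string, then sort again.
    pl.col("a").sort().cast(pl.String).sort().alias("sort_cast_sort"),
    # Cast to string first, then sort once (lexicographic order).
    pl.col("a").cast(pl.String).sort().alias("cast_sort"),
)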

Produces:

shape: (3, 2)
┌────────────────┬───────────┐
│ sort_cast_sort ┆ cast_sort │
│ ---            ┆ ---       │
│ str            ┆ str       │
╞════════════════╪═══════════╡
│ 1              ┆ 1         │
│ 2              ┆ 10        │
│ 10             ┆ 2         │
└────────────────┴───────────┘

And indeed:

df.select(pl.col("a").sort().cast(pl.String)).flags
# {'a': {'SORTED_ASC': True, 'SORTED_DESC': False}}

Log output

No response

Issue description

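Casting from a numeric type to a string type shouldn't propagate the sortedness flag, since strings sort lexicographically, not numerically: "10" sorts before "2". With the flag carried over, the second sort in sort-cast-sort leaves the column in numeric order.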
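
To make the ordering mismatch concrete without involving Polars at all, here is a minimal plain-Python sketch using the same values as the reproducer. The variable names are only illustrative.

```python
values = [10, 1, 2]

# Numeric order of the original integers.
numeric_sorted = sorted(values)

# Cast each value to a string while keeping the numeric order.
as_strings = [str(v) for v in numeric_sorted]

# Lexicographic order of those same strings.
lexicographic = sorted(as_strings)

# The cast column is not sorted under string ordering,
# so an "ascending" flag inherited from the integers would be stale.
still_sorted = as_strings == lexicographic

print(as_strings)     # ['1', '2', '10']
print(lexicographic)  # ['1', '10', '2']
print(still_sorted)   # False
```

The numeric order is a valid string order only by coincidence (for example, when all values have the same number of digits), which is why the flag cannot be carried across the cast in general.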

Expected behavior

sort-cast-sort should produce the same result as cast-sort.
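
The expected behavior can be illustrated with a toy model. This is a sketch only: `FlaggedColumn`, `cast_to_str`, and `keep_flag` are hypothetical names, not Polars internals, and it assumes that sorting takes a fast path when the ascending flag is set, which is what the output in the reproducer suggests.

```python
from dataclasses import dataclass


@dataclass
class FlaggedColumn:
    """Hypothetical stand-in for a column that carries a sortedness flag."""

    values: list
    sorted_asc: bool = False

    def sort(self) -> "FlaggedColumn":
        # Assumed fast path: trust the flag and skip the sort entirely.
        if self.sorted_asc:
            return self
        return FlaggedColumn(sorted(self.values), sorted_asc=True)

    def cast_to_str(self, keep_flag: bool) -> "FlaggedColumn":
        # keep_flag=True mimics the reported behavior (flag propagated);
        # keep_flag=False drops the flag, as this issue requests.
        return FlaggedColumn(
            [str(v) for v in self.values],
            sorted_asc=keep_flag and self.sorted_asc,
        )


col = FlaggedColumn([10, 1, 2])

# Flag propagated across the cast: the second sort is skipped.
buggy = col.sort().cast_to_str(keep_flag=True).sort().values

# Flag dropped on the cast: the second sort actually runs.
fixed = col.sort().cast_to_str(keep_flag=False).sort().values

# Reference result: cast first, then sort once.
cast_sort = col.cast_to_str(keep_flag=False).sort().values

print(buggy)      # ['1', '2', '10']
print(fixed)      # ['1', '10', '2']
print(cast_sort)  # ['1', '10', '2']
```

In this model, dropping the flag on a numeric-to-string cast is enough to make sort-cast-sort agree with cast-sort; the cost is one extra sort that the stale flag would otherwise have skipped.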

Installed versions

--------Version info---------
Polars:              1.11.0
Index type:          UInt32
Platform:            Linux-6.8.0-47-generic-x86_64-with-glibc2.35
Python:              3.12.7 | packaged by conda-forge | (main, Oct  4 2024, 16:05:46) [GCC 13.3.0]
LTS CPU:             False

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          3.1.0
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               2024.10.0
gevent               <not installed>
great_tables         <not installed>
matplotlib           <not installed>
nest_asyncio         1.6.0
numpy                2.0.2
openpyxl             3.1.5
pandas               2.2.3
pyarrow              17.0.0
pydantic             2.9.2
pyiceberg            <not installed>
sqlalchemy           2.0.36
torch                2.4.1.post302
xlsx2csv             <not installed>
xlsxwriter           <not installed>
wence- added the bug, python, and needs triage labels on Oct 24, 2024
coastalwhite added the accepted and P-medium labels and removed the needs triage label on Oct 24, 2024
orlp added the P-high label and removed the P-medium label on Oct 25, 2024