LazyFrame should not have active bindings #1142

Open
eitsupi opened this issue Jun 11, 2024 · 4 comments
Labels
bug Something isn't working

@eitsupi
Collaborator

eitsupi commented Jun 11, 2024

See pola-rs/polars#16328

@eitsupi eitsupi added the bug Something isn't working label Jun 11, 2024
@eitsupi
Collaborator Author

eitsupi commented Jun 11, 2024

@etiennebacher Could the inability to retrieve the column list cheaply cause a serious performance penalty for tidypolars?

My understanding is that they changed this so that a LazyFrame can be created without a data source on the local computer, for their service (Polars Cloud).
As a result, the LazyFrame does not know which columns exist until collect() is run.

@eitsupi
Collaborator Author

eitsupi commented Jun 11, 2024

I think dplyr assumes that all column names are known.
So tidypolars may have to be updated to refuse to convert from a polars LazyFrame to the tidypolars class, and to get the schema when the tidypolars class is created and use it thereafter.

@etiennebacher
Collaborator

My understanding is that the cost of getting the schema has already increased and that they want to convert those active bindings to functions. I'll need to see what effect this has on tidypolars, but I'm not very worried for now.

@eitsupi
Collaborator Author

eitsupi commented Jun 13, 2024

Yes, I overestimated the cost of getting the schema.
Considering that it does not currently cost that much to scan parquet files, etc., the cost of acquiring the schema seems insignificant.

In any case, the active bindings need to be removed (because of them, scanning a path that does not exist yet currently fails in R).
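To illustrate why a binding that is evaluated on access can break a scan of a missing path, here is a minimal sketch in Python rather than R. It is only an analogy: a Python `property` stands in for an R active binding, and the `LazyScan` class, its `columns` property, its `collect_schema` method and the file path are all hypothetical names invented for this example, not the polars or r-polars API.

```python
class LazyScan:
    """Hypothetical stand-in for a lazy frame; not the polars API."""

    def __init__(self, path):
        # No I/O happens here, so construction succeeds even if the
        # path does not exist yet.
        self.path = path

    @property
    def columns(self):
        # Analogue of an active binding: evaluated on every attribute
        # access, including accesses made by generic introspection.
        return self._resolve_schema()

    def collect_schema(self):
        # Explicit method: the schema is resolved only when asked for.
        return self._resolve_schema()

    def _resolve_schema(self):
        # Reads the header line of a CSV file (assumed format).
        with open(self.path) as f:
            return f.readline().strip().split(",")


def snapshot(obj):
    # Mimics a tool that inspects an object by reading every public
    # attribute, as printing or autocompletion helpers might do.
    return {name: getattr(obj, name)
            for name in dir(obj) if not name.startswith("_")}


scan = LazyScan("/nonexistent/data.csv")  # creating the scan is fine

try:
    snapshot(scan)  # touches `columns`, which opens the missing file
    failed = False
except FileNotFoundError:
    failed = True

print("introspection failed:", failed)
```

With `columns` exposed only as a method such as `collect_schema()`, the same introspection would see a callable and never open the file, which is the motivation for replacing such bindings with functions.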

@eitsupi eitsupi added this to the Rewrite milestone Sep 11, 2024