@etiennebacher Perhaps the inability to retrieve column listings at low cost would cause a serious performance penalty for tidypolars?
My understanding is that they changed it to allow us to create a LazyFrame with no data source on the local computer for their service (Polars cloud).
As a result, the LazyFrame does not know which columns exist until collect() is run.
I think dplyr assumes that all column names are known.
So perhaps tidypolars may have to be updated to refuse to convert from a polars LazyFrame to the tidypolars class, and to get the schema once when the tidypolars class is created and use it thereafter.
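To make the "get the schema once and reuse it" idea concrete, here is a minimal sketch in plain Python. It is not tidypolars or polars code: `FakeLazyFrame`, `SchemaCachingWrapper`, `schema_calls`, and `check_columns` are all hypothetical names made up for illustration, and the fake `collect_schema()` only stands in for whatever call resolves the schema in the real library.

```python
class FakeLazyFrame:
    """Hypothetical stand-in for a lazy frame whose schema is costly to resolve."""

    def __init__(self, schema):
        self._schema = dict(schema)
        self.schema_calls = 0  # counts how many times the schema was resolved

    def collect_schema(self):
        # In a real lazy frame this step might read file metadata or
        # contact a remote service; here it only bumps a counter.
        self.schema_calls += 1
        return dict(self._schema)


class SchemaCachingWrapper:
    """Hypothetical wrapper that resolves the schema once, at construction."""

    def __init__(self, lazy_frame):
        self._lf = lazy_frame
        # Single up-front resolution; later lookups use this cached copy.
        self._schema = lazy_frame.collect_schema()

    @property
    def columns(self):
        # Served from the cache, so no further resolution is triggered.
        return list(self._schema)

    def check_columns(self, *names):
        # Validate column names against the cached schema only.
        missing = [n for n in names if n not in self._schema]
        if missing:
            raise KeyError(f"unknown columns: {missing}")


lf = FakeLazyFrame({"a": "Int64", "b": "String"})
wrapped = SchemaCachingWrapper(lf)
wrapped.check_columns("a", "b")
print(wrapped.columns, lf.schema_calls)
```

The trade-off in this sketch is that the cached schema goes stale as soon as the underlying frame is transformed, so a real wrapper would have to refresh it (or build a new wrapper) after every operation that changes the columns.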
My understanding is that the cost of getting the schema has already increased and that they want to convert those active bindings to functions. I'll need to see the effects there are on tidypolars but I'm not very worried for now.
Yes, I overestimated the cost of getting the schema.
Considering that it does not currently cost that much to scan parquet files, etc., the cost of acquiring the schema seems insignificant.
In any case, the active bindings need to be removed (because of them, scans of paths that don't currently exist now fail in R).
See pola-rs/polars#16328