Rewrite based on savvy and rlang #1152
Comments
Thanks for all the work, I'll take a look this weekend after #1147
Significant changes that may affect users are noted in the NEWS file.
I pushed the neo-r-polars' main branch to this repository so that we can start the migration work. For now, the new
Documentation for translation from Python Polars has been pushed.
Thanks, that is very helpful. I noticed that the class names have changed, e.g. from
Similarly, the current
Also, related to the post on Mastodon asking for help, I think it would be helpful to have a checklist (either at the top of this issue or in a new one) with more details on the progress of the rewrite. It doesn't have to be super detailed but simply have an idea of how far we are, where should a new contributor focus, etc. For instance something like this:
Thanks for taking a look at this. I will create the task list in a separate epic issue as soon as I can, as it is definitely needed.
There are two reasons.
No, these are actually different things. That is, in Python:
>>> import polars as pl
>>> pl.lit(1).__class__.__name__
'Expr'
>>> pl.lit(1)._pyexpr.__class__.__name__
'PyExpr'
In R:
> pl$lit(1) |> class()
[1] "polars_expr" "polars_object"
> pl$lit(1)$`_rexpr` |> class()
[1] "PlRExpr"
Unlike the current
I see, thanks for the explanations.
I don't think the style of class names is important. To me, the two main requirements are that they must be clear and unique, and I think
Bottom line, I'd rather keep the current
Sidenote: I didn't find conventions for class names (in general, I feel R code lacks conventions anyway). I also didn't find packages with CamelCase class names (I can't say I searched a lot, though), but some prominent packages do not use snake_case:
class(Matrix::Matrix(0, 3, 2))
#> [1] "dgCMatrix"
#> attr(,"package")
#> [1] "Matrix"
Packages using S4 (like those on Bioconductor) also usually use camel case.
I think the most commonly used style guides are the Tidyverse style guide and Google's style guide (of course, Google publishes style guides for many languages and I believe they are very popular).
https://style.tidyverse.org/syntax.html#object-names
Both of these encourage the use of snake_case for class names.
Continued from #942 and #1126
(The following text is copied from https://github.com/eitsupi/neo-r-polars's README)
Motivation
I have been developing r-polars for over a year, and I felt that a significant rewrite was necessary.
r-polars is a clone of py-polars,
but the package structure is currently quite different.
Therefore, it was difficult to keep up with frequent updates.
I thought that now, around the release of Python Polars 1.0.0, is a good time for a complete rewrite, so I decided to try it.
There are several reasons to rewrite r-polars on both the Rust and R sides.
Rust side
1. With extendr, it is not possible to place multiple impl blocks. (extendr/extendr#538)
2. There is a lot of custom code to use the Result type with extendr, which is quite different from other packages based on extendr. (extendr/extendr#650)
3. The code is difficult to follow because it uses a macro called robj_to for type conversion (at least in rust-analyzer).
About 1 and 2, I expect that switching from extendr to savvy will improve the situation.
For 3, in py-polars and nodejs-polars, a thin Wrap struct wraps other types and processes them with standard From traits etc., which I think makes the code cleaner.
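To make the Wrap idea from point 3 concrete, here is a minimal, self-contained sketch. Only the name Wrap and the use of the standard From trait come from the text above; RValue, DataType, and dtype_of are hypothetical stand-ins invented for illustration, not types or functions from savvy, extendr, or polars.

```rust
// Hypothetical stand-in for a value coming from R (not a savvy type).
#[derive(Debug, Clone, PartialEq)]
pub enum RValue {
    Integer(i32),
    Real(f64),
    Character(String),
}

// Hypothetical stand-in for a Polars data type (not the polars enum).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Float64,
    String,
}

// Thin newtype wrapper. In a real crate this is what allows standard
// conversion traits to be implemented for types defined elsewhere.
pub struct Wrap<T>(pub T);

impl From<&RValue> for Wrap<DataType> {
    fn from(value: &RValue) -> Self {
        let dtype = match value {
            RValue::Integer(_) => DataType::Int32,
            RValue::Real(_) => DataType::Float64,
            RValue::Character(_) => DataType::String,
        };
        Wrap(dtype)
    }
}

// The conversion is an ordinary trait call rather than a macro expansion,
// so editor tooling can jump straight to the impl above.
pub fn dtype_of(value: &RValue) -> DataType {
    Wrap::<DataType>::from(value).0
}
```

Because each conversion is a plain From impl, adding a new source type means adding one more impl block instead of extending a macro.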
R side
1. In py-polars, the strategy is that classes defined on the Rust side (e.g., PyDataFrame) are wrapped by classes defined on the Python side (e.g., DataFrame). In r-polars, a complex strategy is adopted to update classes created by the Rust side/extendr (e.g., RPolarsDataFrame) with a lot of custom code. (This is also related to the fact that extendr makes associated functions of Rust structs members of R classes. savvy does not mix associated functions and methods.)
2. This is also related to the Rust side: in the current r-polars, generic functions like as_polars_series were added later, so there are several places where type conversion from R to Polars is done on the Rust side, making it difficult to understand where the type conversion is done. If type conversion from R to Polars is done with two generic functions, as_polars_series and as_polars_expr, the code will be much simpler and customization from the R side will be possible.
3. Currently, r-polars has its own Result type on the R side, and error handling is done through it. The generated backtrace is quite easy to understand, but it is not necessarily easy to work with when polars is used internally in other packages, for example with testthat::expect_error().
4. Currently, r-polars has no R package dependencies. This is great, but it means the package includes a degraded copy of list2() instead of using the convenient functions in the rlang package. rlang is a lightweight R package, and I feel that it is more beneficial to depend on the convenient functions of rlang than to stick to no dependencies.
1 and 3 are also related to the fact that it is built with extendr, and it seems that switching to savvy is appropriate here as well.
If we abandon the current Result type on the R side, it is natural to use rlang for error handling, so from that perspective, it is reasonable to depend on rlang in 4.
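The layering in point 1 and the single conversion entry point in point 2 can be sketched as follows. This is only an analogy written in Rust: in the actual package the outer layer and the generics live in R, and every name here (RawSeries, Series, AsPolarsSeries) is hypothetical, chosen to echo the names in the text.

```rust
// Hypothetical low-level type, playing the role of a Rust-side class
// such as PyDataFrame.
#[derive(Debug, Clone, PartialEq)]
pub struct RawSeries {
    values: Vec<f64>,
}

impl RawSeries {
    pub fn new(values: Vec<f64>) -> Self {
        RawSeries { values }
    }

    pub fn sum(&self) -> f64 {
        self.values.iter().sum()
    }
}

// User-facing wrapper, playing the role of a host-language class such as
// DataFrame. It owns the raw object and delegates to it.
#[derive(Debug, Clone, PartialEq)]
pub struct Series {
    raw: RawSeries,
}

impl Series {
    pub fn sum(&self) -> f64 {
        self.raw.sum()
    }
}

// One conversion entry point, similar in spirit to as_polars_series:
// every supported input type converts through the same trait.
pub trait AsPolarsSeries {
    fn as_polars_series(&self) -> Series;
}

impl AsPolarsSeries for [f64] {
    fn as_polars_series(&self) -> Series {
        Series { raw: RawSeries::new(self.to_vec()) }
    }
}

impl AsPolarsSeries for [i32] {
    fn as_polars_series(&self) -> Series {
        // Widen integers to f64 so both inputs share one storage type.
        let values = self.iter().map(|&v| f64::from(v)).collect();
        Series { raw: RawSeries::new(values) }
    }
}
```

With this shape, all conversions are found by looking at the implementations of one trait, and a downstream crate can add support for its own types by implementing that trait.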