Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a base R check_empty_dots() #1144

Closed
wants to merge 1 commit into from
Closed

Conversation

etiennebacher
Copy link
Collaborator

Related to #942

It seems unnecessary to me to import rlang to check for empty dots. Also rlang::check_empty_dots() is very expensive on my machine, about 200ms, but I didn't find why (if it's due to rlang update, RStudio, etc.). I'm not completely against having rlang but if we can have easy base R alternatives I don't think it's worth importing it.

This PR is just an example of what it would look like, for example with $top_k():

pl$DataFrame(a = c(6, 1, 0, NA, Inf, NaN))$select(pl$col("a")$top_k(5, nulls_last = TRUE, FALSE, 1))
Error: Execution halted with the following contexts
   0: In R: in $select()
   0: During function call [pl$DataFrame(a = c(6, 1, 0, NA, Inf, NaN))$select(pl$col("a")$top_k(5, 
          nulls_last = TRUE, FALSE, 1))]
   1: `...` must be empty. Problematic values: FALSE, 1

@eitsupi
Copy link
Collaborator

eitsupi commented Jun 14, 2024

You may want to see check_dots_empty0.

we can have easy base R alternatives I don't think it's worth importing it.

Your implementation does not handle named arguments properly.

I really dislike having to reinvent the wheel like this.
While rlang installation is fairly inexpensive, rlang contains C code and is not easy to copy.

Of course, it is easy to just check ... is empty, but it is really hard to generate the proper error messages, and I don't see the value in not importing rlang without giving up that advantage.

@etiennebacher
Copy link
Collaborator Author

etiennebacher commented Jun 14, 2024

The main issue for now is that I have perf issues with rlang, even with check_dots_empty0(): r-lib/rlang#1718

This is rare, but it's the kind of thing we can avoid by having small replacements for those functions. Again I'm not completely opposed to importing rlang but I'd like to know what are the features we're going to use before importing it. We have a mechanism to report Rust errors that works and any refactor is quite tedious considering the size of the code base.

Your implementation does not handle named arguments properly.

Not sure I follow, do you have an example? Below it seems that my implementation and rlang work or error on the same cases.

library(rlang)
options(polars.do_not_repeat_call = TRUE)

my_check_dots_empty <- function(...) {
  dots <- eval(substitute(alist(...)))
  if (length(dots) > 0) {
    polars:::Err_plain("`...` must be empty. Problematic values:", toString(dots)) |>
      polars:::unwrap()
  }
}

f_rlang <- function(..., xyz = 1) {
  check_dots_empty0(...)
  xyz
}
f_base <- function(..., xyz = 1) {
  my_check_dots_empty(...)
  xyz
}


f_rlang(xyz = 1)
#> [1] 1
f_base(xyz = 1)
#> [1] 1
f_rlang(xy = 1)
#> Error in `f_rlang()`:
#> ! `...` must be empty.
#> ✖ Problematic argument:
#> • xy = 1
f_base(xy = 1)
#> Error: Execution halted with the following contexts
#>    1: `...` must be empty. Problematic values: 1
f_rlang(xyz = 1, b =)
#> Error in `f_rlang()`:
#> ! `...` must be empty.
#> ✖ Problematic argument:
#> • b = <empty>
f_base(xyz = 1, b =)
#> Error: Execution halted with the following contexts
#>    1: `...` must be empty. Problematic values:

@eitsupi
Copy link
Collaborator

eitsupi commented Jun 14, 2024

Your implementation does not handle named arguments properly.

Not sure I follow, do you have an example? Below it seems that my implementation and rlang work or error on the same cases.

Sorry for my lack of explanation. I meant to point out that the argument names are not included in the error message.

@eitsupi
Copy link
Collaborator

eitsupi commented Jun 15, 2024

The main issue for now is that I have perf issues with rlang, even with check_dots_empty0(): r-lib/rlang#1718

I do not understand the importance of speed when errors are expected to occur.
I saw that issue, but rlang is faster if there are no errors, right?

@etiennebacher
Copy link
Collaborator Author

etiennebacher commented Jun 15, 2024

I do not understand the importance of speed when errors are expected to occur.

True, it's not crucial (except if one plays a lot with try()).

Just wondering: supposing we move to savvy (#1126), would it make sense to refactor the way errors are structured based on how savvy handles them? If so, maybe at that point importing rlang can be also useful for clearer error messages.

To clarify, I find the current error messages okay in the sense that usually all the information is there, but they are quite dense and different from more standard R messages. Having something more user friendly if we're going to refactor the internals anyway would be nice.


Closing this since this was just a test and we can still discuss.

@eitsupi
Copy link
Collaborator

eitsupi commented Jun 15, 2024

Just wondering: supposing we move to savvy (#1126), would it make sense to refactor the way errors are structured based on how savvy handles them? If so, maybe at that point importing rlang can be also useful for clearer error messages.

As I mentioned in #1126, I think you would benefit from using rlang because the savvy error messages are from standard R (similar to the tidyverse package on top of the cpp11 package).

See also the following issues for background on why r-polars needed to implement its own error handling.
extendr/extendr#278
PRQL/prqlc-r#46 (As for prqlr, we removed the code for error handling when we switched to savvy to support webR. But since polars can't be WASM compiled, it's irrelevant here.)

By the way, did you have any problems with tidypolars with a mix of rlang errors and errors coming from r-polars?

@etiennebacher
Copy link
Collaborator Author

By the way, did you have any problems with tidypolars with a mix of rlang errors and errors coming from r-polars?

I didn't have any technical problems but the style of errors clearly depends on whether it comes from tidypolars (e.g using a function that isn't translated) or from polars (e.g passing a float column to a string function):

library(dplyr, warn.conflicts = FALSE)
library(tidypolars)
options(polars.do_not_repeat_call = TRUE)

test <- as_polars_df(iris)

### tidypolars doesn't know the function
test |> 
  mutate(x = data.table::chorder(Species))
#> Error in `mutate()`:
#> ! `tidypolars` doesn't know how to translate this function: `data.table::chorder()`
### tidypolars knows the function but error from polars because wrong input
test |> 
  mutate(x = stringr::str_count(Sepal.Length))
#> Error: Execution halted with the following contexts
#>    0: In R: in $with_columns()
#>    1: Encountered the following error in Rust-Polars:
#>          invalid series dtype: expected `String`, got `f64`

@etiennebacher etiennebacher deleted the check-dots-empty branch June 16, 2024 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants