Cross-listing this issue in arrow and dbplyr, as recommended by @eitsupi.
Column selection with select() and tidyselect helpers produces an unexpected result when multiple arguments are passed to select().
Reprex:
    library(arrow)
    library(tidyverse)

    iris_arrow <- as_arrow_table(iris)

    iris_arrow |>
      select(!ends_with(".Length"), !Sepal.Width) |>
      names()
    #> [1] "Sepal.Width"  "Petal.Width"  "Species"      "Sepal.Length" "Petal.Length"
In this example, even though Sepal.Width is supposed to be excluded, it still appears in the result.
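To check whether the behaviour is specific to arrow, the same selection can be run on a plain data frame with dplyr. This is only a sketch for narrowing things down: as far as the tidyselect documentation describes it, separate arguments to select() are combined like c(), that is, as a union, so two negated selections together can cover every column. The variable names below are illustrative.

```r
library(dplyr)

# Each negated selection on its own, on the plain (non-arrow) iris data frame
not_length <- iris |> select(!ends_with(".Length")) |> names()
not_sepal_width <- iris |> select(!Sepal.Width) |> names()

# Both selections passed as separate arguments to a single select() call
both_args <- iris |> select(!ends_with(".Length"), !Sepal.Width) |> names()

# If the arguments are combined as a union, this should be TRUE
identical(both_args, union(not_length, not_sepal_width))
```

If this returns TRUE, the arrow and duckdb results reported here match what dplyr does on a regular data frame, which would point to tidyselect semantics rather than to the backends.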
A current workaround is to split the selection into separate select() calls chained with the pipe:
    iris_arrow |>
      select(!ends_with(".Length")) |>
      select(!Sepal.Width) |>
      names()
    #> [1] "Petal.Width" "Species"
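A single-call alternative may also be possible. The sketch below uses the plain iris data frame; it assumes that the & operator and c() from the tidyselect selection language behave the same way on arrow tables and dbplyr tables, which has not been verified here. The names via_and and via_c are illustrative.

```r
library(dplyr)

# Intersect the two negated selections with `&` instead of passing
# them as separate arguments
via_and <- iris |>
  select(!ends_with(".Length") & !Sepal.Width) |>
  names()

# Or combine the unwanted columns with c() and negate once
via_c <- iris |>
  select(!c(ends_with(".Length"), Sepal.Width)) |>
  names()

via_and
via_c
```

On a plain data frame both forms keep only the columns that satisfy both exclusions, which is the result the two-step workaround produces.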
The same occurs when data is transferred to duckdb:
    iris_db <- iris_arrow |> to_duckdb()

    iris_db |>
      select(!ends_with(".Length"), !Sepal.Width) |>
      colnames() # why names() doesn't work?
    #> [1] "Sepal.Width"  "Petal.Width"  "Species"      "Sepal.Length" "Petal.Length"
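On the names() question in the code comment: a likely explanation, offered as an assumption rather than a confirmed answer, is that a lazy dbplyr table is a list-like object describing a query, so names() returns the names of that object's internal fields, while colnames() goes through the dimnames() method that dbplyr provides. The sketch uses dbplyr::lazy_frame(), which needs no database connection; the column names x and y are made up for illustration.

```r
library(dplyr)
library(dbplyr)

# A lazy table backed by a simulated connection (no database required)
lazy_tbl <- lazy_frame(x = 1, y = 2)

# names() reports the fields of the underlying object, not the columns
names(lazy_tbl)

# colnames() returns the column names of the lazy query
colnames(lazy_tbl)
```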
SessionInfo:
    sessionInfo()
    R version 4.4.1 (2024-06-14)
    Platform: aarch64-apple-darwin20
    Running under: macOS Sonoma 14.6.1

    Matrix products: default
    BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
    LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

    locale:
    [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    time zone: America/New_York
    tzcode source: internal

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
     [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   dplyr_1.1.4     purrr_1.0.2     readr_2.1.5     tidyr_1.3.1
     [8] tibble_3.2.1    ggplot2_3.5.1   tidyverse_2.0.0 arrow_16.1.0

    loaded via a namespace (and not attached):
     [1] bit_4.0.5         gtable_0.3.5      compiler_4.4.1    tidyselect_1.2.1  assertthat_0.2.1  scales_1.3.0
     [7] R6_2.5.1          generics_0.1.3    munsell_0.5.1     DBI_1.2.3         pillar_1.9.0      tzdb_0.4.0
    [13] rlang_1.1.4       utf8_1.2.4        stringi_1.8.4     bit64_4.0.5       timechange_0.3.0  cli_3.6.3
    [19] withr_3.0.1       magrittr_2.0.3    grid_4.4.1        dbplyr_2.5.0      hms_1.1.3         lifecycle_1.0.4
    [25] vctrs_0.6.5       glue_1.7.0        data.table_1.15.4 duckdb_1.0.0-2    fansi_1.0.6       colorspace_2.1-1
    [31] tools_4.4.1       pkgconfig_2.0.3
Component(s): R