Setcolororder scrambling the dataset #6171
Comments
Could you provide a reproducible example? I'm unable to recreate what I understand the issue to be with version 1.15.4:
library(data.table)
DT = data.table(
'abc'=letters,
'def'=LETTERS,
'ghi'=1L:26L
)
str(DT)
#> Classes 'data.table' and 'data.frame': 26 obs. of 3 variables:
#> $ abc: chr "a" "b" "c" "d" ...
#> $ def: chr "A" "B" "C" "D" ...
#> $ ghi: int 1 2 3 4 5 6 7 8 9 10 ...
#> - attr(*, ".internal.selfref")=<externalptr>
setcolorder(DT, c('def', 'ghi', 'abc'))
str(DT)
#> Classes 'data.table' and 'data.frame': 26 obs. of 3 variables:
#> $ def: chr "A" "B" "C" "D" ...
#> $ ghi: int 1 2 3 4 5 6 7 8 9 10 ...
#> $ abc: chr "a" "b" "c" "d" ...
#> - attr(*, ".internal.selfref")=<externalptr>
Created on 2024-06-08 with reprex v2.1.0
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.3 (2024-02-29)
#> os macOS Sonoma 14.5
#> system x86_64, darwin23.2.0
#> ui unknown
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2024-06-08
#> pandoc 3.2 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cli 3.6.2 2023-12-11 [1] CRAN (R 4.3.3)
#> data.table * 1.15.4 2024-03-30 [1] CRAN (R 4.3.3)
#> digest 0.6.35 2024-03-11 [1] CRAN (R 4.3.3)
#> evaluate 0.23 2023-11-01 [1] CRAN (R 4.3.3)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.3)
#> fs 1.6.4 2024-04-25 [1] CRAN (R 4.3.3)
#> glue 1.7.0 2024-01-09 [1] CRAN (R 4.3.3)
#> htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.3.3)
#> knitr 1.46 2024-04-06 [1] CRAN (R 4.3.3)
#> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.3.3)
#> reprex 2.1.0 2024-01-11 [1] CRAN (R 4.3.3)
#> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.3.3)
#> rmarkdown 2.26 2024-03-05 [1] CRAN (R 4.3.3)
#> rstudioapi 0.16.0 2024-03-24 [1] CRAN (R 4.3.3)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.3)
#> withr 3.0.0 2024-01-16 [1] CRAN (R 4.3.3)
#> xfun 0.43 2024-03-25 [1] CRAN (R 4.3.3)
#> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.3.3)
#>
#> [1] /usr/local/lib/R/4.3/site-library
#> [2] /usr/local/Cellar/r/4.3.3/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── |
OK, that's strange. Your example runs fine on my computer. Still, I went back, double-checked my data, and can confirm that for my dataset setcolorder truly scrambles the columns.
Any idea what else I can provide? Unfortunately, I cannot move the data. |
This
library(data.table)
DT = data.table(
'abc'=letters,
'def'=LETTERS,
'ghi'=1L:26L
)
str(DT)
#> Classes 'data.table' and 'data.frame': 26 obs. of 3 variables:
#> $ abc: chr "a" "b" "c" "d" ...
#> $ def: chr "A" "B" "C" "D" ...
#> $ ghi: int 1 2 3 4 5 6 7 8 9 10 ...
#> - attr(*, ".internal.selfref")=<externalptr>
setcolorder(DT, c('def', 'ghi', 'abc'))
str(DT)
#> Classes 'data.table' and 'data.frame': 26 obs. of 3 variables:
#> $ def: chr "A" "B" "C" "D" ...
#> $ ghi: int 1 2 3 4 5 6 7 8 9 10 ...
#> $ abc: chr "a" "b" "c" "d" ...
#> - attr(*, ".internal.selfref")=<externalptr>
Created on 2024-06-08 with reprex v2.1.0
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.3 (2024-02-29)
#> os macOS Sonoma 14.5
#> system x86_64, darwin23.2.0
#> ui unknown
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2024-06-08
#> pandoc 3.2 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cli 3.6.2 2023-12-11 [1] CRAN (R 4.3.3)
#> data.table * 1.15.99 2024-06-08 [1] local
#> digest 0.6.35 2024-03-11 [1] CRAN (R 4.3.3)
#> evaluate 0.23 2023-11-01 [1] CRAN (R 4.3.3)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.3)
#> fs 1.6.4 2024-04-25 [1] CRAN (R 4.3.3)
#> glue 1.7.0 2024-01-09 [1] CRAN (R 4.3.3)
#> htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.3.3)
#> knitr 1.46 2024-04-06 [1] CRAN (R 4.3.3)
#> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.3.3)
#> reprex 2.1.0 2024-01-11 [1] CRAN (R 4.3.3)
#> rlang 1.1.3 2024-01-10 [1] CRAN (R 4.3.3)
#> rmarkdown 2.26 2024-03-05 [1] CRAN (R 4.3.3)
#> rstudioapi 0.16.0 2024-03-24 [1] CRAN (R 4.3.3)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.3)
#> withr 3.0.0 2024-01-16 [1] CRAN (R 4.3.3)
#> xfun 0.43 2024-03-25 [1] CRAN (R 4.3.3)
#> yaml 2.3.8 2023-12-11 [1] CRAN (R 4.3.3)
#>
#> [1] /usr/local/lib/R/4.3/site-library
#> [2] /usr/local/Cellar/r/4.3.3/lib/R/library
#>
#> ────────────────────────────────────────────────────────────────────────────── |
Setting |
The original post says "after the last developer update". Does that mean GitHub master? If so, this could be related to #6068. |
Sorry for the trouble; it's not simple to reproduce it here (the dataset has a lot of variables).
So it must be related to the fact that I have a very large number of columns, and some of them may be causing the problem. It is not obvious to me what that might be; I am experimenting a bit to see whether I can figure it out. If I learn anything more I will let you know. Thanks! |
would be useful if you could create a data set with 1 row and many many columns that reproduces your issue. |
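A minimal sketch of such a data set, for anyone trying to reproduce the report. The column count (500) and the `V1`...`V500` names are assumptions chosen for illustration, not taken from the reporter's data. Each column holds its own index, so any mismatch between a name and its content after `setcolorder()` would be easy to spot.

```r
library(data.table)

# Hypothetical wide table: 1 row, 500 columns named V1..V500,
# where each column's single value is its own index.
n_cols <- 500L
wide <- as.data.table(as.list(seq_len(n_cols)))
setnames(wide, paste0("V", seq_len(n_cols)))

# Snapshot of name -> value before reordering.
before <- unlist(wide[1L])

# Reorder the columns with an arbitrary permutation of the names.
set.seed(1)
new_order <- sample(names(wide))
setcolorder(wide, new_order)

# Snapshot of name -> value after reordering.
after <- unlist(wide[1L])

# The names should follow the requested order, and every name
# should still point to the value it had before the call.
stopifnot(identical(names(wide), new_order))
stopifnot(identical(after[names(before)], before))
```

If either `stopifnot()` call fails on a given installation, that would be a compact reproduction to attach to the issue.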
Yes, can you reproduce this issue by doing the following?
# ... other code ...
dataset <- dataset[0]
setcolorder(dataset, ...) # the same setcolorder() call
If so, hopefully you're comfortable sharing at least your column names. Another suggestion: anonymize the data like so:
anonymized_dataset <- dataset |>
  lapply(\(x) vector(typeof(x), length(x))) |>
  setDT()
setcolorder(anonymized_dataset, ...)
Some more care could be taken to reproduce common types like |
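The zero-row check suggested above could be sketched as follows. The `dataset` here is a hypothetical stand-in with made-up column names (`id`, `x`, `y`, `z`); the real call would use the reporter's own table and the same `setcolorder()` arguments. The sketch compares the names and the per-name column types before and after the call.

```r
library(data.table)

# Hypothetical stand-in for the real data set.
dataset <- data.table(
  id = 1:3,
  x  = c(0.1, 0.2, 0.3),
  y  = letters[1:3],
  z  = c(TRUE, FALSE, NA)
)

# Zero-row copy: same column names and types, no data.
empty <- dataset[0]
types_before <- vapply(empty, typeof, character(1))

# Partial reorder, as in the original report (only some columns named).
setcolorder(empty, c("z", "y", "x"))

# Columns named in the call come first; the remaining columns
# follow in their original relative order.
expected <- c("z", "y", "x", "id")
stopifnot(identical(names(empty), expected))

# Each name should still be attached to a column of the same type.
types_after <- vapply(empty, typeof, character(1))
stopifnot(identical(types_after[names(types_before)], types_before))
```

A type mismatch for some name after the call would suggest that names and contents had become detached, which is the symptom described in the report.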
Hi @eacabbi any updates on this? I think Michael had a good suggestion for how to share a more reproducible example if it's possible to share very minimal information. |
A last suggestion is to please try re-starting and re-installing R. Possibly your installation got into a bad state. Please re-open if you're still hitting the issue & can provide some new info towards reproduction. |
Recently, after the last developer update, data.table started behaving strangely. Among the issues I encountered, setcolorder no longer works as intended in my code.
setcolorder(dataset,c("x","y","z"))
did not reorder the columns of the dataset as requested; it only renamed the initial columns without moving their contents. As a result, the column that was second in the dataset before the call is now named "y" but does not hold the right contents.
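To make the reported symptom concrete, here is a small sketch contrasting the documented behaviour of `setcolorder()` with what a pure rename looks like. The table `DT` and its columns `a`, `b`, `c` are made up for illustration, and `setnames()` is used only to mimic the symptom; nothing here implies that `setcolorder()` calls it internally.

```r
library(data.table)

DT <- data.table(a = 1:3, b = c("p", "q", "r"), c = c(TRUE, FALSE, TRUE))

# Expected behaviour: setcolorder() moves whole columns, so each
# name stays attached to its own content.
reordered <- copy(DT)
setcolorder(reordered, c("b", "c", "a"))

# The reported symptom resembles a rename instead: the names change,
# but the underlying content stays in its original position.
renamed <- copy(DT)
setnames(renamed, c("b", "c", "a"))

# After a correct reorder, column "b" holds the same values as before.
reorder_ok <- identical(reordered$b, DT$b)

# After a rename, "b" labels what used to be column "a".
rename_mismatch <- !identical(renamed$b, DT$b)
```

Comparing each column by name before and after the call, as done with `identical()` here, is one way to tell the two situations apart on real data.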