Add skip_absent to setcolorder #6044

Merged on Dec 3, 2024 (24 commits). Showing changes from 14 commits.
2 changes: 2 additions & 0 deletions NEWS.md
@@ -36,6 +36,8 @@

10. `measure` now supports user-specified `cols` argument, which can be useful to specify a subset of columns to `melt`, without having to use a regex, [#5063](https://github.com/Rdatatable/data.table/issues/5063). Thanks to @UweBlock and @Henrik-P for reporting, and @tdhock for the PR.

+ 11. `setcolorder()` gains `skip_absent` to drop columns that aren't present, [#6044, #6068](https://github.com/Rdatatable/data.table/pull/6044). Default behavior (`skip_absent=FALSE`) remains unchanged, i.e. unrecognized columns result in an error. Thanks to @sluga for the suggestion and @sluga & @Nj221102 for the PRs.

## BUG FIXES

1. `unique()` returns a copy in the case when `nrows(x) <= 1` instead of a mutable alias, [#5932](https://github.com/Rdatatable/data.table/pull/5932). This is consistent with existing `unique()` behavior when the input has no duplicates but more than one row. Thanks to @brookslogan for the report and @dshemetov for the fix.
5 changes: 3 additions & 2 deletions R/data.table.R
@@ -2684,15 +2684,16 @@ setnames = function(x,old,new,skip_absent=FALSE) {
invisible(x)
}

- setcolorder = function(x, neworder=key(x), before=NULL, after=NULL) # before/after #4358
+ setcolorder = function(x, neworder=key(x), before=NULL, after=NULL,skip_absent=FALSE) # before/after #4358
{
if (is.character(neworder) && anyDuplicated(names(x)))
stopf("x has some duplicated column name(s): %s. Please remove or rename the duplicate(s) and try again.", brackify(unique(names(x)[duplicated(names(x))])))
if (!is.null(before) && !is.null(after))
stopf("Provide either before= or after= but not both")
if (length(before)>1L || length(after)>1L)
stopf("before=/after= accept a single column name or number, not more than one")
- neworder = colnamesInt(x, neworder, check_dups=FALSE) # dups are now checked inside Csetcolorder below
+ neworder = colnamesInt(x, neworder, check_dups=FALSE, skip_absent=skip_absent) # dups are now checked inside Csetcolorder below
+ neworder = neworder[neworder != 0] # tests 498.11, 498.13 fail w/o this
if (length(before))
neworder = c(setdiff(seq_len(colnamesInt(x, before) - 1L), neworder), neworder)
if (length(after))
9 changes: 9 additions & 0 deletions inst/tests/tests.Rraw
@@ -1585,6 +1585,15 @@ test(498.03, setcolorder(DT, 1, after=3), data.table(b=2, c=3, a=1))
test(498.04, setcolorder(DT, 3, before=1), data.table(a=1, b=2, c=3))
test(498.05, setcolorder(DT, 1, before=1, after=1), error="Provide either before= or after= but not both")
test(498.06, setcolorder(DT, 1, before=1:2), error="before=/after= accept a single column name or number, not more than one")
+ # skip_absent
+ test(498.07, setcolorder(DT, skip_absent='TRUE'), error='TRUE or FALSE')
+ test(498.08, setcolorder(DT, skip_absent= 1), error='TRUE or FALSE')
+ test(498.09, setcolorder(DT, skip_absent= c(TRUE, FALSE)), error='TRUE or FALSE')
+ test(498.10, setcolorder(DT, c('d', 'c', 'b', 'a')), error='non-existing column')
+ test(498.11, setcolorder(DT, c('d', 'c', 'b', 'a'), skip_absent=TRUE), data.table(c=3, b=2, a=1))
+ test(498.12, setcolorder(DT, 4:1), error='non-existing column')
+ test(498.13, setcolorder(DT, 4:1, skip_absent=TRUE), data.table(a=1, b=2, c=3))
+ test(498.14, setcolorder(DT, c(1, 1, 2, 3), skip_absent=TRUE), error='!=')
Review comment (Member):

Please add a test for setcolorder(DT, c('a', 'b', 'd'), skip_absent=TRUE).

The NEWS item currently reads like 'c' will be dropped from the output, which would be bad -- I would expect the above to be equivalent to setcolorder(DT, c("a", "b")).
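
A minimal sketch of the check requested above, assuming an installed data.table version that already includes `skip_absent` for `setcolorder()`. The objects `DT` and `DT2` are illustrative and are not part of the test suite; the expectation is that the absent name `'d'` is ignored and the unlisted column `c` stays in the table.

```r
library(data.table)

# Three columns; 'd' does not exist in DT.
DT <- data.table(a = 1, b = 2, c = 3)

# With skip_absent=TRUE the absent name 'd' should be ignored,
# so only 'a' and 'b' take part in the reordering.
setcolorder(DT, c("a", "b", "d"), skip_absent = TRUE)

# Reference call without the absent name.
DT2 <- data.table(a = 1, b = 2, c = 3)
setcolorder(DT2, c("a", "b"))

# Column 'c' is expected to remain present in both tables.
print(names(DT))
print(identical(names(DT), names(DT2)))
```

If both calls leave the columns as `a`, `b`, `c`, then "skipped" refers to entries of `neworder`, not to columns of `x`, which is the reading the comment asks the NEWS item to make clear.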


# test first group listens to nomatch when j uses join inherited scope.
x <- data.table(x=c(1,3,8),x1=10:12, key="x")
3 changes: 2 additions & 1 deletion man/setcolorder.Rd
@@ -9,12 +9,13 @@
}

\usage{
- setcolorder(x, neworder=key(x), before=NULL, after=NULL)
+ setcolorder(x, neworder=key(x), before=NULL, after=NULL,skip_absent=FALSE)
}
\arguments{
\item{x}{ A \code{data.table}. }
\item{neworder}{ Character vector of the new column name ordering. May also be column numbers. If \code{length(neworder) < length(x)}, the specified columns are moved in order to the "front" of \code{x}. By default, \code{setcolorder} without a specified \code{neworder} moves the key columns in order to the "front" of \code{x}. }
\item{before, after}{ If one of them (not both) was provided with a column name or number, \code{neworder} will be inserted before or after that column. }
+ \item{skip_absent}{ Logical, default \code{FALSE}. If \code{TRUE}, no error is thrown if \code{neworder} includes columns not present in \code{x}, which are silently dropped. }
}
\details{
To reorder \code{data.table} columns, the idiomatic way is to use \code{setcolorder(x, neworder)}, instead of doing \code{x <- x[, ..neworder]} (or \code{x <- x[, neworder, with=FALSE]}). This is because the latter makes an entire copy of the \code{data.table}, which may be unnecessary in most situations. \code{setcolorder} also allows column numbers instead of names for \code{neworder} argument, although we recommend using names as a good programming practice.
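
The by-reference point in the details paragraph can be illustrated with a short sketch. It uses `address()` from data.table to compare object addresses before and after; the variable names (`addr_before`, `same_object`, `copied`) are illustrative only and do not come from the PR.

```r
library(data.table)

DT <- data.table(a = 1, b = 2, c = 3)
addr_before <- address(DT)

# Reorder in place: 'c' and 'a' move to the front, 'b' follows.
setcolorder(DT, c("c", "a"))
same_object <- identical(address(DT), addr_before)

# Selecting columns with the .. prefix builds a new data.table instead.
neworder <- c("b", "a", "c")
DT2 <- DT[, ..neworder]
copied <- !identical(address(DT2), addr_before)

print(names(DT))
print(c(same_object = same_object, copied = copied))
```

Here `DT` keeps its address after `setcolorder()`, while `DT2` is a separate object, which is why the documentation recommends `setcolorder()` when only the column order needs to change.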