diff --git a/NAMESPACE b/NAMESPACE index e754bc4a..bb906a65 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -60,6 +60,7 @@ export(imap_dfr) export(imap_int) export(imap_lgl) export(imap_raw) +export(imap_vec) export(imodify) export(insistently) export(invoke) diff --git a/NEWS.md b/NEWS.md index 6b952598..a9a63ef2 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,12 +1,16 @@ # purrr (development version) * Added a new keep_empty argument to `list_c()`, `list_cbind()`, and `list_rbind()` which will keep empty elements as `NA` in the returned data frame (@SokolovAnatoliy, #1096). +* `list_transpose()` now works with data.frames (@KimLopezGuell, #1109). +* Added `imap_vec()` (#1084) +* `list_transpose()` inspects all elements to determine the correct + template if it's not provided by the user (#1128, @krlmlr). # purrr 1.0.2 * Fixed valgrind issue. -* Deprecation infrastructure in `map_chr()` now has much less overhead +* Deprecation infrastructure in `map_chr()` now has much less overhead leading to improved performance (#1089). * purrr now requires R 3.5.0. @@ -14,11 +18,11 @@ # purrr 1.0.1 * As of purrr 1.0.0, the `map()` family of functions wraps all errors generated - by `.f` inside an wrapper error that tracks the iteration index. As of purrr - 1.0.1, this error now has a custom class (`purrr_error_indexed`), + by `.f` inside an wrapper error that tracks the iteration index. As of purrr + 1.0.1, this error now has a custom class (`purrr_error_indexed`), `location` and `name` fields, and is documented in `?purrr_error_indexed` (#1027). - + * `map()` errors with named inputs also report the name of the element that errored. @@ -41,19 +45,19 @@ See #768 for more information. * `update_list()` (#858) and `rerun()` (#877), and the use of tidyselect - with `map_at()` and friends (#874) have been deprecated. These functions - use some form of non-standard evaluation which we now believe is a poor + with `map_at()` and friends (#874) have been deprecated. These functions + use some form of non-standard evaluation which we now believe is a poor fit for purrr. * The `lift_*` family of functions has been deprecated. We no longer believe - these to be a good fit for purrr because they rely on a style of function + these to be a good fit for purrr because they rely on a style of function manipulation that is very uncommon in R code (#871). -* `prepend()`, `rdunif()`, `rbernoulli()`, `when()`, and `list_along()` have +* `prepend()`, `rdunif()`, `rbernoulli()`, `when()`, and `list_along()` have all been deprecated (#925). It's now clear that they don't align with the core purpose of purrr. -* `splice()` is deprecated because we no longer believe that automatic +* `splice()` is deprecated because we no longer believe that automatic splicing makes for good UI. Instead use `list2()` + `!!!` or `list_flatten()` (#869). @@ -62,33 +66,33 @@ * Use of map functions with expressions, calls, and pairlists has been deprecated (#961). -* All map `_raw()` variants have been deprecated because they are of limited +* All map `_raw()` variants have been deprecated because they are of limited use and you can now use `map_vec()` instead (#903). * In `map_chr()`, automatic conversion from logical, integer, and double to - character is now deprecated. Use an explicit `as.character()` if needed + character is now deprecated. Use an explicit `as.character()` if needed (#904). -* Errors from `.f` are now wrapped in an additional class that gives +* Errors from `.f` are now wrapped in an additional class that gives information about where the error occurred (#945). ### Deprecation next steps -* `as_function()` and the `...f` argument to `partial()` are no longer +* `as_function()` and the `...f` argument to `partial()` are no longer supported. They have been defunct for quite some time. * Soft deprecated functions: `%@%`, `reduce_right()`, `reduce2_right()`, - `accumulate_right()` are now fully deprecated. Similarly, the + `accumulate_right()` are now fully deprecated. Similarly, the `.lazy`, `.env`, and `.first` arguments to `partial()`, - and the `.right` argument to `detect()` and `detect_index()` + and the `.right` argument to `detect()` and `detect_index()` are fully deprecated. Removing elements with `NULL` in `list_modify()` and `list_merge()` is now fully deprecated. * `is_numeric()` and `is_scalar_numeric()` have been removed. They have been deprecated since purrr 0.2.3 (Sep 2017). -* `invoke_*()` is now deprecated. It was superseded in 0.3.0 (Jan 2019) and - 3.5 years later, we have decided to deprecate it as part of the API +* `invoke_*()` is now deprecated. It was superseded in 0.3.0 (Jan 2019) and + 3.5 years later, we have decided to deprecate it as part of the API refinement in the 1.0.0 release. * `map_call()` has been removed. It was made defunct in 0.3.0 (Jan 2019). @@ -116,8 +120,8 @@ (#894). * purrr now uses the base pipe (`|>`) and anonymous function short hand (`\(x)`), - in all examples. This means that examples will no longer work in R 4.0 and - earlier so in those versions of R, the examples are automatically converted + in all examples. This means that examples will no longer work in R 4.0 and + earlier so in those versions of R, the examples are automatically converted to a regular section with a note that they might not work (#936). * When map functions fail, they now report the element they failed at (#945). @@ -135,17 +139,17 @@ * New `list_transpose()` which automatically simplifies if possible (#875). * `accumulate()` and `accumulate2()` now both simplify the output if possible - using vctrs. New arguments `simplify` and `ptype` allow you to control the + using vctrs. New arguments `simplify` and `ptype` allow you to control the details of simplification (#774, #809). -* `flatten()` and friends are superseded in favour of `list_flatten()`, +* `flatten()` and friends are superseded in favour of `list_flatten()`, `list_c()`, `list_cbind()`, and `list_rbind()`. -* `*_dfc()` and `*_dfr()` have been superseded in favour of using the +* `*_dfc()` and `*_dfr()` have been superseded in favour of using the appropriate map function along with `list_rbind()` or `list_cbind()` (#912). * `simplify()`, `simplify_all()`, and `as_vector()` have been superseded in - favour of `list_simplify()`. It provides a more consistent definition of + favour of `list_simplify()`. It provides a more consistent definition of simplification (#900). * `transpose()` has been superseded in favour of `list_transpose()` (#875). @@ -155,16 +159,16 @@ * `_lgl()`, `_int()`, `_int()`, and `_dbl()` now use the same (strict) coercion methods as vctrs (#904). This means that: - - * `map_chr(TRUE, identity)`, `map_chr(0L, identity)`, and - `map_chr(1L, identity)` are deprecated because we now believe that - converting a logical/integer/double to a character vector should require + + * `map_chr(TRUE, identity)`, `map_chr(0L, identity)`, and + `map_chr(1L, identity)` are deprecated because we now believe that + converting a logical/integer/double to a character vector should require an explicit coercion. - - * `map_int(1.5, identity)` now fails because we believe that silently - truncating doubles to integers is dangerous. But note that + + * `map_int(1.5, identity)` now fails because we believe that silently + truncating doubles to integers is dangerous. But note that `map_int(1, identity)` still works since no numeric precision is lost. - + * `map_int(c(TRUE, FALSE), identity)`, `map_dbl(c(TRUE, FALSE), identity)`, `map_lgl(c(1L, 0L), identity)` and `map_lgl(c(1, 0), identity)` now succeed because 1/TRUE and 0/FALSE should be interchangeable. @@ -185,7 +189,7 @@ * `vec_depth()` is now `pluck_depth()` and works with more types of input (#818). -* `pluck()` now requires indices to be length 1 (#813). It also now reports +* `pluck()` now requires indices to be length 1 (#813). It also now reports the correct type if you supply an unexpected index. * `pluck()` now accepts negative integers, indexing from the right (#603). @@ -213,7 +217,7 @@ * New `list_assign()` which is similar to `list_modify()` but doesn't work recursively (#822). -* `list_modify()` no longer recurses into data frames (and other objects built +* `list_modify()` no longer recurses into data frames (and other objects built on top of lists that are fundamentally non-list like) (#810). You can revert to the previous behaviour by setting `.is_node = is.list`. @@ -227,8 +231,8 @@ * `modify_depth()` is no longer a generic. This makes it more consistent with `map_depth()`. -* `map_depth()` and `modify_depth()` have a new `is_node` argument that - allows you to control what counts as a level. The default uses +* `map_depth()` and `modify_depth()` have a new `is_node` argument that + allows you to control what counts as a level. The default uses `vec_is_list()` to avoid recursing into rich S3 objects like linear models or data.frames (#958, #920). @@ -239,9 +243,9 @@ * `possibly()` now defaults `otherwise` to NULL. -* `modify_if(.else)` is now actually evaluated for atomic vectors (@mgirlich, +* `modify_if(.else)` is now actually evaluated for atomic vectors (@mgirlich, #701). - + * `lmap_if()` correctly handles `.else` functions (#847). * `every()` now correctly propagates missing values using the same diff --git a/R/imap.R b/R/imap.R index fa1d9e52..ec1b01ff 100644 --- a/R/imap.R +++ b/R/imap.R @@ -56,6 +56,13 @@ imap_dbl <- function(.x, .f, ...) { map2_dbl(.x, vec_index(.x), .f, ...) } +#' @rdname imap +#' @export +imap_vec <- function(.x, .f, ...) { + .f <- as_mapper(.f, ...) + map2_vec(.x, vec_index(.x), .f, ...) +} + #' @export #' @rdname imap diff --git a/R/list-simplify.R b/R/list-simplify.R index da6fa165..ab5f82d1 100644 --- a/R/list-simplify.R +++ b/R/list-simplify.R @@ -7,9 +7,9 @@ #' [list_flatten()]. #' #' @param x A list. -#' @param strict What should happen if simplification fails? If `TRUE`, -#' it will error. If `FALSE` and `ptype` is not supplied, it will return `x` -#' unchanged. +#' @param strict What should happen if simplification fails? If `TRUE` +#' (the default) it will error. If `FALSE` and `ptype` is not supplied, +#' it will return `x` unchanged. #' @param ptype An optional prototype to ensure that the output type is always #' the same. #' @inheritParams rlang::args_dots_empty diff --git a/R/list-transpose.R b/R/list-transpose.R index 136d60c3..a3dba5b2 100644 --- a/R/list-transpose.R +++ b/R/list-transpose.R @@ -14,9 +14,9 @@ #' @param x A list of vectors to transpose. #' @param template A "template" that describes the output list. Can either be #' a character vector (where elements are extracted by name), or an integer -#' vector (where elements are extracted by position). Defaults to the names -#' of the first element of `x`, or if they're not present, the integer -#' indices. +#' vector (where elements are extracted by position). Defaults to the union +#' of the names of the elements of `x`, or if they're not present, the +#' union of the integer indices. #' @param simplify Should the result be [simplified][list_simplify]? #' * `TRUE`: simplify or die trying. #' * `NA`: simplify if possible. @@ -69,13 +69,25 @@ list_transpose <- function(x, simplify = NA, ptype = NULL, default = NULL) { - vec_check_list(x) + + check_list(x) check_dots_empty() if (length(x) == 0) { template <- integer() - } else { - template <- template %||% vec_index(x[[1]]) + } else if (is.null(template)) { + indexes <- map(x, vec_index) + call <- current_env() + withCallingHandlers( + template <- reduce(indexes, vec_set_union), + vctrs_error_ptype2 = function(e) { + cli::cli_abort( + "Can't combine named and unnamed vectors.", + arg = template, + call = call + ) + } + ) } if (!is.character(template) && !is.numeric(template)) { diff --git a/R/pmap.R b/R/pmap.R index b3282eb7..841818bc 100644 --- a/R/pmap.R +++ b/R/pmap.R @@ -23,7 +23,25 @@ #' * A formula, e.g. `~ ..1 + ..2 / ..3`. This syntax is not recommended as #' you can only refer to arguments by position. #' @inheritParams map -#' @inherit map return +#' @returns +#' The output length is determined by the maximum length of all elements of `.l`. +#' The output names are determined by the names of the first element of `.l`. +#' The output type is determined by the suffix: +#' +#' * No suffix: a list; `.f()` can return anything. +#' +#' * `_lgl()`, `_int()`, `_dbl()`, `_chr()` return a logical, integer, double, +#' or character vector respectively; `.f()` must return a compatible atomic +#' vector of length 1. +#' +#' * `_vec()` return an atomic or S3 vector, the same type that `.f` returns. +#' `.f` can return pretty much any type of vector, as long as it is length 1. +#' +#' * `pwalk()` returns the input `.l` (invisibly). This makes it easy to +#' use in a pipe. The return value of `.f()` is ignored. +#' +#' Any errors thrown by `.f` will be wrapped in an error with class +#' [purrr_error_indexed]. #' @family map variants #' @export #' @examples diff --git a/R/reduce.R b/R/reduce.R index d491ae1c..89c8996b 100644 --- a/R/reduce.R +++ b/R/reduce.R @@ -8,7 +8,22 @@ #' `f` over `1:3` computes the value `f(f(1, 2), 3)`. #' #' @inheritParams map -#' @param .y For `reduce2()` and `accumulate2()`, an additional +#' @param ... Additional arguments passed on to the reduce function. +#' +#' We now generally recommend against using `...` to pass additional +#' (constant) arguments to `.f`. Instead use a shorthand anonymous function: +#' +#' ```R +#' # Instead of +#' x |> reduce(f, 1, 2, collapse = ",") +#' # do: +#' x |> reduce(\(x, y) f(x, y, 1, 2, collapse = ",")) +#' ``` +#' +#' This makes it easier to understand which arguments belong to which +#' function and will tend to yield better error messages. +#' +#' @param .y For `reduce2()` an additional #' argument that is passed to `.f`. If `init` is not set, `.y` #' should be 1 element shorter than `.x`. #' @param .f For `reduce()`, a 2-argument function. The function will be passed diff --git a/R/superseded-map-df.R b/R/superseded-map-df.R index 065aa9c9..8b9fb0d1 100644 --- a/R/superseded-map-df.R +++ b/R/superseded-map-df.R @@ -41,6 +41,23 @@ #' map(\(mod) as.data.frame(t(as.matrix(coef(mod))))) |> #' list_rbind() #' +#' # for certain pathological inputs `map_dfr()` and `map_dfc()` actually +#' # both combine the list by column +#' df <- data.frame( +#' x = c(" 13", " 15 "), +#' y = c(" 34", " 67 ") +#' ) +#' +#' # Was: +#' map_dfr(df, trimws) +#' map_dfc(df, trimws) +#' +#' # But list_rbind()/list_cbind() fail because they require data frame inputs +#' try(map(df, trimws) |> list_rbind()) +#' +#' # Instead, use modify() to apply a function to each column of a data frame +#' modify(df, trimws) +#' #' # map2 --------------------------------------------- #' #' ex_fun <- function(arg1, arg2){ diff --git a/man/imap.Rd b/man/imap.Rd index ae081b01..b453d245 100644 --- a/man/imap.Rd +++ b/man/imap.Rd @@ -6,6 +6,7 @@ \alias{imap_chr} \alias{imap_int} \alias{imap_dbl} +\alias{imap_vec} \alias{iwalk} \title{Apply a function to each element of a vector, and its index} \usage{ @@ -19,6 +20,8 @@ imap_int(.x, .f, ...) imap_dbl(.x, .f, ...) +imap_vec(.x, .f, ...) + iwalk(.x, .f, ...) } \arguments{ diff --git a/man/list_simplify.Rd b/man/list_simplify.Rd index f072b1f6..1b16e3b2 100644 --- a/man/list_simplify.Rd +++ b/man/list_simplify.Rd @@ -11,9 +11,9 @@ list_simplify(x, ..., strict = TRUE, ptype = NULL) \item{...}{These dots are for future extensions and must be empty.} -\item{strict}{What should happen if simplification fails? If \code{TRUE}, -it will error. If \code{FALSE} and \code{ptype} is not supplied, it will return \code{x} -unchanged.} +\item{strict}{What should happen if simplification fails? If \code{TRUE} +(the default) it will error. If \code{FALSE} and \code{ptype} is not supplied, +it will return \code{x} unchanged.} \item{ptype}{An optional prototype to ensure that the output type is always the same.} diff --git a/man/list_transpose.Rd b/man/list_transpose.Rd index 0dc8aabc..1e86d708 100644 --- a/man/list_transpose.Rd +++ b/man/list_transpose.Rd @@ -20,9 +20,9 @@ list_transpose( \item{template}{A "template" that describes the output list. Can either be a character vector (where elements are extracted by name), or an integer -vector (where elements are extracted by position). Defaults to the names -of the first element of \code{x}, or if they're not present, the integer -indices.} +vector (where elements are extracted by position). Defaults to the union +of the names of the elements of \code{x}, or if they're not present, the +union of the integer indices.} \item{simplify}{Should the result be \link[=list_simplify]{simplified}? \itemize{ diff --git a/man/map_dfr.Rd b/man/map_dfr.Rd index 3274f223..f9c9dd43 100644 --- a/man/map_dfr.Rd +++ b/man/map_dfr.Rd @@ -71,6 +71,23 @@ mtcars |> map(\(mod) as.data.frame(t(as.matrix(coef(mod))))) |> list_rbind() +# for certain pathological inputs `map_dfr()` and `map_dfc()` actually +# both combine the list by column +df <- data.frame( + x = c(" 13", " 15 "), + y = c(" 34", " 67 ") +) + +# Was: +map_dfr(df, trimws) +map_dfc(df, trimws) + +# But list_rbind()/list_cbind() fail because they require data frame inputs +try(map(df, trimws) |> list_rbind()) + +# Instead, use modify() to apply a function to each column of a data frame +modify(df, trimws) + # map2 --------------------------------------------- ex_fun <- function(arg1, arg2){ diff --git a/man/pmap.Rd b/man/pmap.Rd index 7b1d872f..65babeef 100644 --- a/man/pmap.Rd +++ b/man/pmap.Rd @@ -67,8 +67,8 @@ of the elements of the result. Otherwise, supply a "prototype" giving the desired type of output.} } \value{ -The output length is determined by the length of the input. -The output names are determined by the input names. +The output length is determined by the maximum length of all elements of \code{.l}. +The output names are determined by the names of the first element of \code{.l}. The output type is determined by the suffix: \itemize{ \item No suffix: a list; \code{.f()} can return anything. @@ -76,8 +76,8 @@ The output type is determined by the suffix: or character vector respectively; \code{.f()} must return a compatible atomic vector of length 1. \item \verb{_vec()} return an atomic or S3 vector, the same type that \code{.f} returns. -\code{.f} can return pretty much any type of vector, as long as its length 1. -\item \code{walk()} returns the input \code{.x} (invisibly). This makes it easy to +\code{.f} can return pretty much any type of vector, as long as it is length 1. +\item \code{pwalk()} returns the input \code{.l} (invisibly). This makes it easy to use in a pipe. The return value of \code{.f()} is ignored. } diff --git a/man/reduce.Rd b/man/reduce.Rd index 69ea6388..470f5f73 100644 --- a/man/reduce.Rd +++ b/man/reduce.Rd @@ -23,15 +23,15 @@ second argument, and the next value of \code{.y} as the third argument. The reduction terminates early if \code{.f} returns a value wrapped in a \code{\link[=done]{done()}}.} -\item{...}{Additional arguments passed on to the mapped function. +\item{...}{Additional arguments passed on to the reduce function. We now generally recommend against using \code{...} to pass additional (constant) arguments to \code{.f}. Instead use a shorthand anonymous function: \if{html}{\out{