From b02c0259f042615b2b40e87b37e59c6caa881041 Mon Sep 17 00:00:00 2001
From: Sutou Kouhei
Date: Thu, 18 Jul 2024 17:42:19 +0900
Subject: [PATCH] Add R notes

Co-authored-by: Bryce Mecum
---
 _posts/2024-07-16-17.0.0-release.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/_posts/2024-07-16-17.0.0-release.md b/_posts/2024-07-16-17.0.0-release.md
index a935a9d1d5cb..8d536255acb6 100644
--- a/_posts/2024-07-16-17.0.0-release.md
+++ b/_posts/2024-07-16-17.0.0-release.md
@@ -157,6 +157,11 @@ Thanks for your contributions and participation in the project!
 
 ## R notes
 
+* User-written R functions that call functions Arrow supports in dataset queries can now be used in queries too. Previously, only functions that used arithmetic operators worked. For example, `time_hours <- function(mins) mins / 60` worked, but `time_hours_rounded <- function(mins) round(mins / 60)` did not; now both work. These are automatic translations rather than true user-defined functions (UDFs); for UDFs, see `register_scalar_function()`. [GH-41223](https://github.com/apache/arrow/issues/41223)
+* `summarize()` supports more complex expressions, and correctly handles cases where column names are reused in expressions. [GH-41323](https://github.com/apache/arrow/issues/41323)
+* The `na_matches` argument to the `dplyr::*_join()` functions is now supported. This argument controls whether `NA` values are considered equal when joining. [GH-41358](https://github.com/apache/arrow/issues/41358)
+* R metadata, stored in the Arrow schema to support round-tripping data between R and Arrow/Parquet, is now serialized and deserialized more strictly. This makes it safer to load data from files from unknown sources into R data.frames. [GH-41969](https://github.com/apache/arrow/issues/41969)
+
 For more on what’s in the 17.0.0 R package, see the [R changelog][4].
 
 ## Ruby and C GLib notes