diff --git a/_posts/2024-07-16-17.0.0-release.md b/_posts/2024-07-16-17.0.0-release.md
index 619c931f21fd..829d1f84b695 100644
--- a/_posts/2024-07-16-17.0.0-release.md
+++ b/_posts/2024-07-16-17.0.0-release.md
@@ -55,6 +55,89 @@ Thanks for your contributions and participation in the project!
 
 ## C++ notes
 
+- Half-float values can now be parsed and formatted correctly (GH-41089).
+- Record batches can now be converted to row-major tensors, not only column-major (GH-40866).
+- The CSV writer can now write large string arrays that are larger than
+  2 GiB (GH-40270).
+- A possible invalid memory access in `BooleanArray::true_count()` has been fixed (GH-41016).
+- A new method `FlattenRecursively` allows recursive flattening of nested list and
+  fixed-size list arrays (GH-41055).
+- The scratch space in some `Scalar` subclasses is now immutable. This is required
+  for proper concurrent access to `Scalar` instances (GH-40069).
+- Calling the `bit_width` or `byte_width` method of an extension type now defers
+  to the underlying storage type (GH-41353).
+- Fixed a bug where `MapArray::FromArrays` would behave incorrectly if the given
+  offsets array has a non-zero offset (GH-40750).
+- `MapArray::FromArrays` now accepts an optional null bitmap argument
+  (GH-41684).
+- The `ARROW_NO_DEPRECATED_API` macro was unused and has been removed (GH-41343).
+- Building with libc++ and C++20 enabled has been fixed (GH-43095).
+- mimalloc is now preferred over jemalloc as the default memory pool (GH-43254).
+
+### Acero
+
+- The left anti join filter no longer crashes when the filter rows are empty (GH-41121).
+- A race condition was fixed in the asof join (GH-41149).
+- A potential stack overflow has been fixed (GH-41334, GH-41738).
+- Potential crashes on very large data have been fixed (GH-41813, GH-43046).
+- Potential data corruption on very large data has been fixed (GH-43202).
+
+### Compute
+
+- List views and maps are now supported by the `if_else`, `case_when` and
+  `coalesce` functions (GH-41418).
+- List views are now supported by the functions `list_slice` (GH-42065),
+  `list_parent_indices` (GH-42235), `take` and `filter` (GH-42116).
+- `list_flatten` can now be recursive based on a new optional argument
+  (GH-41183, GH-41055).
+- The `take` and `filter` functions have been made significantly faster on fixed-width
+  types, including fixed-size lists of fixed-width types (GH-39798).
+
+### Dataset
+
+- Repeated scanning of an encrypted Parquet dataset now works correctly (GH-41431).
+
+### Filesystems
+
+- Standard filesystem implementations are now tracked in a global registry which
+  also allows loading third-party filesystem implementations, for example from
+  runtime-loaded DLLs (GH-40342,
+- Directory metadata operations on Azure filesystems are now more aligned with
+  the common expectations for filesystems (GH-41034).
+- `CopyFile` is now supported for Azure filesystems with hierarchical namespace
+  enabled (GH-41095).
+- Azure credentials can now be loaded explicitly from the environment (GH-39345),
+  or using the Azure CLI (GH-39344).
+- A potential deadlock was fixed when closing an S3 output stream (GH-41862).
+
+### GPU
+
+- Non-CPU data can now be pretty-printed (GH-41664).
+- Non-CPU data with offsets, such as list and binary data, can now be properly
+  sent over IPC (GH-42198).
+
+### IPC
+
+- Flatbuffers serialization is now more deterministic (GH-40361).
+
+### Parquet
+
+- A crash was fixed when reading an invalid Parquet file where columns claim to
+  be of different lengths (GH-41317).
+- Definition and repetition levels are now more strictly checked, avoiding later
+  crashes when reading an invalid Parquet file (GH-41321).
+- A crash was fixed when reading an invalid encrypted Parquet file (GH-43070).
+- Fixed a bug where the BYTE_STREAM_SPLIT decoder could behave incorrectly
+  when nulls are present in a column (GH-41562).
+- Fixed a bug where `DeltaLengthByteArrayEncoder::EstimatedDataEncodedSize` could
+  return an invalid estimate in some situations (GH-41545).
+- Delimiting records is now faster for columns with nested repetition (GH-41361).
+
+### Substrait
+
+- Support for more Arrow data types was added: some temporal types, half floats,
+  large string and large binary (GH-40695).
+
 ## C# notes
 
 ## Go Notes
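
Reviewer note on the BYTE_STREAM_SPLIT entry in the Parquet section of this patch: readers unfamiliar with the encoding may find a small illustration useful. The sketch below is not Arrow's or Parquet C++'s implementation; it is a minimal standalone round trip for `float` values, and the function names `ByteStreamSplitEncode` and `ByteStreamSplitDecode` are hypothetical. It shows only the core layout: byte `k` of every value is gathered into stream `k`, and the streams are concatenated. Null handling, which is what GH-41562 concerns, is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not an Arrow API: scatter byte k of value i
// to position k * n + i, producing sizeof(float) concatenated streams.
std::vector<std::uint8_t> ByteStreamSplitEncode(const std::vector<float>& values) {
  constexpr std::size_t kWidth = sizeof(float);
  const std::size_t n = values.size();
  std::vector<std::uint8_t> out(n * kWidth);
  for (std::size_t i = 0; i < n; ++i) {
    std::uint8_t raw[kWidth];
    std::memcpy(raw, &values[i], kWidth);  // view the value as raw bytes
    for (std::size_t k = 0; k < kWidth; ++k) {
      out[k * n + i] = raw[k];
    }
  }
  return out;
}

// Hypothetical helper, not an Arrow API: gather byte k of value i
// back from position k * n + i, where n is the number of encoded values.
std::vector<float> ByteStreamSplitDecode(const std::vector<std::uint8_t>& encoded) {
  constexpr std::size_t kWidth = sizeof(float);
  const std::size_t n = encoded.size() / kWidth;
  std::vector<float> values(n);
  for (std::size_t i = 0; i < n; ++i) {
    std::uint8_t raw[kWidth];
    for (std::size_t k = 0; k < kWidth; ++k) {
      raw[k] = encoded[k * n + i];
    }
    std::memcpy(&values[i], raw, kWidth);  // reassemble the value
  }
  return values;
}
```

Encoding a vector and decoding the result should return the original values. Note that the stride `n` is the number of values actually present in the encoded buffer, so in this sketch the caller would have to pass only the values that were encoded.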