Update _posts/2024-04-20-16.0.0-release.md

Co-authored-by: Alenka Frim <[email protected]>
apache · Apr 26, 2024 · 8177fc6
1 parent fa9e452
commit 8177fc6
Showing 1 changed file with 41 additions and 0 deletions.
diff --git a/_posts/2024-04-20-16.0.0-release.md b/_posts/2024-04-20-16.0.0-release.md
@@ -131,6 +131,47 @@ Thanks for your contributions and participation in the project!
 
 ## Python notes
 
+Compatibility notes:
+* The umbrella issue for PyArrow compatibility with NumPy 2.0 ([GH-39532](https://github.com/apache/arrow/issues/39532)) has been closed; the last remaining issues were resolved in the Arrow 16.0.0 release ([GH-41098](https://github.com/apache/arrow/issues/41098), [GH-39848](https://github.com/apache/arrow/issues/39848) and [GH-40376](https://github.com/apache/arrow/issues/40376)).
+* PyArrow no longer uses pandas internals to create Block objects and uses the new pandas API instead when running with pandas version 3 [GH-35081](https://github.com/apache/arrow/issues/35081)
+* Pandas compatibility code has been simplified now that old pandas and Python versions are no longer supported [GH-40720](https://github.com/apache/arrow/issues/40720)
+* Deprecated `pyarrow.filesystem` legacy implementations have been removed [GH-20127](https://github.com/apache/arrow/issues/20127)
+
+New features:
+* Converting Arrow `Table` and `RecordBatch` to a `Tensor` (not the same as the [tensor extension array](https://arrow.apache.org/docs/dev/format/CanonicalExtensions.html#official-list)) is being developed in Arrow C++ with bindings in Python (umbrella issue: [GH-40058](https://github.com/apache/arrow/issues/40058)). The current release adds the option to convert a `RecordBatch` to a `Tensor` with `pyarrow.RecordBatch.to_tensor(...)`, which returns a row-major or column-major tensor and can optionally write missing values as `NaN` in the result.
+* `ListView` and `LargeListView` array formats are now supported by PyArrow ([GH-39812](https://github.com/apache/arrow/issues/39812), [GH-39855](https://github.com/apache/arrow/issues/39855), [GH-40205](https://github.com/apache/arrow/issues/40205), [GH-41039](https://github.com/apache/arrow/issues/41039), [GH-40266](https://github.com/apache/arrow/issues/40266))
+* `BinaryView` and `StringView` are now supported in PyArrow ([GH-39651](https://github.com/apache/arrow/issues/39651), [GH-39852](https://github.com/apache/arrow/issues/39852), [GH-40092](https://github.com/apache/arrow/issues/40092))
+* Support for Run-End Encoded arrays in PyArrow has been completed (conversion to NumPy and pandas [GH-40659](https://github.com/apache/arrow/issues/40659), construction in `pa.array(...)` [GH-40273](https://github.com/apache/arrow/issues/40273))
+* `AsofJoinNode` C++ functionality is now exposed in Python as `join_asof` [GH-34235](https://github.com/apache/arrow/issues/34235)
+* Minimal Python bindings are added for `AzureFileSystem` [GH-39968](https://github.com/apache/arrow/issues/39968)
+* `FixedShapeTensorScalar` class is added [GH-37484](https://github.com/apache/arrow/issues/37484)
+
+Other improvements:
+* Add ChunkedArray import/export to/from C [GH-39984](https://github.com/apache/arrow/issues/39984)
+* `pyarrow.Field` and `pyarrow.ChunkedArray` can now be constructed from objects supporting the PyCapsule Arrow C Data Interface [GH-38010](https://github.com/apache/arrow/issues/38010)
+* `requested_schema` is now supported in `__arrow_c_stream__` implementations [GH-40066](https://github.com/apache/arrow/issues/40066)
+* Add low-level bindings for exporting/importing the C Device Interface
+ [GH-39979](https://github.com/apache/arrow/issues/39979)
+* A function to download and extract the timezone database on Windows machines is added [GH-37328](https://github.com/apache/arrow/issues/37328)
+* Missing methods are added to `pyarrow.RecordBatch` [GH-30915](https://github.com/apache/arrow/issues/30915)
+* A dictionary is now also accepted by the `pyarrow.record_batch` factory function (as in `pyarrow.table`) [GH-40291](https://github.com/apache/arrow/issues/40291)
+* Usage of scalar legacy cast has been removed [GH-40023](https://github.com/apache/arrow/issues/40023)
+* The missing `byte_width` attribute is added to all `DataType` classes [GH-39277](https://github.com/apache/arrow/issues/39277)
+* `FileInfo` instances can now be used to construct Dataset objects [GH-40142](https://github.com/apache/arrow/issues/40142)
+* Support hashing for `FileMetaData` and `ParquetSchema` [GH-39780](https://github.com/apache/arrow/issues/39780)
+* `force_virtual_addressing` is exposed in PyArrow [GH-39779](https://github.com/apache/arrow/issues/39779)
+
+Relevant bug fixes:
+* Calling `pyarrow.dataset.ParquetFileFormat.make_write_options` as a class method now returns a warning [GH-39440](https://github.com/apache/arrow/issues/39440)
+* `ScalarMemoTable` is now initialized only when deduplication is enabled, which fixes large memory consumption when it is disabled [GH-40316](https://github.com/apache/arrow/issues/40316)
+* Slicing an array backwards beyond the start now includes the first item ([GH-38768](https://github.com/apache/arrow/issues/38768) and [GH-40642](https://github.com/apache/arrow/issues/40642))
+* A memory leak when creating an Arrow array from a Python list of dicts is fixed [GH-37989](https://github.com/apache/arrow/issues/37989)
+* `FixedSizeListType` was not considered a nested type and is now added to `_NESTED_TYPES` [GH-40171](https://github.com/apache/arrow/issues/40171)
+* `max_chunksize` is now validated in `Table.to_batches` [GH-39788](https://github.com/apache/arrow/issues/39788)
+* Raising `ValueError` on `_ensure_partitioning` in Dataset is fixed [GH-39579](https://github.com/apache/arrow/issues/39579)
+
+* Python stacktrace is now attached to errors in `ConvertPyError` [GH-37164](https://github.com/apache/arrow/issues/37164)
+
 ## R notes
 
 ### New features: