Release candidate for 1.4.1

@nmcdonnell-kx nmcdonnell-kx released this 21 May 17:34
33f5c95

Note: the 1.4.1-rc.1 arrowkdb package was built against Apache Arrow version 9.0.0. If you have a different version of the libarrow runtime installed, it may be necessary to build arrowkdb from source in order to support that version (instructions to build arrowkdb from source are in the README.md).

Arrow only supports a single string array containing up to 2GB of data. If the kdb string/symbol list contains more data than this, it has to be populated into an Arrow chunked array. Chunked arrays were already supported by arrowkdb when writing Arrow IPC files or streams, but not when writing Parquet files.

Therefore, in order to support the use of chunked arrays when writing Parquet files, the ARROW_CHUNK_ROWS option has been added to:

  • pq.writeParquet
  • pq.writeParquetFromTable

Note: ARROW_CHUNK_ROWS only controls how kdb lists are chunked internally before being passed to the Parquet file writer. This is separate from the row-group configuration (set using PARQUET_CHUNK_SIZE), which controls how the Parquet file itself is structured when written.
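As a sketch of how the new option might be used, the snippet below writes a kdb table to Parquet with ARROW_CHUNK_ROWS set, using the options-dictionary shape documented in the arrowkdb README. The file name, table contents, and chunk size are illustrative only:

```q
// Load arrowkdb (path is an assumption; adjust to your install)
\l q/arrowkdb.q

// Example table; column names and row count are illustrative
table:([] col1:10000?100i; col2:10000?1f)

// Chunk the kdb lists into Arrow chunked arrays of 5000 rows each
// before they are passed to the Parquet file writer
options:(``ARROW_CHUNK_ROWS)!((::);5000)

.arrowkdb.pq.writeParquetFromTable["file.parquet"; table; options]
```

ARROW_CHUNK_ROWS and PARQUET_CHUNK_SIZE can be supplied in the same options dictionary, since they control independent stages of the write (internal chunking versus the row-group layout of the output file).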