v0.8.13 manifest bug fix and vector indexing perf improvements

@chebbyChefNEQ released this 07 Nov 20:32

Critical fix: tables written prior to v0.8.0 may have corrupted stats

If a table was written with a Lance version prior to v0.8.0 and later written by a version >=0.8.0 and <=0.8.13, it may have incorrect statistics. You can detect whether this affects your table using the LanceDataset.validate() method. If it does, Lance versions prior to v0.8.13 may not be able to read the table correctly. If you do not plan to use older versions of Lance going forward, no action is needed. To fix reads on older Lance versions, commit any write transaction to the table with Lance v0.8.13 or newer. A simple way to make a transaction without changing the data is:

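import lance

# Open the affected table (replace '...' with its URI).
dataset = lance.dataset('...')
# An Append with no fragments adds no data but still creates a new version.
operation = lance.LanceOperation.Append([])
# Commit against the version just read, using Lance v0.8.13 or newer.
dataset = lance.LanceDataset.commit(
    dataset.uri,
    operation,
    read_version=dataset.version,
)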

(This makes an empty Append commit)
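
The detection and repair steps described above can be combined into one helper. The following is a minimal sketch, not part of Lance: the function name `repair_stats_if_needed` and its `lance_module` parameter are hypothetical (the parameter exists only so the logic can be exercised without a real table), and it assumes that `validate()` signals an affected table by raising an exception. It should be run with Lance v0.8.13 or newer installed.

```python
def repair_stats_if_needed(uri, lance_module=None):
    """Validate the table at `uri`; if validation fails, make an empty Append commit.

    Returns True if a commit was made, False if validation passed.
    Hypothetical helper: the name and the `lance_module` parameter are
    illustrative and not part of the Lance API.
    """
    if lance_module is None:
        import lance as lance_module  # expected to be v0.8.13 or newer

    dataset = lance_module.dataset(uri)
    try:
        # Assumption: validate() raises when the table is affected.
        dataset.validate()
        return False
    except Exception as err:
        # Broad catch for illustration; unrelated errors also land here.
        print(f"validate() reported a problem: {err}")

    # Same empty Append commit as in the snippet from the release notes.
    operation = lance_module.LanceOperation.Append([])
    lance_module.LanceDataset.commit(
        dataset.uri,
        operation,
        read_version=dataset.version,
    )
    return True
```

Call it with the table's URI. Because the broad `except` also catches unrelated failures, inspect the printed message before relying on the result.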

New features

Bug fixes

  • fix: add versioning and bypass broken row counts by @wjones127 in #1534
  • fix: fix assertion of cosine values by @eddyxu in #1530
  • fix: pq index does not handle dot product metric correctly during search by @rok in #1536

Performance improvements

  • perf: improve f16 performance for norm L2 on aarch64 by @eddyxu in #1539

Other changes

  • chore: move scalar_index benchmark to break circular dependency by @westonpace in #1540

New Contributors

Full Changelog: v0.8.12...v0.8.13