Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix lucene codec after lucene version bumped to 9.12 #2195

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

navneet1v
Copy link
Collaborator

@navneet1v navneet1v commented Oct 7, 2024

Description

Fix lucene codec after lucene version bumped to 9.12 Currently the version bump has happened only for main branch hence we are not doing any backport here.

This change includes:

  1. Bumping up the Lucene Codec to 912.
  2. New KNN9120Codec class added.
  3. Changes in the NativeEngineFieldVectorsWriter to pass the FlatFieldVectorsWriter since in Lucene 912, the capability of FlatFieldVectorsWriter adding the vectorValue to a passed VectorFieldWriter, the VectorFieldWriters who are using the FlatFieldVectorsWriter have to call the addValue function of FlatFieldVectorsWriter. Ref: Reduce heap usage for knn index writers apache/lucene#13538
  4. Set Compress flag for Lucene SQ bits > 4 to false. Ref: https://github.com/apache/lucene/blob/branch_9_12/lucene/core/src/java/org/apache/lucene/codecs/lucene99/Lucene99ScalarQuantizedVectorsFormat.java#L113-L116.

Related Issues

Resolves #2193

Check List

  • New functionality has been documented.
  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@navneet1v navneet1v force-pushed the lucene_codec branch 2 times, most recently from bcdaeec to 614163a Compare October 7, 2024 22:02
@navneet1v navneet1v added v2.18.0 Maintenance Add support for new versions of OpenSearch/Dashboards from upstream and removed skip-changelog v2.18.0 labels Oct 7, 2024
@navneet1v
Copy link
Collaborator Author

navneet1v commented Oct 7, 2024

Seems like the min distribution is still not updated with the lucene version. Hence the builds are failing. Will check on Jenkins.

Ref: https://build.ci.opensearch.org/view/all/job/distribution-build-opensearch/

@navneet1v
Copy link
Collaborator Author

Conversation happening with the build team on this thread: https://opensearch.slack.com/archives/C04UTNM338A/p1728341414662579

@navneet1v
Copy link
Collaborator Author

navneet1v commented Oct 8, 2024

On further checking and talking to build team we found out that lucene upgrade was merged in opensearch in last 12hrs, which updated the maven repo(which happens with every merge in main branch) but the min distribution runs once in 24hrs hence min distribution is not updated.

Thanks @gaiksaya for helping here. A new build is triggered ref: https://build.ci.opensearch.org/blue/organizations/jenkins/publish-opensearch-min-snapshots/detail/publish-opensearch-min-snapshots/1818/pipeline/ Once it is completed I will re-run the GH actions.

@navneet1v
Copy link
Collaborator Author

navneet1v commented Oct 8, 2024

On doing deep-dive on the failed ITs I found out that lucene has changed the way they were using the FlatFieldVectorWriter. This will require more changes in the code to ensure that tests are passing since the changes are in the indexing path. I am working on the fix for this. Will try to raise a PR by today.

Ref: apache/lucene#13538

Earlier the Lucene99FlatVectorsWriter.FieldWriter we calling the KNNFieldWriter as a delegate. Now it is not calling anymore, hence we need to call Lucene99FlatVectorsWriter.FieldWriter.addValue from out NativeEngineFieldsVectorWriter.addValue

ryanbogan
ryanbogan previously approved these changes Oct 8, 2024
ryanbogan
ryanbogan previously approved these changes Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Maintenance Add support for new versions of OpenSearch/Dashboards from upstream
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix k-NN build due to lucene upgrade
3 participants