Add Querying Support to Lucene Byte Sized Vector #956

naveentatikonda
Member

Description

This PR adds querying support for Lucene byte-sized vectors.

Issues Resolved

#812

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@naveentatikonda added the Features (introduces a new unit of functionality that satisfies a requirement) and v2.9.0 labels on Jul 7, 2023
@naveentatikonda self-assigned this on Jul 7, 2023
@@ -252,6 +256,15 @@ protected Query doToQuery(QueryShardContext context) {
);
}

byte[] byteVector = new byte[0];
Member

Why set this to an empty array?

Member Author

We need to initialize it; otherwise the compiler does not let us use it while building the QueryRequest, because the logic that adds values to this array is inside an if statement.
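
For readers following the thread, the behaviour described here is Java's definite assignment rule: a local variable must be assigned on every path before it is read. The following is a minimal sketch of that situation, not the PR's code; the class name, the method name `buildQueryVector`, and the `isByteType` flag are hypothetical.

```java
class DefiniteAssignmentSketch {
    // Hypothetical reduction of the pattern discussed above.
    static byte[] buildQueryVector(float[] vector, boolean isByteType) {
        // Dropping "= new byte[0]" turns the return statement below into a
        // compile error, because the only other assignment sits inside the if.
        byte[] byteVector = new byte[0];
        if (isByteType) {
            byteVector = new byte[vector.length];
            for (int i = 0; i < vector.length; i++) {
                byteVector[i] = (byte) vector[i];
            }
        }
        return byteVector;
    }
}
```

The empty array is only a placeholder so that the variable is assigned on the non-byte path as well.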

byteVector = new byte[vector.length];
for (int i = 0; i < vector.length; i++) {
    validateByteVectorValue(vector[i]);
    byteVector[i] = (byte) vector[i];
Member

Why can we cast like this?

Member Author

The cast converts the value from float to byte. The value stays the same after casting because we have already validated that it is within the byte range and has no fractional part.
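
To illustrate why validating first makes the cast lossless, here is a minimal sketch. The body of `validateByteVectorValue` below is an assumed stand-in written for this example (the PR's actual implementation is not shown in this thread), and the class name and `toByteVector` are hypothetical.

```java
import java.util.Arrays;

class ByteVectorCast {
    // Assumed stand-in for the PR's validateByteVectorValue: rejects values
    // with a fractional part (and NaN/infinity) and values outside [-128, 127].
    static void validateByteVectorValue(float value) {
        if (value % 1 != 0) {
            throw new IllegalArgumentException("value must not have a fractional part: " + value);
        }
        if (value < Byte.MIN_VALUE || value > Byte.MAX_VALUE) {
            throw new IllegalArgumentException("value is outside the byte range: " + value);
        }
    }

    // Converts a validated float vector to bytes; each cast keeps the value.
    static byte[] toByteVector(float[] vector) {
        byte[] byteVector = new byte[vector.length];
        for (int i = 0; i < vector.length; i++) {
            validateByteVectorValue(vector[i]);
            byteVector[i] = (byte) vector[i];
        }
        return byteVector;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toByteVector(new float[] {-128f, 0f, 127f}))); // [-128, 0, 127]
        // Without validation the narrowing cast silently changes the value.
        System.out.println((byte) 300.7f); // 44
    }
}
```

The second line of `main` shows what the validation guards against: an unchecked cast of 300.7 first truncates to 300 and then wraps to 44.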

Comment on lines 99 to 100
);
return new KnnFloatVectorQuery(fieldName, vector, k, filterQuery);
if (VectorDataType.BYTE.equals(vectorDataType)) {
Collaborator

Also, I see duplicated checks. Can we abstract them into a private function that returns the right vector query?
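
One possible shape for such a helper is sketched below. To stay self-contained it uses simplified stand-in types rather than Lucene's `Query`, `KnnByteVectorQuery`, and `KnnFloatVectorQuery`; the class name, the stand-in classes, and `getKnnVectorQuery` are all hypothetical and are not the code that was merged.

```java
class KnnQuerySketch {
    enum VectorDataType { BYTE, FLOAT }

    // Simplified stand-ins for Lucene's query classes; NOT the real types.
    interface Query {}

    static final class ByteVectorQuery implements Query {
        final String field; final byte[] target; final int k; final Query filter;
        ByteVectorQuery(String field, byte[] target, int k, Query filter) {
            this.field = field; this.target = target; this.k = k; this.filter = filter;
        }
    }

    static final class FloatVectorQuery implements Query {
        final String field; final float[] target; final int k; final Query filter;
        FloatVectorQuery(String field, float[] target, int k, Query filter) {
            this.field = field; this.target = target; this.k = k; this.filter = filter;
        }
    }

    // Hypothetical helper (it would be private in the factory class). It is the
    // only place that inspects the data type, so the filtered and unfiltered
    // code paths no longer repeat the same check.
    static Query getKnnVectorQuery(String field, float[] vector, byte[] byteVector,
                                   int k, Query filter, VectorDataType vectorDataType) {
        if (VectorDataType.BYTE.equals(vectorDataType)) {
            return new ByteVectorQuery(field, byteVector, k, filter);
        }
        return new FloatVectorQuery(field, vector, k, filter);
    }

    public static void main(String[] args) {
        Query q = getKnnVectorQuery("my_vector", new float[] {1f, 2f}, new byte[] {1, 2},
                                    10, null, VectorDataType.BYTE);
        System.out.println(q.getClass().getSimpleName()); // ByteVectorQuery
    }
}
```

Callers pass a null filter when no filter query applies, which keeps the type decision in one method.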

Signed-off-by: Naveen Tatikonda <[email protected]>
@navneet1v
Collaborator

@naveentatikonda please check why github checks are failing.

Member

@vamshin left a comment

LGTM! Thanks

@naveentatikonda
Member Author

> @naveentatikonda please check why github checks are failing.

They are still failing because of the changes in core. The latest core changes have not yet been synced to some of the dependencies. The integration tests still fail because they look for the old method signature, whereas the unit tests pass because they have already picked up the synced changes.

@naveentatikonda merged commit 8877c4c into opensearch-project:feature/lucene_byte_vector on Jul 11, 2023
4 of 34 checks passed
naveentatikonda added a commit to naveentatikonda/k-NN that referenced this pull request Jul 12, 2023
)

* Add Querying Support to Lucene Byte Sized Vector

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <[email protected]>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>
naveentatikonda added a commit that referenced this pull request Jul 12, 2023
* Add Indexing Support for Lucene Byte Sized Vector (#937)

* Add Indexing Support for Lucene Byte Sized Vector

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add tests for Indexing

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <[email protected]>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add Querying Support to Lucene Byte Sized Vector (#956)

* Add Querying Support to Lucene Byte Sized Vector

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <[email protected]>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add DocValues Support for Lucene Byte Sized Vector (#953)

Signed-off-by: Naveen Tatikonda <[email protected]>

* Update Release Notes

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>
naveentatikonda added a commit that referenced this pull request Jul 12, 2023
* Add Indexing Support for Lucene Byte Sized Vector (#937)

* Add Indexing Support for Lucene Byte Sized Vector

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add tests for Indexing

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <[email protected]>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add Querying Support to Lucene Byte Sized Vector (#956)

* Add Querying Support to Lucene Byte Sized Vector

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <[email protected]>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add DocValues Support for Lucene Byte Sized Vector (#953)

Signed-off-by: Naveen Tatikonda <[email protected]>

* Update Release Notes

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>
(cherry picked from commit bf04854)
naveentatikonda added a commit that referenced this pull request Jul 12, 2023
* Add Indexing Support for Lucene Byte Sized Vector (#937)

* Add Indexing Support for Lucene Byte Sized Vector

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add tests for Indexing

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <[email protected]>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add Querying Support to Lucene Byte Sized Vector (#956)

* Add Querying Support to Lucene Byte Sized Vector

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <[email protected]>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>

* Add DocValues Support for Lucene Byte Sized Vector (#953)

Signed-off-by: Naveen Tatikonda <[email protected]>

* Update Release Notes

Signed-off-by: Naveen Tatikonda <[email protected]>

---------

Signed-off-by: Naveen Tatikonda <[email protected]>
(cherry picked from commit bf04854)
Signed-off-by: Naveen Tatikonda <[email protected]>