Fix IndexIVFFastScan reconstruct_from_offset method #4095
Thanks for the contribution! Can you add unit tests, in Python should be sufficient, for the two cases you referenced?
I added unit tests and fixed another bug. That bug involves incorrect usage of `invlists` vs `orig_invlists` when getting the list size. The original code was using `orig_invlists->list_size()`, which is incorrect since we want to read from the source `invlists` when getting codes and sizes. The fix changes it to use `invlists->list_size()`, to be consistent with where we're reading the codes and IDs from.
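To illustrate why the size and the codes should come from the same container, here is a toy Python sketch. It does not use Faiss: `ToyInvLists`, `read_list_buggy` and `read_list_fixed` are hypothetical names, and the assumption that the two containers can report different sizes is made purely for illustration.

```python
class ToyInvLists:
    """Hypothetical stand-in for an inverted-list container (not the Faiss class)."""

    def __init__(self, lists):
        self.lists = lists  # maps list_no -> list of codes

    def list_size(self, list_no):
        return len(self.lists[list_no])

    def get_codes(self, list_no):
        return self.lists[list_no]


def read_list_buggy(invlists, orig_invlists, list_no):
    # Size comes from one container, codes from the other.
    n = orig_invlists.list_size(list_no)
    codes = invlists.get_codes(list_no)
    return [codes[i] for i in range(n)]


def read_list_fixed(invlists, list_no):
    # Size and codes come from the same container.
    n = invlists.list_size(list_no)
    codes = invlists.get_codes(list_no)
    return [codes[i] for i in range(n)]
```

If the two containers ever disagree on a list's size, the buggy variant reads past the end of the codes it actually has, while the fixed variant cannot.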
LGTM
@mdouze has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Resolves issue #4089 - IndexIVFPQFastScan crashes with certain nlist values
The `reconstruct_from_offset` method in `IndexIVFFastScan` was incorrectly reconstructing vectors, causing crashes when the `nlist` parameter was not byte-aligned (e.g. 100 instead of 256).

The root cause was that the `list_no` (Voronoi cell number) was not being properly encoded into the `code` vector before passing it to the `sa_decode` function. This resulted in invalid `list_no` values being read in `sa_decode`, triggering the assertion failure `'list_no >= 0 && list_no < nlist'` in some cases, depending on `nlist`.

This PR fixes the issue with the following changes to `reconstruct_from_offset`:

- Encode `list_no` into the beginning of the `code` vector using the existing `encode_listno` method
- […] `BitstringWriter` after the coarse code portion of `code` (shifted by `coarse_code_size()` bytes)
- […] `sa_decode`
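The role of the `list_no` prefix can be sketched in standalone Python. This does not call Faiss. The names `coarse_code_size` and `encode_listno` are borrowed from the description above, but their bodies here (a little-endian prefix just wide enough for `nlist - 1`) are an assumption about the layout, and `decode_listno`, `build_code` and `checked_decode` are hypothetical helpers.

```python
def coarse_code_size(nlist: int) -> int:
    """Bytes needed to store any list number in [0, nlist) (assumed layout)."""
    nl, nbyte = nlist - 1, 0
    while nl > 0:
        nbyte += 1
        nl >>= 8
    return nbyte


def encode_listno(list_no: int, nlist: int) -> bytes:
    """Little-endian list number, padded to coarse_code_size(nlist) bytes."""
    return list_no.to_bytes(coarse_code_size(nlist), "little")


def decode_listno(code: bytes, nlist: int) -> int:
    """Read the list number back from the start of the code."""
    return int.from_bytes(code[:coarse_code_size(nlist)], "little")


def build_code(list_no: int, fine_code: bytes, nlist: int) -> bytes:
    """Coarse prefix first, then the fine code shifted by coarse_code_size bytes."""
    return encode_listno(list_no, nlist) + fine_code


def checked_decode(code: bytes, nlist: int):
    """Mimic a decoder that validates list_no before using it."""
    list_no = decode_listno(code, nlist)
    if not (0 <= list_no < nlist):
        raise ValueError("list_no >= 0 && list_no < nlist")
    return list_no, code[coarse_code_size(nlist):]
```

Under this layout, `nlist = 100` and `nlist = 256` both use a one-byte prefix. With 256 lists every possible byte is a valid `list_no`, so a prefix that was never written still passes the range check. With 100 lists, any stray byte of 100 or more fails it, which is consistent with the crash only showing up for some `nlist` values.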
After these changes:

- […] `nlist` value
- […] `IndexIVFPQ`
Fixes #4089
Please review and let me know if any changes are needed. Thanks!