Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Incorrect faiss::IDSelectorBitmap size in faiss_wrapper.cpp #1435

Closed
ErikML opened this issue Jan 31, 2024 · 4 comments
Closed

[BUG] Incorrect faiss::IDSelectorBitmap size in faiss_wrapper.cpp #1435

ErikML opened this issue Jan 31, 2024 · 4 comments
Assignees
Labels
bug Something isn't working v2.12.0

Comments

@ErikML
Copy link

ErikML commented Jan 31, 2024

What is the bug?
The faiss::IDSelectorBitmap is given the incorrect size. It is created as new faiss::IDSelectorBitmap(filterIdsLength, bitmap.data()). It should be created as new faiss::IDSelectorBitmap(bitsetArraySize, bitmap.data()).

In particular, this can cause faiss to incorrectly filter valid candidates when filterIdsLength < bitsetArraySize, as faiss rejects all candidates with ids larger than the IDSelectorBitmap size.

How can one reproduce the bug?
Steps to reproduce the behavior:

Create a filter where the number of elements and maxIdValue satisfies maxIdValue / (8 * sizeof(faiss::idx_t)) < filterIdsLength < maxIdValue / 8. Also make the element with the largest id the element with the highest score

This should trigger the bitmap filter and also make the bitmap size smaller than the number of entries. We don't return the highest scoring element even if faiss considers it because faiss filters it out.

What is the expected behavior?
Faiss should return the highest scoring element if it is considered.

What is your host/environment?
OS 2.11.0 with k-NN plugin 2.11.0.0.

Do you have any screenshots?
If applicable, add screenshots to help explain your problem.
FaissFilterRun

Do you have any additional context?
No.

@ErikML ErikML added bug Something isn't working untriaged labels Jan 31, 2024
@heemin32
Copy link
Collaborator

If merged, this PR will resolve the issue. #1402

@ErikML
Copy link
Author

ErikML commented Jan 31, 2024

It is also possible to over-read the buffer when filterIdsLength is larger than bitsetArraySize. This can allow returning datapoints that do not satisfy the filter.
FilterBufferOverrun

@heemin32
Copy link
Collaborator

@ErikML, Thanks for reporting the bug. This is a great finding.

@heemin32
Copy link
Collaborator

heemin32 commented Feb 1, 2024

Raised a PR to push the fix in 2.12 release.

@heemin32 heemin32 self-assigned this Feb 1, 2024
@heemin32 heemin32 added v2.12.0 and removed untriaged labels Feb 1, 2024
@heemin32 heemin32 closed this as completed Feb 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working v2.12.0
Projects
None yet
Development

No branches or pull requests

2 participants