Upgrade faiss to 12b92e9 #1509

Merged on Mar 6, 2024 (5 commits)

jmazanec15 (Member) commented Mar 5, 2024

Description

Upgrades faiss from 32f0e8c to 12b92e9.

As part of this, we are removing the jni/patches/faiss/0002-Custom-patch-to-support-sqfp16-neon.patch patch because its changes are already present in faiss.

In the future, we will need to remove this line in order to enable NEON support: https://github.com/opensearch-project/k-NN/blob/main/scripts/build.sh#L100. However, doing so requires gcc 9 in the distribution build, so we won't remove the line until the distribution build uses gcc 9.

This change also does not update the faiss version in Faiss.java (see #1515 for more details).

Issues Resolved

#1508

Testing

Manually confirmed that 0002-Custom-patch-to-support-AVX2-Linux-CI.patch still applies cleanly:

git apply --ignore-space-change --ignore-whitespace --3way ../../patches/faiss/0002-Custom-patch-to-support-AVX2-Linux-CI.patch
Applied patch to 'faiss/impl/code_distance/code_distance-avx2.h' cleanly.
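
The manual check above could be scripted along the following lines. This is only a minimal sketch, not part of the k-NN build scripts: the function name `check_patches` and its two arguments are hypothetical, and it swaps the `--3way` application shown above for `git apply --check`, which performs a dry run and leaves the working tree untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not k-NN's build logic): report whether
# each *.patch file in a directory would apply cleanly to a checkout.
# Usage: check_patches <repo_dir> <patch_dir>
check_patches() {
    local repo_dir="$1" patch_dir="$2" patch failed=0
    # Resolve to an absolute path, because git -C changes directory first.
    patch_dir="$(cd "$patch_dir" && pwd)" || return 2
    for patch in "$patch_dir"/*.patch; do
        [ -e "$patch" ] || continue   # no patches found: the glob stayed literal
        # --check is a dry run: it only tests whether the patch would apply.
        if git -C "$repo_dir" apply --check \
               --ignore-space-change --ignore-whitespace "$patch" 2>/dev/null; then
            echo "OK   $(basename "$patch")"
        else
            echo "FAIL $(basename "$patch")"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling `check_patches path/to/faiss path/to/patches` (both paths are placeholders) prints one `OK` or `FAIL` line per patch and returns a non-zero status if any patch fails to apply, so it could gate a CI step before the real `git apply` runs.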

Check List

  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

jmazanec15 added the skip-changelog, Maintenance, and v2.13.0 labels on Mar 5, 2024

codecov bot commented Mar 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.11%. Comparing base (231ad93) to head (7718804).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1509      +/-   ##
============================================
+ Coverage     85.09%   85.11%   +0.01%     
- Complexity     1280     1281       +1     
============================================
  Files           168      168              
  Lines          5232     5232              
  Branches        495      495              
============================================
+ Hits           4452     4453       +1     
  Misses          572      572              
+ Partials        208      207       -1     


jmazanec15 force-pushed the issue-1508 branch 20 times, most recently from fa6738f to dd9edb3, on March 6, 2024 19:21
Changes submodule commit to 12b92e9.

Signed-off-by: John Mazanec <[email protected]>

ryanbogan (Member) left a comment

LGTM!

jmazanec15 merged commit 1303182 into opensearch-project:main on Mar 6, 2024
51 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 6, 2024
Upgrades faiss to facebookresearch/faiss@12b92e9. Cleanup outdated patches.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 1303182)
junqiu-lei pushed a commit to junqiu-lei/k-NN that referenced this pull request Mar 7, 2024
Upgrades faiss to facebookresearch/faiss@12b92e9. Cleanup outdated patches.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 1303182)
junqiu-lei added a commit that referenced this pull request Mar 7, 2024
* Manually install zlib for win CI (#1513)

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 231ad93)

* Upgrade faiss to 12b92e9 (#1509)

Upgrades faiss to facebookresearch/faiss@12b92e9. Cleanup outdated patches.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit 1303182)

* Disable sdc table for HNSWPQ read-only indices (#1518)

Passes a flag to disable the sdc table for HNSWPQ indices. This table is
only used by HNSWPQ during graph creation to compare nodes already
present in the graph. When we load an index, the graph is read-only.
Hence, we won't be doing any ingestion, so the table can be disabled
to save some memory.

Along with this, added a unit test and a couple test helper methods for
generating random data.

Signed-off-by: John Mazanec <[email protected]>
(cherry picked from commit c9262f5)

---------

Co-authored-by: John Mazanec <[email protected]>
junqiu-lei added a commit that referenced this pull request Mar 12, 2024
* Optimize Faiss Query With Filters: Reduce iteration and memory for id filter (#1402)

* Optimize Faiss Query With Filters. Reduce iteration copy for docid set iterator

Signed-off-by: luyuncheng <[email protected]>

* Optimize Faiss Query With Filters. Reduce iteration copy for docid set iterator.
Use bitmap and batch to do the id filter, and use a sparse or fixed bitset to do exact ANN search

Signed-off-by: luyuncheng <[email protected]>

* Using int64_t instead of long type for GetLongArrayElements

Signed-off-by: luyuncheng <[email protected]>

* Add IDSelectorJlongBitmap

Signed-off-by: luyuncheng <[email protected]>

* 1. Add IDSelectorJlongBitmap and UT for it
2. Move FilterIdsSelectorType to a util class

Signed-off-by: luyuncheng <[email protected]>

* 1. Add IDSelectorJlongBitmap and UT for it
2. Move FilterIdsSelectorType to a util class
3. Spotless apply

Signed-off-by: luyuncheng <[email protected]>

* Rebase remote-tracking branch 'origin/main' into Filter

Signed-off-by: luyuncheng <[email protected]>

* tidy

Signed-off-by: luyuncheng <[email protected]>

* Add Changelog

Signed-off-by: luyuncheng <[email protected]>

* fix javadoc tasks

Signed-off-by: luyuncheng <[email protected]>

* fix bwc javadoc

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector

Signed-off-by: luyuncheng <[email protected]>

* Rebase faiss_wrapper.cpp

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector For description Select different FilterIdsSelectorType

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector For description Select different FilterIdsSelectorType

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector as Byte.SIZE

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector For comments

Signed-off-by: luyuncheng <[email protected]>

---------

Signed-off-by: luyuncheng <[email protected]>

* Increment 2.12.0-SNAPSHOT to 2.13.0-SNAPSHOT in BWC workflow (#1505)

Signed-off-by: Varun Jain <[email protected]>

* Manually install zlib for win CI (#1513)

Signed-off-by: John Mazanec <[email protected]>

* Upgrade faiss to 12b92e9 (#1509)

Upgrades faiss to facebookresearch/faiss@12b92e9. Cleanup outdated patches.

Signed-off-by: John Mazanec <[email protected]>

* Disable sdc table for HNSWPQ read-only indices (#1518)

Passes a flag to disable the sdc table for HNSWPQ indices. This table is
only used by HNSWPQ during graph creation to compare nodes already
present in the graph. When we load an index, the graph is read-only.
Hence, we won't be doing any ingestion, so the table can be disabled
to save some memory.

Along with this, added a unit test and a couple test helper methods for
generating random data.

Signed-off-by: John Mazanec <[email protected]>

* Support distance type radius search for Lucene engine

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve comments

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve comments

Signed-off-by: Junqiu Lei <[email protected]>

* Add RNNQueryFactory class

Signed-off-by: Junqiu Lei <[email protected]>

* Add javadoc

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

---------

Signed-off-by: luyuncheng <[email protected]>
Signed-off-by: Varun Jain <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: Junqiu Lei <[email protected]>
Co-authored-by: luyuncheng <[email protected]>
Co-authored-by: Varun Jain <[email protected]>
Co-authored-by: John Mazanec <[email protected]>
junqiu-lei added a commit to junqiu-lei/k-NN that referenced this pull request Mar 15, 2024
…ject#1498)

* Optimize Faiss Query With Filters: Reduce iteration and memory for id filter (opensearch-project#1402)

* Optimize Faiss Query With Filters. Reduce iteration copy for docid set iterator

Signed-off-by: luyuncheng <[email protected]>

* Optimize Faiss Query With Filters. Reduce iteration copy for docid set iterator.
Use bitmap and batch to do the id filter, and use a sparse or fixed bitset to do exact ANN search

Signed-off-by: luyuncheng <[email protected]>

* Using int64_t instead of long type for GetLongArrayElements

Signed-off-by: luyuncheng <[email protected]>

* Add IDSelectorJlongBitmap

Signed-off-by: luyuncheng <[email protected]>

* 1. Add IDSelectorJlongBitmap and UT for it
2. Move FilterIdsSelectorType to a util class

Signed-off-by: luyuncheng <[email protected]>

* 1. Add IDSelectorJlongBitmap and UT for it
2. Move FilterIdsSelectorType to a util class
3. Spotless apply

Signed-off-by: luyuncheng <[email protected]>

* Rebase remote-tracking branch 'origin/main' into Filter

Signed-off-by: luyuncheng <[email protected]>

* tidy

Signed-off-by: luyuncheng <[email protected]>

* Add Changelog

Signed-off-by: luyuncheng <[email protected]>

* fix javadoc tasks

Signed-off-by: luyuncheng <[email protected]>

* fix bwc javadoc

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector

Signed-off-by: luyuncheng <[email protected]>

* Rebase faiss_wrapper.cpp

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector For description Select different FilterIdsSelectorType

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector For description Select different FilterIdsSelectorType

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector as Byte.SIZE

Signed-off-by: luyuncheng <[email protected]>

* UpdatedFilterIdsSelector For comments

Signed-off-by: luyuncheng <[email protected]>

---------

Signed-off-by: luyuncheng <[email protected]>

* Increment 2.12.0-SNAPSHOT to 2.13.0-SNAPSHOT in BWC workflow (opensearch-project#1505)

Signed-off-by: Varun Jain <[email protected]>

* Manually install zlib for win CI (opensearch-project#1513)

Signed-off-by: John Mazanec <[email protected]>

* Upgrade faiss to 12b92e9 (opensearch-project#1509)

Upgrades faiss to facebookresearch/faiss@12b92e9. Cleanup outdated patches.

Signed-off-by: John Mazanec <[email protected]>

* Disable sdc table for HNSWPQ read-only indices (opensearch-project#1518)

Passes a flag to disable the sdc table for HNSWPQ indices. This table is
only used by HNSWPQ during graph creation to compare nodes already
present in the graph. When we load an index, the graph is read-only.
Hence, we won't be doing any ingestion, so the table can be disabled
to save some memory.

Along with this, added a unit test and a couple test helper methods for
generating random data.

Signed-off-by: John Mazanec <[email protected]>

* Support distance type radius search for Lucene engine

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve comments

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve comments

Signed-off-by: Junqiu Lei <[email protected]>

* Add RNNQueryFactory class

Signed-off-by: Junqiu Lei <[email protected]>

* Add javadoc

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

* Resolve feedback

Signed-off-by: Junqiu Lei <[email protected]>

---------

Signed-off-by: luyuncheng <[email protected]>
Signed-off-by: Varun Jain <[email protected]>
Signed-off-by: John Mazanec <[email protected]>
Signed-off-by: Junqiu Lei <[email protected]>
Co-authored-by: luyuncheng <[email protected]>
Co-authored-by: Varun Jain <[email protected]>
Co-authored-by: John Mazanec <[email protected]>
junqiu-lei added a commit to junqiu-lei/k-NN that referenced this pull request Mar 19, 2024
…ject#1498)
Labels
backport 2.x, Maintenance

3 participants