Blog Post: Advancing Search Quality and Inference Speed with v2 Series Neural Sparse Models #3169

Merged
merged 10 commits into from
Aug 21, 2024

zhichao-aws
Member

Description

Create blog: Advancing Search Quality and Inference Speed with v2 Series Neural Sparse Models

Issues Resolved

#3164

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Signed-off-by: zhichao-aws <[email protected]>
@zhichao-aws
Member Author

The blog should be released after the documentation on supported sparse models is updated. I'll post updates on that PR here.

@zhichao-aws
Member Author

The PR link for documentation: opensearch-project/documentation-website#7987

@kolchfa-aws
Collaborator

@zhichao-aws The doc PR is merged.

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Collaborator

@natebower natebower left a comment

@zhichao-aws @pajuric Editorial review complete. Please see my comments and changes and let me know if you have any questions. Thanks!

Neural sparse search is a novel and efficient method for semantic retrieval, [introduced in OpenSearch 2.11](https://opensearch.org/blog/improving-document-retrieval-with-sparse-semantic-encoders/). Sparse encoding models encode text into (token, weight) entries, allowing OpenSearch to build indexes and perform searches using Lucene's inverted index. Neural sparse search is efficient and generalizes well in out-of-domain (OOD) scenarios. We are excited to announce the release of our v2 series neural sparse models:

- **v2-distill model**: This model **reduces model parameters by 50%**, resulting in lower memory requirements and costs. It **increases ingestion throughput by 1.39x on GPU and 1.74x on CPU**. The v2-distill architecture supports both doc-only and bi-encoder modes.
- **v2-mini model**: This model **reduces model parameters by 75%**, also reducing memory requirements and costs. It **increases ingestion throughput by 1.74x on GPU and 4.18x on CPU**. The v2-mini architecture supports the doc-only mode.
Collaborator

Here and on the line above, should it be "GPUs" and "CPUs" (plural)?
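
For readers of this thread, the (token, weight) encoding described in the quoted introduction can be illustrated with a minimal sketch. It is not the OpenSearch or Lucene implementation: the function names `build_inverted_index` and `search`, the toy documents, and every weight below are made up for illustration. The sketch assumes that a document's score is the sum, over tokens shared with the query, of the product of the query weight and the document weight.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each token to a list of (doc_id, weight) postings.

    `docs` maps a document ID to its sparse encoding, a dict of
    token -> weight. All names and values here are hypothetical.
    """
    index = defaultdict(list)
    for doc_id, entries in docs.items():
        for token, weight in entries.items():
            index[token].append((doc_id, weight))
    return index


def search(index, query):
    """Score documents by the dot product of query and document weights.

    Only postings for tokens that appear in the query are visited, which
    is what makes an inverted index a natural fit for sparse encodings.
    """
    scores = defaultdict(float)
    for token, query_weight in query.items():
        for doc_id, doc_weight in index.get(token, []):
            scores[doc_id] += query_weight * doc_weight
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy sparse encodings (illustrative weights, not real model output).
docs = {
    "d1": {"neural": 1.5, "sparse": 2.0, "search": 1.0},
    "d2": {"dense": 2.0, "vector": 1.5, "search": 0.5},
}
index = build_inverted_index(docs)
query = {"sparse": 1.0, "search": 1.0}
results = search(index, query)
print(results)
```

In this toy setup, `d1` ranks above `d2` because it shares both query tokens, while `d2` shares only `search`.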


1. Register and deploy a tokenizer for search:

```
Collaborator

json?


In these experiments, we ingested 1 million documents into an index and used 20 clients to perform concurrent searches. We recorded the p99 for both client-side search and model inference. We tested search performance for the **bi-encoder** mode.

#### Remote deployment using GPU
Collaborator

Suggested change:
- #### Remote deployment using GPU
+ #### Remote deployment on a GPU

Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
@kolchfa-aws
Collaborator

@pajuric @zhichao-aws Editorial comments are implemented and the blog is ready to publish. Thanks!

- technical-posts
has_science_table: true
meta_keywords: OpenSearch semantic search, neural sparse search, semantic sparse retrieval
meta_description: Accelerating inference and improving search with v2 neural sparse encoding models

Please update the blog with the following meta:

meta_keywords: neural sparse models, OpenSearch semantic search, semantic sparse retrieval, neural search

meta_description: OpenSearch announces the availability of v2 series neural sparse models that enhance the efficiency of semantic sparse retrieval while accelerating inference and improving search.

Collaborator

Done

- yych
- dylantong
- kolchfa
date: 2024-08-19

Please update the date to 2024-08-21

Collaborator

Done

@pajuric

pajuric commented Aug 21, 2024

@nateynateynate @krisfreedain - Blog is ready to publish today! Let's ship it.

Member

@krisfreedain krisfreedain left a comment

Please set featured_blog_post: false.

setting "featured_blog_post: false" while we continue to promote OpenSearchCon

Signed-off-by: Kris Freedain <[email protected]>
@krisfreedain krisfreedain merged commit 15d37dc into opensearch-project:main Aug 21, 2024
5 checks passed
@krisfreedain krisfreedain mentioned this pull request Aug 21, 2024
1 task
5 participants