Blog Post: Advancing Search Quality and Inference Speed with v2 Series Neural Sparse Models #3169
Conversation
Signed-off-by: zhichao-aws <[email protected]>
The blog should be released after the update to the current documentation on supported sparse models. I'll give an update on that PR here.
Signed-off-by: zhichao-aws <[email protected]>
The PR link for documentation: opensearch-project/documentation-website#7987
@zhichao-aws The doc PR is merged.
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
@zhichao-aws @pajuric Editorial review complete. Please see my comments and changes and let me know if you have any questions. Thanks!
Neural sparse search is a novel and efficient method for semantic retrieval, [introduced in OpenSearch 2.11](https://opensearch.org/blog/improving-document-retrieval-with-sparse-semantic-encoders/). Sparse encoding models encode text into (token, weight) entries, allowing OpenSearch to build indexes and perform searches using Lucene's inverted index. Neural sparse search is efficient and generalizes well in out-of-domain (OOD) scenarios. We are excited to announce the release of our v2 series neural sparse models:
- **v2-distill model**: This model **reduces model parameters by 50%**, resulting in lower memory requirements and costs. It **increases ingestion throughput by 1.39x on GPU and 1.74x on CPU**. The v2-distill architecture supports both doc-only and bi-encoder modes.
- **v2-mini model**: This model **reduces model parameters by 75%**, also reducing memory requirements and costs. It **increases ingestion throughput by 1.74x on GPU and 4.18x on CPU**. The v2-mini architecture supports the doc-only mode.
Here and on the line above, should it be "GPUs" and "CPUs" (plural)?
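As a rough illustration of the (token, weight) representation described above, a sparse-encoded document can be stored in a `rank_features` field, which Lucene serves through its inverted index. The index name, field names, and weight values below are hypothetical:

```json
PUT /my-nlp-index/_doc/1
{
  "passage_text": "Neural sparse search in OpenSearch",
  "passage_embedding": {
    "opensearch": 2.3,
    "neural": 1.6,
    "sparse": 1.9,
    "search": 1.2
  }
}
```

In practice, you would not write these weights by hand: an ingest pipeline with a sparse encoding processor generates them automatically from the document text at ingestion time.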
1. Register and deploy a tokenizer for search:
json?
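The registration call in the step above might look like the following sketch. The model name matches the pretrained neural sparse tokenizer published by OpenSearch, but the version string is an assumption; confirm the exact values in the pretrained models documentation. If your OpenSearch version does not support the `deploy=true` parameter, call `POST /_plugins/_ml/models/<model_id>/_deploy` separately after registration:

```json
POST /_plugins/_ml/models/_register?deploy=true
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1",
  "version": "1.0.1",
  "model_format": "TORCH_SCRIPT"
}
```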
In these experiments, we ingested 1 million documents into an index and used 20 clients to perform concurrent searches. We recorded the p99 for both client-side search and model inference. We tested search performance for the **bi-encoder** mode.
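A search against a sparse index uses the `neural_sparse` query type. This sketch assumes a hypothetical index and field name; `<model_id>` is the deployed sparse encoder in bi-encoder mode, or the deployed tokenizer in doc-only mode:

```json
GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "what is neural sparse search",
        "model_id": "<model_id>"
      }
    }
  }
}
```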
#### Remote deployment using GPU |
Suggested change:
`#### Remote deployment using GPU` → `#### Remote deployment on a GPU`
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
@pajuric @zhichao-aws Editorial comments are implemented and the blog is ready to publish. Thanks!
- technical-posts
has_science_table: true
meta_keywords: OpenSearch semantic search, neural sparse search, semantic sparse retrieval
meta_description: Accelerating inference and improving search with v2 neural sparse encoding models
Please update the blog with the following meta:
meta_keywords: neural sparse models, OpenSearch semantic search, semantic sparse retrieval, neural search
meta_description: OpenSearch announces the availability of v2 series neural sparse models that enhance the efficiency of semantic sparse retrieval while accelerating inference and improving search.
Done
- yych
- dylantong
- kolchfa
date: 2024-08-19 |
Please update the date to 2024-08-21
Done
Signed-off-by: Fanit Kolchina <[email protected]>
@nateynateynate @krisfreedain - Blog is ready to publish today! Let's ship it.
Please set `featured_blog_post: false`.
Setting `featured_blog_post: false` while we continue to promote OpenSearchCon.
Signed-off-by: Kris Freedain <[email protected]>
Description
Create blog: Advancing Search Quality and Inference Speed with v2 Series Neural Sparse Models
Issues Resolved
#3164
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.