# Semantic search with `semantic_text`
This tutorial shows you how to use the semantic text feature to perform semantic search on your data.
Semantic text simplifies the {{infer}} workflow by providing {{infer}} at ingestion time and sensible default values automatically. You don’t need to define model-related settings and parameters, or create {{infer}} ingest pipelines.
The recommended way to use semantic search in the {{stack}} is to follow the `semantic_text` workflow. When you need more control over indexing and query settings, you can still use the complete {{infer}} workflow (refer to this tutorial to review the process).
This tutorial uses the `elasticsearch` service for demonstration, which is created automatically as needed, but you can use any service and its supported models offered by the {{infer-cap}} API. To use the `semantic_text` field type with an {{infer}} service other than the `elasticsearch` service, you must create an inference endpoint using the Create {{infer}} API.
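For example, an {{infer}} endpoint for a third-party service can be created like this (a sketch; the endpoint name, service, and model shown here are placeholders, so check the Create {{infer}} API documentation for the settings your service requires):

```console
PUT _inference/text_embedding/my-text-embedding-endpoint
{
  "service": "cohere",
  "service_settings": {
    "api_key": "<api_key>",
    "model_id": "embed-english-v3.0"
  }
}
```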
Before generating embeddings, you must create the mapping of the destination index: the index that will contain the embeddings that the {{infer}} endpoint generates from your input text. The destination index must have a field with the `semantic_text` field type to index the output of the {{infer}} endpoint.
```console
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": { <1>
        "type": "semantic_text" <2>
      }
    }
  }
}
```

1. The name of the field to contain the generated embeddings.
2. The field to contain the embeddings is a `semantic_text` field. Since no `inference_id` is provided, the default endpoint `.elser-2-elasticsearch` for the `elasticsearch` service is used. To use a different {{infer}} service, you must create an {{infer}} endpoint first using the Create {{infer}} API and then specify it in the `semantic_text` field mapping using the `inference_id` parameter.
::::{note}
If you’re using web crawlers or connectors to generate indices, you have to update the index mappings for these indices to include the `semantic_text` field. Once the mapping is updated, you’ll need to run a full web crawl or a full connector sync. This ensures that all existing documents are reprocessed and updated with the new semantic embeddings, enabling semantic search on the updated data.
::::
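If you have created a dedicated {{infer}} endpoint rather than relying on the default, you can reference it in the mapping through the `inference_id` parameter. A sketch, assuming a pre-existing endpoint named `my-elser-endpoint`:

```console
PUT semantic-embeddings
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}
```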
In this step, you load the data that you later use to create embeddings from.
Use the msmarco-passagetest2019-top1000 data set, which is a subset of the MS MARCO Passage Ranking data set. It consists of 200 queries, each accompanied by a list of relevant text passages. All unique passages, along with their IDs, have been extracted from that data set and compiled into a TSV file.
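The file layout described above, one passage per line with an ID column followed by the passage text, can be parsed with the standard library if you want to inspect it locally before uploading. A minimal sketch (the sample row is illustrative, not taken from the real file):

```python
import csv
import io

# Illustrative two-column TSV row in the same shape as the
# msmarco-passagetest2019-top1000 file: <id>\t<passage text>
sample = "0\tExample passage text about running.\n"

# Parse with the stdlib csv module using a tab delimiter
rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))

for passage_id, content in rows:
    print(passage_id, content)
```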
Download the file and upload it to your cluster using the Data Visualizer in the {{ml-app}} UI. After your data is analyzed, click **Override settings**. Under **Edit field names**, assign `id` to the first column and `content` to the second. Click **Apply**, then **Import**. Name the index `test-data`, and click **Import**. After the upload is complete, you will see an index named `test-data` with 182,469 documents.
Create the embeddings from the text by reindexing the data from the `test-data` index to the `semantic-embeddings` index. The data in the `content` field will be reindexed into the `content` semantic text field of the destination index. The reindexed data will be processed by the {{infer}} endpoint associated with the `content` semantic text field.
::::{note}
This step uses the reindex API to simulate data ingestion. If you are working with data that has already been indexed, rather than using the `test-data` set, reindexing is required to ensure that the data is processed by the {{infer}} endpoint and the necessary embeddings are generated.
::::
```console
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "test-data",
    "size": 10 <1>
  },
  "dest": {
    "index": "semantic-embeddings"
  }
}
```

1. The default batch size for reindexing is 1000. Reducing `size` to a smaller number makes the updates of the reindexing process quicker, which enables you to follow the progress closely and detect errors early.
The call returns a task ID to monitor the progress:

```console
GET _tasks/<task_id>
```

Reindexing large datasets can take a long time. You can test this workflow using only a subset of the dataset. Do this by cancelling the reindexing process, and only generating embeddings for the subset that was reindexed. The following API request will cancel the reindexing task:
```console
POST _tasks/<task_id>/_cancel
```

After the data has been indexed with the embeddings, you can query the data using semantic search. Choose between Query DSL or {{esql}} syntax to execute the query.
::::{tab-set}
:group: query-type

:::{tab-item} Query DSL
:sync: dsl
The Query DSL approach uses the `semantic` query type with the `semantic_text` field:
```console
GET semantic-embeddings/_search
{
  "query": {
    "semantic": {
      "field": "content", <1>
      "query": "What causes muscle soreness after running?" <2>
    }
  }
}
```

1. The `semantic_text` field on which you want to perform the search.
2. The query text.

:::
:::{tab-item} ES|QL
:sync: esql
The ES|QL approach uses the match (`:`) operator, which automatically detects the `semantic_text` field and performs a semantic search on it. The query uses `METADATA _score` to sort the results by `_score` in descending order.
```console
POST /_query?format=txt
{
  "query": """
    FROM semantic-embeddings METADATA _score <1>
    | WHERE content: "How to avoid muscle soreness while running?" <2>
    | SORT _score DESC <3>
    | LIMIT 1000 <4>
  """
}
```

1. The `METADATA _score` clause is used to return the score of each document.
2. The match (`:`) operator is used on the `content` field; because `content` is a `semantic_text` field, the query performs a semantic search.
3. Sorts by descending score to display the most relevant results first.
4. Limits the results to 1000 documents.

:::
::::
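Beyond the tutorial queries above, the `semantic` query can be combined with other Query DSL clauses. A sketch of wrapping it in a `bool` query, so that standard filters can be added alongside the semantic clause:

```console
GET semantic-embeddings/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "semantic": {
            "field": "content",
            "query": "What causes muscle soreness after running?"
          }
        }
      ]
    }
  }
}
```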
- If you want to use `semantic_text` in hybrid search, refer to this notebook for a step-by-step guide.
- For more information on how to optimize your ELSER endpoints, refer to the ELSER recommendations section in the model documentation.
- To learn more about model autoscaling, refer to the trained model autoscaling page.