
Commit 9a59b34

szabosteve, demjened, and florent-leborgne authored
[E&A] Adds docs for jina-embeddings-v3 (#4056)
## Summary

Related to elastic/docs-content-internal#528. This PR:

* adds a new section under Built-in NLP Models called Jina
* adds docs about the `jina-embeddings-v3` model to the new page
* adds an `applies_to` tag to the model documentation section (tech preview in serverless, tech preview in 9.3; it will be rendered as "Planned" on the doc page without explicit reference to a certain version number)

## TO DO

- [x] Performance considerations

## Generative AI disclosure

1. Did you use a generative AI (GenAI) tool to assist in creating this contribution?
   - [ ] Yes
   - [x] No

## Notes

Benchmarking info will be added in a separate PR.

---------

Co-authored-by: Adam Demjen <[email protected]>
Co-authored-by: florent-leborgne <[email protected]>
1 parent 8c078f8 commit 9a59b34

File tree

3 files changed: +67 −0 lines


explore-analyze/machine-learning/nlp/ml-nlp-built-in-models.md

Lines changed: 1 addition & 0 deletions

```diff
@@ -13,6 +13,7 @@ products:
 There are {{nlp}} models that are available for use in every cluster out-of-the-box. These models are pre-trained, which means they don't require fine-tuning on your own data, making them adaptable for various use cases out of the box. The following models are available:

 * [ELSER](ml-nlp-elser.md) trained by Elastic
+* [Jina models](ml-nlp-jina.md)
 * [](ml-nlp-rerank.md)
 * [E5](ml-nlp-e5.md)
 * [{{lang-ident-cap}}](ml-nlp-lang-ident.md)
```
explore-analyze/machine-learning/nlp/ml-nlp-jina.md

Lines changed: 65 additions & 0 deletions

---
navigation_title: Jina
applies_to:
  stack: preview 9.3
  serverless: preview
products:
  - id: machine-learning
---

# Jina models [ml-nlp-jina]
This page collects all Jina models you can use as part of the {{stack}}. Currently, the following models are available as built-in models:

* [`jina-embeddings-v3`](#jina-embeddings-v3)

## `jina-embeddings-v3` [jina-embeddings-v3]

[`jina-embeddings-v3`](https://jina.ai/models/jina-embeddings-v3/) is a multilingual dense vector embedding model that you can use through the [Elastic {{infer-cap}} Service (EIS)](/explore-analyze/elastic-inference/eis.md). It provides long-context embeddings across a wide range of languages without requiring you to configure, download, or deploy any model artifacts yourself. Because the model runs on EIS, Elastic's own infrastructure, no ML node scaling or configuration is required to use it.

The `jina-embeddings-v3` model supports input lengths of up to 8192 tokens and produces 1024-dimension embeddings by default. It uses task-specific adapters to optimize embeddings for different use cases (such as retrieval or classification), and includes support for Matryoshka Representation Learning, which allows you to truncate embeddings to fewer dimensions with minimal loss in quality.
### Dense vector embeddings

Dense vector embeddings are fixed-length numerical representations of text. When you send text to an EIS {{infer}} endpoint that uses `jina-embeddings-v3`, the model returns a vector of floating-point numbers (for example, 1024 values). Texts that are semantically similar have embeddings that are close to each other in this vector space. {{es}} stores these vectors in [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) fields or through the [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) type and uses vector similarity search to retrieve the most relevant documents for a given query. Unlike [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md), which expands text into sparse token-weight vectors, this model produces compact dense vectors that are well suited for multilingual and cross-domain use cases.
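
For example, a `semantic_text` field can delegate embedding generation to an {{infer}} endpoint that uses this model. The index and field names below are illustrative, and the example assumes an endpoint named `eis-jina-embeddings-v3` already exists:

```console
PUT jina-demo-index
{
  "mappings": {
    "properties": {
      "summary": {
        "type": "semantic_text",
        "inference_id": "eis-jina-embeddings-v3"
      }
    }
  }
}
```

With this mapping, text indexed into `summary` is embedded automatically at ingest time, and the resulting vectors are stored and searched for you.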
### Requirements [jina-embeddings-v3-req]

To use `jina-embeddings-v3`, you must have the [appropriate subscription](https://www.elastic.co/subscriptions) level or the trial period activated.
### Getting started with `jina-embeddings-v3` via the Elastic {{infer-cap}} Service

Create an {{infer}} endpoint that references the `jina-embeddings-v3` model in the `model_id` field:

```console
PUT _inference/text_embedding/eis-jina-embeddings-v3
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v3"
  }
}
```
The created {{infer}} endpoint uses the model for {{infer}} operations on the Elastic {{infer-cap}} Service. You can reference the `inference_id` of the endpoint in `text_embedding` {{infer}} tasks or search queries. For example, the following API request ingests the input text and produces embeddings:

```console
POST _inference/text_embedding/eis-jina-embeddings-v3
{
  "input": "The sky above the port was the color of television, tuned to a dead channel.",
  "input_type": "ingest"
}
```
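
To search with the endpoint, one option is a `semantic` query against a `semantic_text` field that references it. The index and field names here are hypothetical and assume a mapping like the one described in the dense vector section:

```console
GET jina-demo-index/_search
{
  "query": {
    "semantic": {
      "field": "summary",
      "query": "neon city skyline at night"
    }
  }
}
```

The query text is embedded with the same model at search time, so query and document vectors live in the same vector space.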
### Performance considerations [jina-embeddings-v3-performance]

* `jina-embeddings-v3` works best on small-, medium-, or large-sized fields that contain natural language. For connector or web crawler use cases, this aligns best with fields like title, description, summary, or abstract.
* Although `jina-embeddings-v3` has a context window of 8192 tokens, it's best to limit the input to 2048–4096 tokens for optimal performance. For larger fields that exceed this limit (for example, `body_content` on web crawler documents), consider chunking the content into multiple values, where each chunk is under 4096 tokens.
* Larger documents take longer to ingest, and the more fields in a document that require {{infer}}, the longer each document takes to process.
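
As an illustration of the chunking advice above, one simple approach is to index long content as an array of shorter values so that each value stays under the token guidance. The index name, field name, and text are hypothetical:

```console
PUT jina-demo-index/_doc/1
{
  "body_content": [
    "First section of the crawled page, kept under the recommended token budget.",
    "Second section of the page, indexed as its own value so it is embedded separately."
  ]
}
```

How you split the text (by section, paragraph, or token count) depends on your content; the goal is simply to keep each value within the 2048–4096 token range.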

explore-analyze/toc.yml

Lines changed: 1 addition & 0 deletions

```diff
@@ -123,6 +123,7 @@ toc:
 - file: machine-learning/nlp/ml-nlp-built-in-models.md
   children:
   - file: machine-learning/nlp/ml-nlp-elser.md
+  - file: machine-learning/nlp/ml-nlp-jina.md
   - file: machine-learning/nlp/ml-nlp-rerank.md
   - file: machine-learning/nlp/ml-nlp-e5.md
   - file: machine-learning/nlp/ml-nlp-lang-ident.md
```
