---
navigation_title: Jina
applies_to:
  stack: preview 9.3
  serverless: preview
products:
  - id: machine-learning
---

# Jina models [ml-nlp-jina]

This page collects all Jina models you can use as part of the {{stack}}.
Currently, the following models are available as built-in models:

* [`jina-embeddings-v3`](#jina-embeddings-v3)

## `jina-embeddings-v3` [jina-embeddings-v3]

[`jina-embeddings-v3`](https://jina.ai/models/jina-embeddings-v3/) is a multilingual dense vector embedding model that you can use through the [Elastic {{infer-cap}} Service (EIS)](/explore-analyze/elastic-inference/eis.md).
It provides long-context embeddings across a wide range of languages without requiring you to configure, download, or deploy any model artifacts yourself.
Because the model runs on EIS, Elastic's own infrastructure, no ML node scaling or configuration is required to use it.

The `jina-embeddings-v3` model supports input lengths of up to 8192 tokens and produces 1024-dimensional embeddings by default. It uses task-specific adapters to optimize embeddings for different use cases (such as retrieval or classification), and includes support for Matryoshka Representation Learning, which allows you to truncate embeddings to fewer dimensions with minimal loss in quality.
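
If shorter vectors reduce your storage or search costs, Matryoshka truncation lets you keep only the leading dimensions of each embedding. The following sketch assumes the endpoint accepts a `dimensions` service setting, as some {{infer}} services do; that setting is an assumption here, not confirmed for EIS, so check the {{infer}} API reference for the `elastic` service before relying on it.

```console
PUT _inference/text_embedding/eis-jina-embeddings-v3-256
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v3",
    "dimensions": 256 <1>
  }
}
```

1. Hypothetical setting: `dimensions` is supported by several {{infer}} services, but might not be available for the `elastic` service.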

### Dense vector embeddings

Dense vector embeddings are fixed-length numerical representations of text. When you send text to an EIS {{infer}} endpoint that uses `jina-embeddings-v3`, the model returns a vector of floating-point numbers (for example, 1024 values). Texts that are semantically similar have embeddings that are close to each other in this vector space. {{es}} stores these vectors in [`dense_vector`](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md) fields or through the [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) type and uses vector similarity search to retrieve the most relevant documents for a given query. Unlike [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md), which expands text into sparse token-weight vectors, this model produces compact dense vectors that are well suited for multilingual and cross-domain use cases.
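
For example, the following mapping sketch stores embeddings both ways. The index and field names (`my-index`, `content`, `content_vector`) are illustrative, and the `semantic_text` field assumes the `eis-jina-embeddings-v3` endpoint that is created in the getting started section below.

```console
PUT my-index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "semantic_text",
        "inference_id": "eis-jina-embeddings-v3" <1>
      },
      "content_vector": {
        "type": "dense_vector",
        "dims": 1024, <2>
        "similarity": "cosine"
      }
    }
  }
}
```

1. The {{infer}} endpoint that generates embeddings at ingest and query time; its creation is shown below.
2. Matches the model's default output of 1024 dimensions.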

### Requirements [jina-embeddings-v3-req]

To use `jina-embeddings-v3`, you must have the [appropriate subscription](https://www.elastic.co/subscriptions) level or the trial period activated.

### Getting started with `jina-embeddings-v3` via the Elastic {{infer-cap}} Service

Create an {{infer}} endpoint that references the `jina-embeddings-v3` model in the `model_id` field.

```console
PUT _inference/text_embedding/eis-jina-embeddings-v3
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v3"
  }
}
```

The created {{infer}} endpoint uses the model for {{infer}} operations on the Elastic {{infer-cap}} Service. You can reference the `inference_id` of the endpoint in `text_embedding` {{infer}} tasks or search queries.
For example, the following API request produces embeddings for the input text, using the `ingest` input type.

```console
POST _inference/text_embedding/eis-jina-embeddings-v3
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "input_type": "ingest"
}
```
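
At search time, one way to use the endpoint is to query a `semantic_text` field that references it. Continuing the illustrative `my-index` mapping sketched earlier, a `semantic` query might look like this:

```console
GET my-index/_search
{
  "query": {
    "semantic": {
      "field": "content",
      "query": "What color was the sky above the port?"
    }
  }
}
```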

### Performance considerations [jina-embeddings-v3-performance]

* `jina-embeddings-v3` works best on small to medium sized fields that contain natural language.
For connector or web crawler use cases, this aligns best with fields like title, description, summary, or abstract.
Although `jina-embeddings-v3` has a context window of 8192 tokens, it's best to limit the input to 2048-4096 tokens for optimal performance.
For fields that exceed this limit (for example, `body_content` on web crawler documents), consider chunking the content into multiple values, keeping each chunk under 4096 tokens; see the sketch after this list.
* Larger documents take longer at ingestion time.
* The more fields your pipeline has to perform {{infer}} on, the longer each document takes to ingest.
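
The following ingest pipeline is a minimal sketch of the chunking approach mentioned above: it splits a long `body_content` field into roughly 300-word chunks, using word count as a crude proxy for tokens. The pipeline name, target field name, and chunk size are illustrative, so adjust them for your data. Note that `semantic_text` fields chunk long inputs automatically; manual chunking like this is mainly useful when you manage embeddings yourself.

```console
PUT _ingest/pipeline/chunk-body-content
{
  "processors": [
    {
      "script": {
        "description": "Split body_content into ~300-word chunks",
        "source": """
          if (ctx['body_content'] == null) { return; }
          String[] words = ctx['body_content'].splitOnToken(' ');
          List chunks = new ArrayList();
          StringBuilder current = new StringBuilder();
          int count = 0;
          for (String word : words) {
            current.append(word).append(' ');
            if (++count >= 300) {
              chunks.add(current.toString().trim());
              current = new StringBuilder();
              count = 0;
            }
          }
          if (current.length() > 0) {
            chunks.add(current.toString().trim());
          }
          ctx['body_content_chunks'] = chunks;
        """
      }
    }
  ]
}
```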