Commit c33c0a9

docs: Add text embeddings benchmarks (#289)
## Description

Changes:

1. Add text embedding models benchmarks
2. Add new text embedding models to docs
3. Add text embedding models descriptions

### Type of change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [x] Documentation update (improves or adds clarity to existing documentation)

### Checklist

- [x] I have performed a self-review of my code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have updated the documentation accordingly
- [x] My changes generate no new warnings

1 parent fe30da9 commit c33c0a9

5 files changed (+44 -21 lines changed)

.cspell-wordlist.txt

+2 -0
@@ -47,3 +47,5 @@ sublabel
 Aeonik
 Lexend
 finetuned
+MINILM
+MPNET

docs/docs/benchmarks/inference-time.md

+6 -3
@@ -102,6 +102,9 @@ Average time for decoding one token in sequence of 100 tokens, with encoding con
 
 ## Text Embeddings
 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] |
-| ---------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: |
-| ALL_MINILM_L6_V2 | 105 | 126 | 151 | 165 | 152 |
+| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] |
+| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: |
+| ALL_MINILM_L6_V2 | 53 | 69 | 78 | 60 | 65 |
+| ALL_MPNET_BASE_V2 | 352 | 423 | 478 | 521 | 527 |
+| MULTI_QA_MINILM_L6_COS_V1 | 135 | 166 | 180 | 158 | 165 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 503 | 598 | 680 | 694 | 743 |

docs/docs/benchmarks/memory-usage.md

+6 -3
@@ -57,6 +57,9 @@ sidebar_position: 2
 
 ## Text Embeddings
 
-| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
-| ---------------- | :--------------------: | :----------------: |
-| ALL_MINILM_L6_V2 | 140 | 64 |
+| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
+| -------------------------- | :--------------------: | :----------------: |
+| ALL_MINILM_L6_V2 | 150 | 190 |
+| ALL_MPNET_BASE_V2 | 520 | 470 |
+| MULTI_QA_MINILM_L6_COS_V1 | 160 | 225 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 540 | 500 |

docs/docs/benchmarks/model-size.md

+6 -3
@@ -66,6 +66,9 @@ sidebar_position: 1
 
 ## Text Embeddings
 
-| Model | XNNPACK [MB] |
-| ---------------- | :----------: |
-| ALL_MINILM_L6_V2 | 91 |
+| Model | XNNPACK [MB] |
+| -------------------------- | :----------: |
+| ALL_MINILM_L6_V2 | 91 |
+| ALL_MPNET_BASE_V2 | 438 |
+| MULTI_QA_MINILM_L6_COS_V1 | 91 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 438 |

docs/docs/natural-language-processing/useTextEmbeddings.md

+24 -12
@@ -110,9 +110,12 @@ function App() {
 
 ## Supported models
 
-| Model | Language | Max Tokens | Embedding Dimensions |
-| --------------------------------------------------------------------------------- | :------: | :--------: | :------------------: |
-| [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) | English | 256 | 384 |
+| Model | Language | Max Tokens | Embedding Dimensions | Description |
+| ----------------------------------------------------------------------------------------------------- | :------: | :--------: | :------------------: | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) | English | 256 | 384 | All-round model tuned for many use-cases. Trained on a large and diverse dataset of over 1 billion training pairs. |
+| [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) | English | 384 | 768 | All-round model tuned for many use-cases. Trained on a large and diverse dataset of over 1 billion training pairs. |
+| [multi-qa-MiniLM-L6-cos-v1](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1) | English | 511 | 384 | This model was tuned for semantic search: Given a query/question, it can find relevant passages. It was trained on a large and diverse set of (question, answer) pairs. |
+| [multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1) | English | 512 | 768 | This model was tuned for semantic search: Given a query/question, it can find relevant passages. It was trained on a large and diverse set of (question, answer) pairs. |
 
 **`Max Tokens`** - the maximum number of tokens that can be processed by the model. If the input text exceeds this limit, it will be truncated.
 
@@ -122,22 +125,31 @@ function App() {
 
 ### Model size
 
-| Model | XNNPACK [MB] |
-| ---------------- | :----------: |
-| ALL_MINILM_L6_V2 | 91 |
+| Model | XNNPACK [MB] |
+| -------------------------- | :----------: |
+| ALL_MINILM_L6_V2 | 91 |
+| ALL_MPNET_BASE_V2 | 438 |
+| MULTI_QA_MINILM_L6_COS_V1 | 91 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 438 |
 
 ### Memory usage
 
-| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
-| ---------------- | :--------------------: | :----------------: |
-| ALL_MINILM_L6_V2 | 140 | 64 |
+| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
+| -------------------------- | :--------------------: | :----------------: |
+| ALL_MINILM_L6_V2 | 150 | 190 |
+| ALL_MPNET_BASE_V2 | 520 | 470 |
+| MULTI_QA_MINILM_L6_COS_V1 | 160 | 225 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 540 | 500 |
 
 ### Inference time
 
 :::warning warning
 Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization.
 :::
 
-| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] |
-| ---------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: |
-| ALL_MINILM_L6_V2 | 105 | 126 | 151 | 165 | 152 |
+| Model | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] |
+| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: |
+| ALL_MINILM_L6_V2 | 53 | 69 | 78 | 60 | 65 |
+| ALL_MPNET_BASE_V2 | 352 | 423 | 478 | 521 | 527 |
+| MULTI_QA_MINILM_L6_COS_V1 | 135 | 166 | 180 | 158 | 165 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 503 | 598 | 680 | 694 | 743 |
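
Reviewer note: the warning in `useTextEmbeddings.md` says the inference times come from consecutive runs and that the initial run may be up to 2x longer. One way to reproduce numbers of that kind is to discard warm-up runs before averaging. The sketch below is only an illustration of that idea, not the harness used for this commit: `averageLatencyMs`, its parameters, and the injectable `now` clock are hypothetical names, and `run` stands for whatever callback invokes the embedding model.

```typescript
// Hypothetical helper (not part of the library): averages the latency of
// consecutive runs after discarding warm-up runs.
type Clock = () => number;

async function averageLatencyMs(
  run: () => Promise<unknown>,          // assumed: callback that runs one inference
  warmupRuns = 1,                       // runs whose timings are thrown away
  measuredRuns = 5,                     // runs that contribute to the average
  now: Clock = () => performance.now(), // injectable clock, useful for testing
): Promise<number> {
  if (measuredRuns < 1) {
    throw new Error("measuredRuns must be at least 1");
  }

  // Warm-up runs absorb model loading and initialization costs.
  for (let i = 0; i < warmupRuns; i++) {
    await run();
  }

  // Time each remaining run individually and accumulate the total.
  let totalMs = 0;
  for (let i = 0; i < measuredRuns; i++) {
    const start = now();
    await run();
    totalMs += now() - start;
  }

  return totalMs / measuredRuns;
}
```

Setting `warmupRuns` to 0 would fold the slower initial run into the average, which is the case the warning cautions about.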
