
Commit e7d352a

Author: Mateusz Kopciński
Merge branch 'v0.4.0-rc1' into @md/s2t_streaming
2 parents (135eb72 + a200f34)

File tree

16 files changed (+99, -62 lines)


.cspell-wordlist.txt

Lines changed: 2 additions & 0 deletions

@@ -47,3 +47,5 @@ sublabel
 Aeonik
 Lexend
 finetuned
+MINILM
+MPNET

docs/docs/benchmarks/inference-time.md

Lines changed: 6 additions & 3 deletions

@@ -102,6 +102,9 @@ Average time for decoding one token in sequence of 100 tokens, with encoding con

 ## Text Embeddings

-| Model            | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] |
-| ---------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: |
-| ALL_MINILM_L6_V2 | 105 | 126 | 151 | 165 | 152 |
+| Model                      | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
+| ALL_MINILM_L6_V2           | 53  | 69  | 78  | 60  | 65  |
+| ALL_MPNET_BASE_V2          | 352 | 423 | 478 | 521 | 527 |
+| MULTI_QA_MINILM_L6_COS_V1  | 135 | 166 | 180 | 158 | 165 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 503 | 598 | 680 | 694 | 743 |

docs/docs/benchmarks/memory-usage.md

Lines changed: 6 additions & 3 deletions

@@ -57,6 +57,9 @@ sidebar_position: 2

 ## Text Embeddings

-| Model            | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
-| ---------------- | :--------------------: | :----------------: |
-| ALL_MINILM_L6_V2 | 140 | 64 |
+| Model                      | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
+| -------------------------- | :--------------------: | :----------------: |
+| ALL_MINILM_L6_V2           | 150 | 190 |
+| ALL_MPNET_BASE_V2          | 520 | 470 |
+| MULTI_QA_MINILM_L6_COS_V1  | 160 | 225 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 540 | 500 |

docs/docs/benchmarks/model-size.md

Lines changed: 6 additions & 3 deletions

@@ -66,6 +66,9 @@ sidebar_position: 1

 ## Text Embeddings

-| Model            | XNNPACK [MB] |
-| ---------------- | :----------: |
-| ALL_MINILM_L6_V2 | 91 |
+| Model                      | XNNPACK [MB] |
+| -------------------------- | :----------: |
+| ALL_MINILM_L6_V2           | 91  |
+| ALL_MPNET_BASE_V2          | 438 |
+| MULTI_QA_MINILM_L6_COS_V1  | 91  |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 438 |

docs/docs/natural-language-processing/useLLM.md

Lines changed: 13 additions & 15 deletions

@@ -75,7 +75,6 @@ interface LLMType {
   chatConfig?: Partial<ChatConfig>;
   toolsConfig?: ToolsConfig;
 }) => void;
-forward: (input: string) => Promise<void>;
 generate: (messages: Message[], tools?: LLMTool[]) => Promise<void>;
 sendMessage: (message: string) => Promise<void>;
 deleteMessage: (index: number) => void;

@@ -137,20 +136,19 @@ Given computational constraints, our architecture is designed to support only on

 ### Returns

-| Field              | Type                                                                        | Description |
-| ------------------ | --------------------------------------------------------------------------- | ----------- |
-| `messageHistory`   | `Message[]`                                                                 | History containing all messages in conversation. This field is updated after model responds to `sendMessage`. |
-| `response`         | `string`                                                                    | State of the generated response. This field is updated with each token generated by the model. |
-| `isReady`          | `boolean`                                                                   | Indicates whether the model is ready. |
-| `isGenerating`     | `boolean`                                                                   | Indicates whether the model is currently generating a response. |
-| `downloadProgress` | `number`                                                                    | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. |
-| `error`            | <code>string &#124; null</code>                                             | Contains the error message if the model failed to load. |
-| `configure`        | `({ chatConfig?: Partial<ChatConfig>, toolsConfig?: ToolsConfig }) => void` | Configures chat and tool calling. See more details in [configuring the model](#configuring-the-model). |
-| `sendMessage`      | `(message: string, tools?: LLMTool[]) => Promise<void>`                     | Method to add user message to conversation. After model responds, `messageHistory` will be updated with both user message and model response. |
-| `deleteMessage`    | `(index: number) => void`                                                   | Deletes all messages starting with message on `index` position. After deletion `messageHistory` will be updated. |
-| `generate`         | `(messages: Message[], tools?: LLMTool[]) => Promise<void>`                 | Runs model to complete chat passed in `messages` argument. It doesn't manage conversation context. |
-| `forward`          | `(input: string) => Promise<void>`                                          | Runs model inference with raw input string. You need to provide entire conversation and prompt (in correct format and with special tokens!) in input string to this method. It doesn't manage conversation context. It is intended for users that need access to the model itself without any wrapper. If you want simple chat with model consider using `sendMessage`. |
-| `interrupt`        | `() => void`                                                                | Function to interrupt the current inference. |
+| Field              | Type                                                                        | Description |
+| ------------------ | --------------------------------------------------------------------------- | ----------- |
+| `messageHistory`   | `Message[]`                                                                 | History containing all messages in the conversation. This field is updated after the model responds to `sendMessage`. |
+| `response`         | `string`                                                                    | State of the generated response. This field is updated with each token generated by the model. |
+| `isReady`          | `boolean`                                                                   | Indicates whether the model is ready. |
+| `isGenerating`     | `boolean`                                                                   | Indicates whether the model is currently generating a response. |
+| `downloadProgress` | `number`                                                                    | Represents the download progress as a value between 0 and 1, indicating the extent of the model file retrieval. |
+| `error`            | <code>string &#124; null</code>                                             | Contains the error message if the model failed to load. |
+| `configure`        | `({ chatConfig?: Partial<ChatConfig>, toolsConfig?: ToolsConfig }) => void` | Configures chat and tool calling. See more details in [configuring the model](#configuring-the-model). |
+| `sendMessage`      | `(message: string, tools?: LLMTool[]) => Promise<void>`                     | Adds a user message to the conversation. After the model responds, `messageHistory` will be updated with both the user message and the model response. |
+| `deleteMessage`    | `(index: number) => void`                                                   | Deletes all messages starting from the message at position `index`. After deletion, `messageHistory` will be updated. |
+| `generate`         | `(messages: Message[], tools?: LLMTool[]) => Promise<void>`                 | Runs the model to complete the chat passed in the `messages` argument. It doesn't manage conversation context. |
+| `interrupt`        | `() => void`                                                                | Function to interrupt the current inference. |

 ## Configuring the model
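With `forward` removed from the hook's interface, `generate(messages, tools?)` becomes the low-level entry point, and callers now pass a structured `Message[]` instead of a raw prompt string. As a rough illustration of assembling that argument, here is a small pure helper; the exact `Message` shape is not shown in this diff, so the `{ role, content }` form below is an assumption:

```typescript
// Hypothetical sketch: the `Message` shape is assumed, not taken from this diff.
type Message = { role: 'system' | 'user' | 'assistant'; content: string };

// Builds the `messages` argument for `generate` from a system prompt,
// prior history, and a new user turn, without mutating the inputs.
function buildChat(
  systemPrompt: string,
  history: Message[],
  userInput: string
): Message[] {
  return [
    { role: 'system', content: systemPrompt },
    ...history,
    { role: 'user', content: userInput },
  ];
}
```

Unlike `sendMessage`, a call such as `llm.generate(buildChat(...))` leaves conversation bookkeeping entirely to the caller, which matches the "doesn't manage conversation context" note in the returns table.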

docs/docs/natural-language-processing/useTextEmbeddings.md

Lines changed: 24 additions & 12 deletions

@@ -110,9 +110,12 @@ function App() {

 ## Supported models

-| Model                                                                             | Language | Max Tokens | Embedding Dimensions |
-| --------------------------------------------------------------------------------- | :------: | :--------: | :------------------: |
-| [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) | English  | 256        | 384                  |
+| Model                                                                                                 | Language | Max Tokens | Embedding Dimensions | Description |
+| ----------------------------------------------------------------------------------------------------- | :------: | :--------: | :------------------: | ----------- |
+| [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)                     | English  | 256        | 384                  | All-round model tuned for many use-cases. Trained on a large and diverse dataset of over 1 billion training pairs. |
+| [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)                   | English  | 384        | 768                  | All-round model tuned for many use-cases. Trained on a large and diverse dataset of over 1 billion training pairs. |
+| [multi-qa-MiniLM-L6-cos-v1](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1)   | English  | 511        | 384                  | Tuned for semantic search: given a query/question, it can find relevant passages. Trained on a large and diverse set of (question, answer) pairs. |
+| [multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1) | English  | 512        | 768                  | Tuned for semantic search: given a query/question, it can find relevant passages. Trained on a large and diverse set of (question, answer) pairs. |

 **`Max Tokens`** - the maximum number of tokens that can be processed by the model. If the input text exceeds this limit, it will be truncated.

@@ -122,22 +125,31 @@ function App() {

 ### Model size

-| Model            | XNNPACK [MB] |
-| ---------------- | :----------: |
-| ALL_MINILM_L6_V2 | 91 |
+| Model                      | XNNPACK [MB] |
+| -------------------------- | :----------: |
+| ALL_MINILM_L6_V2           | 91  |
+| ALL_MPNET_BASE_V2          | 438 |
+| MULTI_QA_MINILM_L6_COS_V1  | 91  |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 438 |

 ### Memory usage

-| Model            | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
-| ---------------- | :--------------------: | :----------------: |
-| ALL_MINILM_L6_V2 | 140 | 64 |
+| Model                      | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
+| -------------------------- | :--------------------: | :----------------: |
+| ALL_MINILM_L6_V2           | 150 | 190 |
+| ALL_MPNET_BASE_V2          | 520 | 470 |
+| MULTI_QA_MINILM_L6_COS_V1  | 160 | 225 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 540 | 500 |

 ### Inference time

 :::warning warning
 Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization.
 :::

-| Model            | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) | OnePlus 12 (XNNPACK) [ms] |
-| ---------------- | :--------------------------: | :------------------------------: | :------------------------: | :--------------------------: | :-----------------------: |
-| ALL_MINILM_L6_V2 | 105 | 126 | 151 | 165 | 152 |
+| Model                      | iPhone 16 Pro (XNNPACK) [ms] | iPhone 14 Pro Max (XNNPACK) [ms] | iPhone SE 3 (XNNPACK) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] | OnePlus 12 (XNNPACK) [ms] |
+| -------------------------- | :--------------------------: | :------------------------------: | :------------------------: | :-------------------------------: | :-----------------------: |
+| ALL_MINILM_L6_V2           | 53  | 69  | 78  | 60  | 65  |
+| ALL_MPNET_BASE_V2          | 352 | 423 | 478 | 521 | 527 |
+| MULTI_QA_MINILM_L6_COS_V1  | 135 | 166 | 180 | 158 | 165 |
+| MULTI_QA_MPNET_BASE_DOT_V1 | 503 | 598 | 680 | 694 | 743 |

examples/speech-to-text/android/app/src/main/AndroidManifest.xml

Lines changed: 1 addition & 0 deletions

@@ -1,6 +1,7 @@
 <manifest xmlns:android="http://schemas.android.com/apk/res/android">
   <uses-permission android:name="android.permission.INTERNET"/>
   <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
+  <uses-permission android:name="android.permission.RECORD_AUDIO"/>
   <uses-permission android:name="android.permission.SYSTEM_ALERT_WINDOW"/>
   <uses-permission android:name="android.permission.VIBRATE"/>
   <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
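The `RECORD_AUDIO` entry above covers the Android side of the speech-to-text example. It is not part of this diff, but the iOS counterpart for microphone access would be a usage description in `Info.plist`; the wording below is illustrative:

```xml
<!-- Illustrative only: iOS requires this key before the microphone can be used. -->
<key>NSMicrophoneUsageDescription</key>
<string>This app uses the microphone to transcribe speech to text.</string>
```

Note also that on Android 6.0+ `RECORD_AUDIO` is a dangerous permission, so the manifest declaration must be paired with a runtime request (e.g. via `PermissionsAndroid` in React Native).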

examples/speech-to-text/android/app/src/main/java/com/anonymous/speechtotext/MainActivity.kt

Lines changed: 2 additions & 3 deletions

@@ -27,8 +27,8 @@ class MainActivity : ReactActivity() {
    * Returns the instance of the [ReactActivityDelegate]. We use [DefaultReactActivityDelegate]
    * which allows you to enable New Architecture with a single boolean flags [fabricEnabled]
    */
-  override fun createReactActivityDelegate(): ReactActivityDelegate {
-    return ReactActivityDelegateWrapper(
+  override fun createReactActivityDelegate(): ReactActivityDelegate =
+    ReactActivityDelegateWrapper(
       this,
       BuildConfig.IS_NEW_ARCHITECTURE_ENABLED,
       object : DefaultReactActivityDelegate(

@@ -37,7 +37,6 @@ class MainActivity : ReactActivity() {
         fabricEnabled,
       ) {},
     )
-  }

   /**
    * Align the back button behavior with Android S
