Commit ec8f069

new: update readme for CUDA 12.x, add warning for version conflicts (#239)
* new: update readme for CUDA 12.x, add warning about onnxruntime-gpu and cuda compatibility
* fix: change warning type
* new: update readme
1 parent cbe0010 commit ec8f069

File tree

2 files changed: +32 −5 lines changed


README.md (+20 −4)
````diff
@@ -50,17 +50,33 @@ len(embeddings_list[0]) # Vector of 384 dimensions
 
 ### ⚡️ FastEmbed on a GPU
 
-FastEmbed supports running on GPU devices. It requires installation of the `fastembed-gpu` package.
-Make sure not to have the `fastembed` package installed, as it might interfere with the `fastembed-gpu` package.
+FastEmbed supports running on GPU devices.
+It requires installation of the `fastembed-gpu` package.
 
 ```bash
 pip install fastembed-gpu
-```
+```
+
+*IMPORTANT*: Make sure not to have the `fastembed` package installed, as it interferes with the `fastembed-gpu` package.
+
+By default, `fastembed` is shipped with `onnxruntime-gpu` compiled for CUDA 11.8.
+
+CUDA 12.x requires `onnxruntime-gpu` to be installed with the following command:
+
+```bash
+pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
+```
+
+*Note*: It is better to install it before `fastembed-gpu`, otherwise it might be required to uninstall `onnxruntime-gpu` first.
 
 ```python
 from fastembed import TextEmbedding
 
-embedding_model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5", providers=["CUDAExecutionProvider"])
+embedding_model = TextEmbedding(
+    model_name="BAAI/bge-small-en-v1.5",
+    providers=["CUDAExecutionProvider"]
+)
 print("The model BAAI/bge-small-en-v1.5 is ready to use on a GPU.")
 
 ```
````
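The README's point about CUDA 11.8 vs 12.x wheels boils down to whether `onnxruntime` actually exposes `CUDAExecutionProvider` at runtime. The selection logic can be sketched as a pure function; `pick_providers` is a hypothetical helper for illustration, not part of the fastembed API:

```python
from typing import List, Sequence

def pick_providers(requested: Sequence[str], available: Sequence[str]) -> List[str]:
    """Hypothetical helper: keep only the requested execution providers that the
    installed onnxruntime build actually exposes, falling back to CPU if none do."""
    usable = [p for p in requested if p in available]
    return usable or ["CPUExecutionProvider"]

# With only the CPU wheel of onnxruntime installed, CUDA is filtered out:
print(pick_providers(["CUDAExecutionProvider"], ["CPUExecutionProvider"]))
# -> ['CPUExecutionProvider']
```

In practice the `available` list comes from `onnxruntime.get_available_providers()`; if `CUDAExecutionProvider` is missing from it despite a GPU being present, that usually signals the onnxruntime/CUDA version mismatch this commit warns about.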

fastembed/common/onnx_model.py (+12 −1)
````diff
@@ -1,5 +1,6 @@
 from pathlib import Path
 from typing import Any, Dict, Generic, Iterable, Optional, Tuple, Type, TypeVar, Sequence
+import warnings
 
 import numpy as np
 import onnxruntime as ort
@@ -39,14 +40,15 @@ def load_onnx_model(
         providers: Optional[Sequence[OnnxProvider]] = None,
     ) -> None:
         model_path = model_dir / model_file
-
         # List of Execution Providers: https://onnxruntime.ai/docs/execution-providers
 
         onnx_providers = ["CPUExecutionProvider"] if providers is None else list(providers)
         available_providers = ort.get_available_providers()
+        requested_provider_names = []
         for provider in onnx_providers:
             # check providers available
             provider_name = provider if isinstance(provider, str) else provider[0]
+            requested_provider_names.append(provider_name)
             if provider_name not in available_providers:
                 raise ValueError(
                     f"Provider {provider_name} is not available. Available providers: {available_providers}"
@@ -62,6 +64,15 @@
         self.model = ort.InferenceSession(
             str(model_path), providers=onnx_providers, sess_options=so
         )
+        if "CUDAExecutionProvider" in requested_provider_names:
+            current_providers = self.model.get_providers()
+            if "CUDAExecutionProvider" not in current_providers:
+                warnings.warn(
+                    f"Attempt to set CUDAExecutionProvider failed. Current providers: {current_providers}. "
+                    "If you are using CUDA 12.x, install onnxruntime-gpu via "
+                    "`pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/`",
+                    RuntimeWarning,
+                )
 
     def onnx_embed(self, *args, **kwargs) -> Tuple[np.ndarray, np.ndarray]:
         raise NotImplementedError("Subclasses must implement this method")
````
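The added check works because `ort.InferenceSession` silently drops providers it cannot initialize, so the session's active providers can differ from the requested ones. The pattern can be demonstrated standalone; `warn_if_cuda_fallback` is a hypothetical extraction of the commit's logic, decoupled from onnxruntime:

```python
import warnings
from typing import Sequence

def warn_if_cuda_fallback(requested: Sequence[str], active: Sequence[str]) -> None:
    """Hypothetical standalone version of the commit's check: emit a
    RuntimeWarning when CUDAExecutionProvider was requested but the
    session silently fell back to another provider."""
    if "CUDAExecutionProvider" in requested and "CUDAExecutionProvider" not in active:
        warnings.warn(
            f"Attempt to set CUDAExecutionProvider failed. "
            f"Current providers: {list(active)}.",
            RuntimeWarning,
        )

# Simulated fallback: CUDA requested, but only CPU ended up active.
warn_if_cuda_fallback(["CUDAExecutionProvider"], ["CPUExecutionProvider"])
```

Using `RuntimeWarning` (the "fix: change warning type" part of the commit message) rather than raising keeps CPU fallback usable while still surfacing the likely CUDA 12.x/onnxruntime-gpu mismatch.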
