# Couchbase and Mistral AI integration example notebook
# Introduction
Couchbase is a NoSQL distributed document database (JSON) with many of the best features of a relational DBMS: SQL, distributed ACID transactions, and much more. [Couchbase Capella™](https://cloud.couchbase.com/sign-up) is the easiest way to get started, but you can also download and run [Couchbase Server](http://couchbase.com/downloads) on-premises.
Mistral AI is a research lab building the best open source models in the world. La Plateforme enables developers and enterprises to build new products and applications, powered by Mistral’s open source and commercial LLMs.
The [Mistral AI APIs](https://console.mistral.ai/) empower LLM applications via:
- [Guardrailing](https://docs.mistral.ai/capabilities/guardrailing/), enabling developers to enforce policies at the system level of Mistral models
# How to run this tutorial
This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/mistralai/mistralai.ipynb).
You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.
# Before you start
## Get Credentials for Mistral AI
Please follow the [instructions](https://console.mistral.ai/api-keys/) to generate the Mistral AI credentials.
## Create and Deploy Your Free Tier Operational Cluster on Capella
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.

To learn more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).
### Couchbase Capella Configuration
When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.
* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the bucket (Read and Write) used in the application.
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.
# Install necessary libraries
```python
!pip install couchbase mistralai
```
# Imports
```python
# Couchbase SDK, Mistral client, and helpers used in the rest of the notebook
import getpass
from datetime import timedelta

from mistralai import Mistral

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions, SearchOptions
import couchbase.search as search
from couchbase.vector_search import VectorQuery, VectorSearch
```
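# Couchbase Connection

The notebook then connects to the cluster and opens the collection that will hold the embeddings. Below is a minimal sketch of this step, reusing the values used throughout this tutorial (bucket `mistralai`, scope `_default`, collection `mistralai`); adjust the connection string and credentials for your own deployment (Capella requires a TLS `couchbases://` connection string).

```python
# Prompt for connection details; the defaults mirror the values used in
# this tutorial (local cluster, bucket/collection named "mistralai").
CB_CLUSTER_URL = input("Cluster URL: ") or "localhost"
CB_USERNAME = input("Couchbase username: ") or "Administrator"
CB_PASSWORD = getpass.getpass("Couchbase password: ")
CB_BUCKET = "mistralai"
CB_SCOPE = "_default"
CB_COLLECTION = "mistralai"

auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)
cluster = Cluster(f"couchbase://{CB_CLUSTER_URL}", ClusterOptions(auth))
cluster.wait_until_ready(timedelta(seconds=10))

bucket = cluster.bucket(CB_BUCKET)
scope = bucket.scope(CB_SCOPE)
collection = scope.collection(CB_COLLECTION)
```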
# Creating Couchbase Vector Search Index
In order to store Mistral embeddings in a Couchbase Cluster, a vector search index needs to be created first. We included a sample index definition that will work with this tutorial in the `mistralai_index.json` file. The definition can be used to create a vector index via the Couchbase Server web console; for more information on vector indexes, please read [Create a Vector Search Index with the Server Web Console](https://docs.couchbase.com/server/current/vector-search/create-vector-search-index-ui.html).
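The same definition can also be loaded and created from code through the SDK's search index manager. A minimal sketch, assuming `mistralai_index.json` sits next to the notebook and follows the standard index-definition layout (`name`, `type`, `sourceType`, `sourceName`, `params`):

```python
import json

from couchbase.management.search import SearchIndex

# Load the sample vector index definition shipped with this tutorial.
with open("mistralai_index.json") as f:
    index_definition = json.load(f)

# Upsert the index at the cluster level so re-running the cell is safe.
cluster.search_indexes().upsert_index(
    SearchIndex(
        name=index_definition["name"],
        idx_type=index_definition["type"],
        source_type=index_definition["sourceType"],
        source_name=index_definition["sourceName"],
        params=index_definition["params"],
    )
)
```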
# Mistral Connection

A Mistral API key needs to be obtained and configured in the code before using the Mistral API. A trial key can be generated for free in the Mistral AI console. For more detailed instructions on obtaining a key, please consult the [Mistral documentation site](https://docs.mistral.ai/).
```python
MISTRAL_API_KEY = getpass.getpass("Mistral API Key:")
mistral_client = Mistral(api_key=MISTRAL_API_KEY)
```
# Embedding Documents
The Mistral client can be used to generate vector embeddings for given text fragments. These embeddings capture the meaning of the corresponding fragments and can be stored in Couchbase for later retrieval. A custom text can also be added to the embedding texts array by running this code block:
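A minimal sketch of such a cell, assuming the `mistral-embed` model and illustrative sample texts (replace them with your own fragments):

```python
# Illustrative text fragments; substitute your own content here.
texts = [
    "Couchbase is a distributed NoSQL document database.",
    "Mistral AI builds open source and commercial large language models.",
]

# Optionally append a custom fragment to the embedding texts array.
texts.append("Vector search retrieves semantically similar documents.")

# Request one embedding vector per input text.
embeddings_response = mistral_client.embeddings.create(
    model="mistral-embed",
    inputs=texts,
)
print(embeddings_response)
```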
```
EmbeddingResponse(
    ...
)
```
# Storing Embeddings in Couchbase
Each embedding needs to be stored as a Couchbase document. According to the provided search index definition, the embedding vector values need to be stored in the `vector` field. The original text of the embedding can be stored in the same document:
```python
# Store each text together with its embedding vector; the document id
# scheme here is illustrative, any unique string key works.
for i in range(0, len(texts)):
    doc = {
        "id": f"text_{i}",
        "text": texts[i],
        "vector": embeddings_response.data[i].embedding,
    }
    collection.upsert(doc["id"], doc)
```
# Searching For Embeddings
Embeddings stored in Couchbase can later be searched using the vector index to, for example, find the text fragments most relevant to a user-entered prompt:
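A minimal sketch of such a query, assuming the index is named `mistralai_index` (matching `mistralai_index.json`) and was created at the cluster level as shown earlier:

```python
# Embed the user's prompt with the same model used for the stored texts.
prompt = "Which database offers distributed ACID transactions?"
prompt_embedding = mistral_client.embeddings.create(
    model="mistral-embed",
    inputs=[prompt],
).data[0].embedding

# Ask the vector index for the documents closest to the prompt embedding.
search_req = search.SearchRequest.create(
    VectorSearch.from_vector_query(VectorQuery("vector", prompt_embedding))
)
result = cluster.search(
    "mistralai_index",  # assumed index name
    search_req,
    SearchOptions(limit=2, fields=["text"]),
)
for row in result.rows():
    print(row.id, row.score, row.fields)
```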