-# Example langchain retriever
-
-This project demonstrates one approach for implementing a
-[langchain retriever](https://python.langchain.com/docs/modules/data_connection/)
-that allows for
-[Retrieval Augmented Generation (RAG)](https://python.langchain.com/docs/use_cases/question_answering/)
-to be supported via MarkLogic and the MarkLogic Python Client. This example uses the same data as in
-[the langchain RAG quickstart guide](https://python.langchain.com/docs/use_cases/question_answering/quickstart),
-but with the data having first been loaded into MarkLogic.
-
-**This is only intended as an example** of how easily a langchain retriever can be developed
-using the MarkLogic Python Client. The queries in this example are simple and naturally
-do not have any knowledge of how your data is modeled in MarkLogic. You are encouraged to use
-this as an example for developing your own retriever, where you can build a query based on a
-question submitted to langchain that fully leverages the indexes and data models in your MarkLogic
-application. Additionally, please see the
-[langchain documentation on splitting text](https://python.langchain.com/docs/modules/data_connection/document_transformers/). You may need to restructure your data so that you have a larger number of
-smaller documents in your database so that you do not exceed the limit that langchain imposes on how
-much data a retriever can return.
-
-# Setup
-
-To try out this project, use [docker-compose](https://docs.docker.com/compose/) to instantiate a new MarkLogic
-instance with port 8003 available (you can use your own MarkLogic instance too, just be sure that port 8003
-is available):
-
-    docker-compose up -d --build
-
-Then deploy a small REST API application to MarkLogic, which includes a basic non-admin MarkLogic user
-named `langchain-user`:
-
-    ./gradlew -i mlDeploy
-
-Next, create a new Python virtual environment - [pyenv](https://github.com/pyenv/pyenv) is recommended for this -
-and install the
-[langchain example dependencies](https://python.langchain.com/docs/use_cases/question_answering/quickstart#dependencies),
-along with the MarkLogic Python Client:
-
-    pip install -U langchain langchain_openai langchain-community langchainhub openai chromadb bs4 marklogic_python_client
-
-Then run the following Python program to load text data from the langchain quickstart guide
-into two different collections in the `langchain-test-content` database:
-
-    python load_data.py
-
-Create a ".env" file to hold your OpenAI API key:
-
-    echo "OPENAI_API_KEY=<your key here>" > .env
-
-# Testing the retriever
-
-You are now ready to test the example retriever. Run the following to ask a question with the
-results augmented via the `marklogic_retriever.py` module in this project; you will be
-prompted for an OpenAI API key when you run this, which you can type or paste in:
-
-    python ask.py "What is task decomposition?" posts
-
-The retriever uses a [cts.similarQuery](https://docs.marklogic.com/cts.similarQuery) to select from the documents
-loaded via `load_data.py`. It defaults to a page length of 10. You can change this by providing a command line
-argument - e.g.:
-
-    python ask.py "What is task decomposition?" posts 15
-
-Example of a question for the "sotu" (State of the Union speech) collection:
-
-    python ask.py "What are economic sanctions?" sotu 20
-
-To use a word query instead of a similar query, along with a set of drop words, specify "word" as the 4th argument:
-
-    python ask.py "What are economic sanctions?" sotu 20 word
+This example project has been moved to the [MarkLogic AI examples repository](https://github.com/marklogic/marklogic-ai-examples).
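
The removed README describes `ask.py` as taking a question, a collection name, an optional page length (default 10), and an optional fourth argument `word`. The following is a minimal sketch of how such positional arguments could be parsed. It makes no claim about how the real `ask.py` is written: the names `AskArgs` and `parse_ask_args`, and the `"similar"` label for the default query type, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AskArgs:
    """Hypothetical container for the arguments the README describes."""
    question: str
    collection: str
    page_length: int = 10         # README: "defaults to a page length of 10"
    query_type: str = "similar"   # assumed label for the default similar query


def parse_ask_args(argv: List[str]) -> AskArgs:
    """Parse: <question> <collection> [page_length] [query_type].

    `argv` excludes the program name, i.e. it corresponds to sys.argv[1:].
    """
    if len(argv) < 2:
        raise ValueError(
            "usage: ask.py <question> <collection> [page_length] [word]"
        )
    args = AskArgs(question=argv[0], collection=argv[1])
    if len(argv) > 2:
        # Third argument overrides the default page length.
        args.page_length = int(argv[2])
    if len(argv) > 3:
        # Fourth argument selects the query type, e.g. "word".
        args.query_type = argv[3]
    return args
```

Under these assumptions, `parse_ask_args(["What are economic sanctions?", "sotu", "20", "word"])` would yield a page length of 20 and a query type of `word`, mirroring the last command in the README.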