This repository lets you create and search a vector database for relevant context across a wide variety of documents and then get a more accurate response from a large language model. This is commonly referred to as retrieval-augmented generation (RAG), and it drastically reduces hallucinations from the LLM! You can watch an introductory Video or read a Medium article about the program.
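Conceptually, the RAG flow looks like the following toy sketch. A simple word-overlap score stands in for the real embedding model so the example stays self-contained; all function names here are illustrative, not this program's actual API.

```python
# Toy sketch of retrieval-augmented generation (RAG).
# Real systems rank chunks by embedding-vector similarity; a crude
# word-overlap score is used here purely for illustration.

def score(question: str, chunk: str) -> float:
    """Crude relevance score: fraction of question words found in the chunk."""
    q_words = set(question.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / max(len(q_words), 1)

def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k most relevant chunks for the question."""
    return sorted(chunks, key=lambda c: score(question, c), reverse=True)[:top_k]

def build_prompt(question: str, chunks: list[str]) -> str:
    """Prepend retrieved context so the LLM answers from your documents."""
    context = "\n\n".join(retrieve(question, chunks))
    return f"Use the following context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"

docs = [
    "The setup script only supports Windows for now.",
    "Audio files are transcribed with Whisper models before indexing.",
]
print(build_prompt("Which operating system does the setup support?", docs))
```

Because the answer is grounded in retrieved text rather than the model's memory alone, the LLM has far less room to hallucinate.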
- 🐍 Python 3.11 or Python 3.12
- 📁 Git
- 📁 Git LFS
- 🌐 Pandoc
- 🛠️ Compiler
The Compiler link above downloads Visual Studio as an example. Make sure to also install the required SDKs.
Download the latest "release," extract its contents, and open the "src" folder:
- NOTE: If you clone this repository you will get the development version, which may or may not be stable.
Within the src folder, create a virtual environment:
python -m venv .
Activate the virtual environment:
.\Scripts\activate
Run the setup script (only Windows is supported for now):
python setup_windows_.py
🔥Important🔥
- Instructions on how to use this program are being consolidated into the Ask Jeeves functionality, which can be accessed from the "Ask Jeeves" menu option. Please post an issue in this repository if Jeeves is not giving you sufficient answers.
- To talk with Jeeves, you must first download the bge-small-en-v1.5 embedding model from the Models Tab.
Every time you want to use the program, activate the virtual environment and start the GUI:
.\Scripts\activate
python gui.py
- Download a vector/embedding model from the Models Tab.
Non-audio files (including images) can be selected by clicking the Choose Files button within the Create Database Tab. It is highly recommended that you test out the different vision models before inputting images, however. Ask Jeeves!
Audio files can be put into a vector database by first transcribing them from the Tools Tab using advanced Whisper models. You can only transcribe one audio file at a time, but batch processing is hopefully coming soon.
I highly recommend testing the various Whisper model sizes, precisions, and the batch setting on a short audio file before committing to transcribing a longer file. This will ensure that you do not run out of VRAM. Ask Jeeves!
A completed transcription will appear in the Create Database Tab as a .json file having the same name as the original audio file. Just double-click it to see the transcription.
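If you want to inspect a transcription programmatically, it is ordinary JSON. The schema below ("segments" with "start"/"end"/"text" keys) is an assumption for illustration only; the actual keys this program writes may differ.

```python
# Hypothetical example of reading a transcription .json file.
# The keys used here ("segments", "start", "end", "text") are
# ASSUMPTIONS for illustration; check a real output file for the schema.
import json
from pathlib import Path

# Create a sample file standing in for a real transcription output.
sample = {
    "segments": [
        {"start": 0.0, "end": 4.2, "text": "Welcome to the program."},
        {"start": 4.2, "end": 9.7, "text": "Let's build a vector database."},
    ]
}
path = Path("interview.json")  # same base name as the original audio file
path.write_text(json.dumps(sample))

# Load it back and join the segment texts into one transcript string.
data = json.loads(path.read_text())
transcript = " ".join(seg["text"] for seg in data["segments"])
print(transcript)
```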
- Download a vector model from the Models tab.
- Assuming you have added all the files you want, simply click the Create Vector Database button within the Create Database Tab.
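Under the hood, creating a vector database typically involves splitting each document into overlapping chunks before embedding them. A minimal sketch of that splitting step follows; the chunk size and overlap values are illustrative, not this program's defaults.

```python
# Minimal sketch of document chunking with overlap, as commonly done
# before embedding text into a vector database. Sizes are illustrative.

def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping character windows so context that
    spans a boundary is not lost between neighboring chunks."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "a" * 120
chunks = chunk_text(doc)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk is then embedded separately, so the overlap keeps sentences that straddle a boundary searchable from both sides.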
- In the Query Database Tab, select the database you want to search.
- Type or voice-record your question.
- Use the chunks only checkbox to only receive the relevant contexts.
- Select a backend: Local Models, Kobold, LM Studio, or ChatGPT.
- Click Submit Question.
- In the Settings tab, you can change multiple settings regarding querying the database.
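The chunks only checkbox returns the retrieved passages without sending them on to an LLM, which is handy for checking what the search actually finds. Conceptually, retrieval ranks chunks by cosine similarity between embedding vectors, as in this toy sketch (the vectors are made up for illustration; the real ones come from your downloaded embedding model):

```python
# Toy sketch of "chunks only" retrieval: rank stored chunks by cosine
# similarity to the query embedding and return the best matches.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hand-written toy embeddings; a real embedding model produces these.
chunk_vectors = {
    "Pandoc converts documents between formats.": [0.9, 0.1, 0.0],
    "Whisper transcribes audio into text.": [0.1, 0.9, 0.2],
}
query_vector = [0.2, 0.95, 0.1]  # pretend embedding of "how do I transcribe audio?"

ranked = sorted(chunk_vectors, key=lambda c: cosine(query_vector, chunk_vectors[c]), reverse=True)
print(ranked[0])
```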
🔥Important🔥
If you use either the Kobold or LM Studio backends, you must be familiar with those programs. For example, LM Studio must be running in "server mode" and handles the prompt formatting. In contrast, Kobold defaults to creating a server but requires you to manually enter the prompt formatting. This program no longer provides detailed instructions on how to use either of these two backends, but you can Ask Jeeves about them generally.
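Because Kobold leaves prompt formatting to you, you must wrap the system message and question in whatever template your model expects. The sketch below uses the ChatML template as one common example; check your model's card for its actual template (Llama-2, Alpaca, and others differ), and note this helper is illustrative, not part of this program.

```python
# Illustrative prompt-formatting helper for backends (like Kobold)
# that require you to apply the model's chat template yourself.
# ChatML is shown as ONE common template; your model may use another.

def format_chatml(system: str, user: str) -> str:
    """Wrap a system message and user question in the ChatML template."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = format_chatml("Answer using only the provided context.", "What is RAG?")
print(prompt)
```

Using the wrong template is a common cause of rambling or low-quality answers from an otherwise good model.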
- In the Manage Databases Tab, select a database and click Delete Database.
Feel free to report bugs or request enhancements by creating an issue on GitHub, and I will respond promptly.
I welcome all suggestions, both positive and negative. You can e-mail me directly at "[email protected]", or I can frequently be seen on the KoboldAI Discord server (moniker is vic49). I am always happy to answer any questions or discuss anything vector-database related! (No formal affiliation with KoboldAI.)