This project was built with Python 3.11.
User records the question --> Whisper transcribes the speech to text --> the text is sent to the conversational RAG chain --> the chain checks whether the Chroma vector database already exists and, if not, builds it with embedding models matching your chat model source (OpenAI or Hugging Face) --> the chain answers the user's question --> the answer is sent to TTS
- LangChain: for conversational RAG
- Chroma: for the vector database
- OpenAI: as a chat model (optional; you can use Hugging Face exclusively)
- Hugging Face:
  - Mistral 7B as the chat model
  - Whisper as the speech-to-text model
  - Bark as the text-to-speech model (a usage sketch for Whisper and Bark follows this list)
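For the speech side, running Whisper and Bark through the transformers library might look like the sketch below; the checkpoints (openai/whisper-small, suno/bark-small), the voice preset, and the file names are illustrative assumptions, not necessarily what this repo uses.

```python
# Sketch: speech-to-text with Whisper and text-to-speech with Bark via
# transformers. Checkpoints, voice preset, and file names are assumptions.
import scipy.io.wavfile
from transformers import AutoProcessor, BarkModel, pipeline

# Speech -> text
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
question = asr("question.wav")["text"]

# Text -> speech
processor = AutoProcessor.from_pretrained("suno/bark-small")
bark = BarkModel.from_pretrained("suno/bark-small")
inputs = processor("Here is the answer.", voice_preset="v2/en_speaker_6")
audio = bark.generate(**inputs)
scipy.io.wavfile.write(
    "answer.wav",
    rate=bark.generation_config.sample_rate,
    data=audio.cpu().numpy().squeeze(),
)
```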
1. Install the required libraries and packages (for PyTorch, it is better to exclude it from the requirements and install the build that supports your device):
   pip install -r requirements.txt
2. Store your OpenAI API key and Hugging Face token in a .env file (a minimal example follows this list), and put your data (for now, only .txt files are supported) in a folder called data.
3. (Optional) Run the test scripts (test_conv_rag.py and test_tts_sst.py).
4. Run app.py with:
   streamlit run app.py
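A minimal .env sketch for step 2; the variable names follow common OpenAI/LangChain conventions and are assumptions about what this project actually reads:

```
# .env -- variable names are assumptions based on common conventions
OPENAI_API_KEY=sk-...
HUGGINGFACEHUB_API_TOKEN=hf_...
```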
Steps 3 and 4 will create a separate folder where the Chroma vector database is saved locally.
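Building that database from the data folder could look roughly like this sketch, assuming the langchain-community integrations; the chroma_db directory name, the embedding model, and the chunking parameters are illustrative choices:

```python
# Sketch: build a persistent Chroma vector DB from .txt files in data/.
# Directory name, embedding model, and chunk sizes are assumptions.
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter

docs = DirectoryLoader("data", glob="*.txt", loader_cls=TextLoader).load()
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
vectordb = Chroma.from_documents(
    chunks, embeddings, persist_directory="chroma_db"
)
```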
- By default, we use Hugging Face models, but if you plan to use OpenAI for the conversational RAG, change this code in app.py:

model_source="huggingface",
embedding_source="huggingface",

to

model_source="openai",
embedding_source="openai",
- By default, I load the Mistral model from Hugging Face with quantization, as shown in the sketch below. You can remove the quantization and swap in whatever model you like (DeepSeek, Llama, etc.), but I recommend using chat models rather than base models.
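The quantized load might look like this sketch, using transformers with a bitsandbytes 4-bit config; the exact checkpoint and quantization settings are assumptions:

```python
# Sketch: load Mistral 7B in 4-bit via transformers + bitsandbytes.
# Checkpoint name and quantization settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # requires the accelerate package
)
```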
- Include machine translation to support more languages (right now, only English is supported).
- Enable question filtering. (The filtering is already in the conversational RAG code, but I turned it off because it is not very effective: for example, if the first question is "What is Alara?" and the code answers it, and the second question is "How do I use it?", the current filtering will ignore that follow-up question.)