Visual retrieval Open-Webui function (plus Vespa deploy and PDF file feed) via ColQwen2 over the Infinity embedding API and a local Vespa database
Components used:
- Vespa (Docker container)
- Vision language model, e.g. Qwen2-VL-7B (via an OpenAI-compatible API)
Clone repo
Create a Python venv or conda env
pip install -r requirements.txt
Download "Vespa CLI" package for your platform here ->
- Copy bin/vespa to /usr/bin/vespa
- Adjust permissions (mode 755 already includes the execute bit):
sudo chmod 755 /usr/bin/vespa
Start a local Vespa Docker container
docker run --detach --name vespa --hostname my-vespa-container --publish 8080:8080 --publish 19071:19071 vespaengine/vespa:latest
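The container needs some time to start before the schema can be deployed. A minimal Python sketch for waiting on it, assuming the config server answers on the published port 19071 at `/state/v1/health` and reports a status code of "up" once ready. The names `fetch_json` and `wait_for_vespa` and the injectable `fetch` and `sleep` parameters are illustrative and not part of this repo:

```python
import json
import time
import urllib.request


def fetch_json(url):
    """Fetch a URL and parse the body as JSON (raises on connection errors)."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_vespa(url="http://localhost:19071/state/v1/health",
                   timeout=300.0, interval=5.0,
                   fetch=fetch_json, sleep=time.sleep):
    """Poll the health endpoint until it reports 'up' or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if fetch(url).get("status", {}).get("code") == "up":
                return True
        except (OSError, ValueError):
            # Container still starting: connection refused or a non-JSON reply.
            pass
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_for_vespa()` after `docker run` returns `True` once the config server is healthy, or `False` if it has not come up within the timeout.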
Start ColQwen2 via the Infinity embedding API, for example as follows (only the -merged model versions work with Infinity):
infinity_emb v2 --device cuda --no-bettertransformer --batch-size 16 --dtype float16 --model-id vidore/colqwen2-v1.0-merged --served-model-name colqwen2 --api-key sk-1111
Deploy the Vespa application schema to the Vespa instance (localhost)
python --vespa_application_name MyApplicationName
Feed PDF files from a folder into Vespa with ColQwen2
python --application_name MyApplicationName --vespa_schema_name pdf_page --pdf_folder /path/to/my/pdf/files/
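For orientation, feeding a page ultimately means putting one document per PDF page into Vespa. The sketch below only builds the URL and JSON body for Vespa's `/document/v1` API on the published port 8080; it is not necessarily how the feed script in this repo works. The namespace `pdf`, the document ID format, and the field names `title` and `page_number` are hypothetical, since the real fields depend on the deployed schema:

```python
import json
from urllib.parse import quote


def build_feed_request(doc_id, fields, schema="pdf_page",
                       namespace="pdf", host="http://localhost:8080"):
    """Return (url, body) for a /document/v1 POST that puts one document."""
    # Percent-encode each path segment so IDs containing "/" stay one segment.
    path = "/".join([
        "document", "v1",
        quote(namespace, safe=""),
        quote(schema, safe=""),
        "docid",
        quote(doc_id, safe=""),
    ])
    body = json.dumps({"fields": fields})
    return f"{host}/{path}", body


# Hypothetical document ID and field names, for illustration only.
url, body = build_feed_request(
    "report.pdf-3", {"title": "report.pdf", "page_number": 3}
)
print(url)
print(body)
```

The returned pair can be sent with any HTTP client as a POST with `Content-Type: application/json`.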
Use to test retrieval. Adjust "queries = [...]" in file as needed.
Import "" into Open-Webui as function
Configure Valves (Vespa DB host, etc.)
The function is currently written to work with the ColQwen2 Infinity API behind LiteLLM, using a config like the one described here (BerriAI/litellm#6525 (comment)). Without LiteLLM, each embedding request has to set "extra_body: { "modality": "image" }" or "extra_body: { "modality": "text" }" itself.
LiteLLM API config:
- model_name: colqwen2
  litellm_params:
    model: "openai/vidore/colqwen2-v1.0-merged"
    api_base: "http://infinity-inference:7997"
    api_key: "sk-1111"
    extra_body: { "modality": "image" }
  model_info:
    id: "1"
    mode: "embedding"
- model_name: colqwen2-text
  litellm_params:
    model: "openai/vidore/colqwen2-v1.0-merged"
    api_base: "http://infinity-inference:7997"
    api_key: "sk-1111"
    extra_body: { "modality": "text" }
  model_info:
    id: "2"
    mode: "embedding"
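When Infinity is called directly instead of through LiteLLM, the modality has to travel with each request. A minimal sketch of such a request, assuming Infinity serves `/embeddings` directly under its base URL on port 7997 and accepts `modality` as a top-level JSON field (which is where an OpenAI client's `extra_body` ends up). The helper name `build_embedding_request` and the `localhost` base URL are illustrative; the model name and API key are the ones from the Infinity command above:

```python
import json
import urllib.request


def build_embedding_request(inputs, modality,
                            base_url="http://localhost:7997",
                            model="colqwen2", api_key="sk-1111"):
    """Build (but do not send) an embeddings request with an explicit modality."""
    if modality not in ("text", "image"):
        raise ValueError(f"unsupported modality: {modality!r}")
    payload = {"model": model, "input": list(inputs), "modality": modality}
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Text query example; for page images, pass modality="image" instead.
req = build_embedding_request(["What does the revenue chart show?"], "text")
print(req.full_url)
print(json.loads(req.data))
```

Passing the request to `urllib.request.urlopen(req)` would send it; the two LiteLLM model entries exist so that clients can skip this step and pick the modality by model name.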