LLMImageIndexer

LLMImageIndexer creates keywords and captions for images and writes them into each file's metadata using a local AI. No data leaves your computer during this process. Once the model weights and the KoboldCpp executable have been downloaded and installed, no internet connection is needed or used.

Because the information is stored in the file metadata, the images can be moved, renamed, or copied without issue. The indexer can also be run multiple times on the same files and will not reprocess them unless directed to.

By default, LLMImageIndexer uses the Qwen2-VL 2B model, a 2 billion parameter multimodal local large language model. It runs on your machine to recognize images, describe them, and generate keywords. However, you can use any image model you like as long as its weights are available in the "gguf" file type and it has an appropriate "mmproj" image projector.
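
As a rough illustration of pairing a gguf model with its mmproj projector, the sketch below builds a KoboldCpp launch command. The `--model`, `--mmproj`, and `--port` options are KoboldCpp command-line flags; the helper name `kobold_command` and both model file names are placeholders for illustration, not part of this project.

```python
from pathlib import Path


def kobold_command(executable, model, mmproj, port=5001):
    """Build the argument list for starting KoboldCpp with a vision model."""
    model = Path(model)
    # The README requires gguf weights, so reject anything else early.
    if model.suffix.lower() != ".gguf":
        raise ValueError(f"expected a .gguf model file, got {model.name}")
    return [
        str(executable),
        "--model", str(model),
        "--mmproj", str(mmproj),  # image projector that matches the model
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Placeholder file names: substitute the files you actually downloaded.
    cmd = kobold_command("./koboldcpp-linux-x64", "my-vision-model.gguf", "my-mmproj.gguf")
    print(" ".join(cmd))
    # To launch for real: import subprocess; subprocess.Popen(cmd)
```

Building the command as a list rather than a single string avoids shell quoting problems when a path contains spaces.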

Features

Image Analysis: Utilizes a local AI model to generate a list of keywords and a caption for each image
Metadata Enhancement: Can automatically edit image metadata with generated tags
Local Processing: All processing is done locally on your machine
Multi-Format Support: Handles a wide range of image formats, including all major raw camera files
User-Friendly GUI: Includes a GUI and installer. Relies on KoboldCpp, a single executable, for all AI functionality
GPU Acceleration: Will use Apple Metal, Nvidia CUDA, or AMD (Vulkan) hardware if available to greatly speed inference
Cross-Platform: Supports Windows, macOS ARM, and Linux
Stop and Start Capability: Can stop and start without having to reprocess all the files again
One or Two Step Processing: Can do keywords and a simple caption in one step, or keywords and a detailed caption in two steps

Important Information

It is recommended to have a discrete graphics processor in your machine. Running this on CPU will be extremely slow.

This tool verifies keywords and de-pluralizes them using rules that apply to English. Using it to generate keywords in other languages may have strange results.
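
To see why English-only rules can misfire on other languages, here is a minimal sketch of suffix-based de-pluralization. It is not the tool's actual rule set; `singularize` and `normalize_keywords` are hypothetical helpers written for illustration.

```python
import re


def singularize(word):
    """Very rough English de-pluralization based on common suffixes."""
    w = word.strip().lower()
    # Leave short words and words that merely end in s-like suffixes alone.
    if len(w) <= 3 or w.endswith(("ss", "us", "is")):
        return w
    if w.endswith("ies"):
        return w[:-3] + "y"          # puppies -> puppy
    if re.search(r"(ches|shes|xes|sses|zes)$", w):
        return w[:-2]                # beaches -> beach
    if w.endswith("s"):
        return w[:-1]                # dogs -> dog
    return w


def normalize_keywords(keywords):
    """Singularize each keyword and drop duplicates, keeping first-seen order."""
    seen, result = set(), []
    for keyword in keywords:
        singular = singularize(keyword)
        if singular and singular not in seen:
            seen.add(singular)
            result.append(singular)
    return result


if __name__ == "__main__":
    print(normalize_keywords(["Dogs", "dog", "Beaches", "puppies"]))
    # An English rule applied to Spanish: "lunes" (Monday) loses its final s.
    print(singularize("lunes"))
```

The Spanish word "lunes" is already singular, yet the final rule strips its "s". This is the kind of strange result to expect when generating keywords in other languages.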

This tool operates directly on image file metadata. It will write to one or more of the following fields:

Subject
Any keyword field
Description
Identifier
Status

The "Status" and "Identifier" fields are used to track the processing state of images. The "Description" field is used for the image caption, and "Subject" or "Keyword" fields are used to hold keywords.

The use of the Identifier tag means you can manage your files and add new files, and run the tool as many times as you like without worrying about reprocessing the files that were previously keyworded by the tool.
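
If you want to inspect these fields yourself, the sketch below asks ExifTool for them as JSON and checks whether a file already carries an Identifier. The `-json` option is a standard ExifTool flag. The helper names, and the rule that a non-empty Identifier means "already processed", are assumptions for illustration rather than the tool's actual logic.

```python
import json
import subprocess

# Field names taken from the list above.
FIELDS = ["Identifier", "Status", "Description", "Subject"]


def exiftool_read_command(path):
    """Argument list asking ExifTool for the tracked fields as JSON."""
    return ["exiftool", "-json"] + [f"-{name}" for name in FIELDS] + [str(path)]


def parse_exiftool_json(stdout):
    """ExifTool's -json output is a list with one object per file."""
    records = json.loads(stdout)
    return records[0] if records else {}


def already_processed(fields):
    """Assumption: a file counts as processed once it carries an Identifier."""
    return bool(fields.get("Identifier"))


def read_fields(path):
    """Run ExifTool (must be installed and on PATH) and return the parsed fields."""
    result = subprocess.run(
        exiftool_read_command(path), capture_output=True, text=True, check=True
    )
    return parse_exiftool_json(result.stdout)
```

Calling `already_processed(read_fields("photo.jpg"))` on a file of your own shows whether a later run would treat it as new under this assumption.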

Installation

Prerequisites

Python 3.8 or higher
KoboldCpp

A vision model is needed. If you launch the tool with llmii-run.bat, the first run will download the Qwen2-VL 2B Q4_K_M gguf and F16 projector from Bartowski's repo on Hugging Face. If you don't want to use that model, run llmii-no-kobold.bat instead, then open Koboldcpp.exe and load whatever model you like.

Windows Installation

Clone the repository or download the ZIP file and extract it
Install Python for Windows
Run llmii-run.bat and wait for ExifTool to install and KoboldCpp to download. When it is complete, you must run the file again. If you called it from a terminal window, you will need to close the window and reopen it. It will then create a Python environment and download the model weights

macOS Installation (including ARM)

Clone the repository or download the ZIP file and extract it
Install Python 3.8 or higher if not already installed. You can use Homebrew:
```
brew install python
```
Install ExifTool:
```
brew install exiftool
```
Run the script:
```
./llmii-run.sh
```
If KoboldCpp fails to run, open a terminal in the LLMImageIndexer folder:
```
xattr -cr koboldcpp-mac-arm64
chmod +x koboldcpp-mac-arm64
```

Linux Installation

Clone the repository or download and extract the ZIP file
Install Python 3.8 or higher if not already installed. Use your distribution's package manager, for example on Ubuntu:
```
sudo apt-get update
sudo apt-get install python3 python3-pip
```

Install ExifTool. On Ubuntu:
```
sudo apt-get install libimage-exiftool-perl
```

Run the script:
```
./llmii-run.sh
```
If KoboldCpp fails to run, open a terminal in the LLMImageIndexer folder:
```
chmod +x koboldcpp-linux-x64
```

For all platforms, the script will set up the Python environment, install dependencies, and download necessary model weights. This initial setup is performed only once and will take a few minutes depending on your download speed.

Usage

Launch the LLMImageIndexer GUI:
- On Windows: Run llmii-run.bat
- On macOS/Linux: Run ./llmii-run.sh
Ensure KoboldCpp is running. Wait until you see the following message in the KoboldCpp window:
```
Please connect to custom endpoint at http://localhost:5001
```
Configure the indexing settings in the GUI
Click "Run Image Indexer" to start the process
Monitor the progress in the output area of the GUI.
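
If you would rather check the endpoint from a script than watch the KoboldCpp window, a minimal sketch follows. It assumes KoboldCpp answers `GET /api/v1/model` with a JSON object containing a `result` field, as its KoboldAI-compatible API does; the helper names are illustrative.

```python
import http.client
import json
import urllib.request


def model_endpoint(base_url):
    """URL of the model-info route, tolerating a trailing slash on the base."""
    return base_url.rstrip("/") + "/api/v1/model"


def kobold_ready(base_url="http://localhost:5001", timeout=5.0):
    """Return the loaded model name if the API answers, otherwise None."""
    try:
        with urllib.request.urlopen(model_endpoint(base_url), timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get("result")


if __name__ == "__main__":
    name = kobold_ready()
    print(f"KoboldCpp is serving {name}" if name else "KoboldCpp is not reachable yet")
```

If KoboldCpp runs on another machine, pass the same address you enter in the API URL setting.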

Configuration Options

Directory: Target image directory (includes subdirectories by default)
API URL: KoboldCpp API endpoint (change if running on another machine)
API Password: Set if required by your KoboldCpp setup
Caption Instruction: The instruction to use when generating a detailed caption
Write a detailed caption: Have the LLM describe the image in detail and set it in XMP:Description (at least doubles processing time). This will overwrite any existing caption in the image metadata
GenTokens: Number of tokens for the LLM to use per generation
Don't crawl subdirectories: Disable scanning of subdirectories
Reprocess all files again: Will generate keywords and captions for all image files regardless of prior processing (checking this will include failed and orphan files in reprocessing)
Reprocess failed files: If a file was marked failed during prior processing, it will be processed again. If this is unchecked, previously failed files are ignored
Reprocess orphan files: If a file was previously processed but the llmii.json file in the root directory was deleted, or the image file was moved or renamed, it will be processed again. If this box is unchecked, previously processed files will be ignored regardless of whether they exist in the database
Don't make backups: Before changing anything in a file, a backup will be made called "Filename.extension_original". If this box is checked, these files will not be created and the original file will be altered with no backup
Pretend mode: Simulate processing without writing to files or database
Clear existing keywords and captions and write new ones: If this is selected any keywords and captions that exist will be overwritten
Add to existing keywords: If this is selected then any keywords that exist will be appended to the new keywords, and any captions created during this process will be discarded. This is useful for running on image files already processed (with the reprocess all files box checked) to add more keywords to them
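
The keyword and backup options above can be sketched as follows. The `Filename.extension_original` backup is ExifTool's default behavior when it writes a file, and `-overwrite_original` is the ExifTool flag that suppresses it. How the tool itself calls ExifTool is not shown here; `merge_keywords` and `exiftool_write_command` are hypothetical helpers that only illustrate the described behavior.

```python
def merge_keywords(existing, generated, add_to_existing):
    """Return the keyword list to write, without duplicates.

    With add_to_existing, existing keywords are appended after the new ones;
    otherwise the new keywords replace whatever was there.
    """
    combined = list(generated) + (list(existing) if add_to_existing else [])
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(combined))


def exiftool_write_command(path, keywords, caption=None, no_backup=False):
    """Build an ExifTool command that writes keywords and an optional caption."""
    cmd = ["exiftool"]
    if no_backup:
        cmd.append("-overwrite_original")  # skip the *_original backup file
    # Repeating -TAG=VALUE for a list tag replaces the existing list.
    cmd += [f"-XMP:Subject={keyword}" for keyword in keywords]
    if caption is not None:
        cmd.append(f"-XMP:Description={caption}")
    cmd.append(str(path))
    return cmd


if __name__ == "__main__":
    keywords = merge_keywords(["cat", "sofa"], ["cat", "window"], add_to_existing=True)
    # In "add to existing" mode the caption is discarded, so none is passed.
    print(exiftool_write_command("photo.jpg", keywords, no_backup=False))
```

A pretend mode would amount to building and logging this command without running it.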

More Information and Troubleshooting

Consult the wiki for more information and troubleshooting steps.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgements

ExifTool for metadata manipulation
KoboldCpp for local AI processing
PyQt6 for the GUI framework
Fix Busted JSON and Json Repair for help with mangled JSON parsing

jabberjabberjabber/LLavaImageTagger
