FlashSQL/io-optimized-pgvector

IO-optimized-pgvector

Overview

This repository provides an optimized implementation of pgvector with SSD-aware enhancements for high-performance vector search. The optimizations target I/O efficiency, SSD parallelism, query reordering, and locality-preserving indexing.

Branches

This repository has six branches:

  • main: Contains the README file.
  • pg_orig: Vanilla (original) PostgreSQL + pgvector.
  • pg_iou: Asynchronous I/O using io_uring.
  • pg_async_iou: Asynchronous I/O + overlapping execution.
  • pg_colocation: Partitioning + locality-aware insertion.
  • pg_async_iou_colocation: pg_async_iou + pg_colocation.
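
Each variant lives on its own branch, so trying one means cloning or checking out that branch. The following is a minimal sketch, not an official script. The repository URL is an assumption derived from the repository name, and `is_known_branch` and `print_clone_cmd` are hypothetical helpers introduced only for illustration.

```shell
#!/bin/sh
# Branch names exactly as listed in this README.
BRANCHES="main pg_orig pg_iou pg_async_iou pg_colocation pg_async_iou_colocation"

# Assumed repository URL (inferred from the repository name; adjust if it differs).
REPO_URL="https://github.com/FlashSQL/io-optimized-pgvector.git"

# Hypothetical helper: succeeds only if $1 is one of the listed branches.
is_known_branch() {
    for b in $BRANCHES; do
        [ "$b" = "$1" ] && return 0
    done
    return 1
}

# Hypothetical helper: prints the git command that clones a single branch.
print_clone_cmd() {
    if is_known_branch "$1"; then
        echo "git clone --branch $1 --single-branch $REPO_URL"
    else
        echo "unknown branch: $1" >&2
        return 1
    fi
}

# Example: show the command for the async I/O + overlapping execution variant.
print_clone_cmd pg_async_iou
```

Printing the command instead of running it keeps the sketch free of side effects; pipe the output to `sh` or copy it into a terminal to perform the clone.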

Prerequisites

To install the required dependencies, run:

apt install zlib1g-dev flex bison libreadline-dev gdb rsync liburing-dev

PostgreSQL & pgvector

This project builds upon two open-source projects: PostgreSQL and pgvector.

ANN-Benchmark

For benchmarking ANN search performance, we use ANN-Benchmark.

Datasets

The following datasets are used for evaluation:

How to Benchmark

To reproduce the benchmark experiments:

  1. Check out the desired branch, then set up PostgreSQL with pgvector following the installation guide.
  2. Prepare the datasets listed under Datasets.
  3. Run the ANN benchmarks to evaluate approximate nearest neighbor (ANN) search performance.
  4. Run a CREATE INDEX command to measure index build performance.
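
The index build measurement in step 4 can be sketched as follows. This is an illustration under stated assumptions, not the authors' benchmark script: the table name `items`, the column name `embedding`, and the database name `ann` are hypothetical, and the choice of an HNSW index with L2 distance is an assumption (pgvector also offers `ivfflat`). `index_build_sql` is a helper introduced here for illustration.

```shell
#!/bin/sh
# Hypothetical table and column names; replace with your own schema.
TABLE="items"
COLUMN="embedding"

# Emit a psql script that times an index build.
# \timing is a psql meta-command; hnsw and vector_l2_ops are provided by pgvector.
index_build_sql() {
    printf '%s\n' \
        '\timing on' \
        "CREATE INDEX ON $1 USING hnsw ($2 vector_l2_ops);"
}

# Print the script so it can be inspected before running it.
index_build_sql "$TABLE" "$COLUMN"

# To run it against a live server (assumed database name "ann"):
#   index_build_sql "$TABLE" "$COLUMN" | psql -d ann
```

With `\timing on`, psql reports the elapsed time after the statement completes, which gives the index build time for the branch under test.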
