No QPS Improvement When Reducing Number of Segments from 16 to 2 Segments #1659

@MostafaDabbagh

Hi Qdrant team,

I'm running performance benchmarks on Qdrant using a LAION dataset with 10 million vectors. I wanted to report an issue (or unexpected behavior) regarding QPS when reducing the number of segments.

Benchmark Setup:
Qdrant version: qdrant:v1.14.0-gpu-nvidia docker image
Deployment: Docker container with access to 16 CPU cores and 64 GB RAM
Distance metric: Cosine
Quantization: Not used
Search params: top_k = 10
Client load: 1000 queries using Python's ThreadPoolExecutor with 10 threads
Index configuration: HNSW index with m=32; tested with 16 segments and with 2 segments
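
For reference, the client load described above can be sketched roughly as follows. This is a minimal sketch, not the exact benchmark script: `run_benchmark` and `fake_search` are hypothetical names, and `fake_search` is a placeholder standing in for the real Qdrant search call with top_k = 10.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def run_benchmark(
    search: Callable[[Sequence[float]], object],
    queries: Sequence[Sequence[float]],
    num_threads: int = 10,
) -> dict:
    """Send every query through `search` from a fixed-size thread pool
    and report overall QPS plus mean per-request latency."""

    def timed(query: Sequence[float]) -> float:
        # Time a single blocking request as seen by one client thread.
        start = time.perf_counter()
        search(query)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        latencies = list(pool.map(timed, queries))
    wall_seconds = time.perf_counter() - wall_start

    return {
        "queries": len(latencies),
        "qps": len(latencies) / wall_seconds,
        "mean_latency_s": sum(latencies) / len(latencies),
    }


def fake_search(query: Sequence[float]) -> list:
    # Placeholder (assumption): replace with the actual Qdrant search call.
    time.sleep(0.001)
    return []


if __name__ == "__main__":
    # 1000 queries from 10 threads, matching the setup described above.
    dummy_queries = [[0.0] * 4 for _ in range(1000)]
    print(run_benchmark(fake_search, dummy_queries, num_threads=10))
```

Each thread sends one request at a time and waits for the response, so at most 10 requests are in flight at any moment.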

Observations:
With 16 segments, the system achieves ~200 QPS, and all 16 CPU cores are utilized during the benchmark.
With 2 segments, the QPS remains roughly the same (~200), but only 2–4 CPU cores are utilized during the benchmark.
I also tested 8 segments, and the QPS again stayed at roughly the same level.
I tested with QDRANT__STORAGE__PERFORMANCE__ASYNC_SCORER=true as well, but observed no improvement.
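
As a sanity check on these numbers: assuming the 10 client threads are always busy and each one waits for its response before sending the next query (a closed-loop client), Little's law relates the quantities as concurrency = QPS × mean latency. The arithmetic below is only a sketch under that assumption, and `implied_mean_latency` is a hypothetical helper name.

```python
def implied_mean_latency(concurrency: int, qps: float) -> float:
    """Little's law for a closed-loop client: latency = concurrency / throughput."""
    return concurrency / qps


# 10 client threads and ~200 QPS, as reported above.
latency_s = implied_mean_latency(concurrency=10, qps=200.0)
print(f"implied mean latency: {latency_s * 1000:.0f} ms")
```

If the assumption holds, ~200 QPS with 10 threads corresponds to roughly 50 ms per request in every segment configuration tested, so repeating the run with more client threads might help separate a client-side limit from a server-side one.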

Concern:
Although the Qdrant documentation and community guidance suggest that fewer segments should improve performance, in my case reducing the segment count produced no gain in QPS.
In addition, with fewer segments most of the CPU cores sit idle during the benchmark, which suggests a bottleneck somewhere.

Questions:
Is this behavior expected under the current architecture?
Can you recommend a way to use all available CPU cores during searches in order to reach a higher QPS?

Thanks for your great work on Qdrant!
