No QPS Improvement When Reducing Number of Segments from 16 to 2

Hi Qdrant team,

I'm running performance benchmarks on Qdrant using a LAION dataset with 10 million vectors. I wanted to report an issue (or unexpected behavior) regarding QPS when reducing the number of segments.

Benchmark Setup:
Qdrant version: qdrant:v1.14.0-gpu-nvidia docker image
Deployment: Docker container with access to 16 CPU cores and 64 GB RAM
Distance metric: Cosine
Quantization: Not used
Search params: top_k = 10
Client load: 1000 queries using Python's ThreadPoolExecutor with 10 threads
Index configuration: HNSW index with m=32; tested with both 16 segments and 2 segments
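
For readers who want to reproduce a comparable client load, here is a minimal sketch of a thread-pool load generator. It is not the exact script used for the numbers in this report. `run_benchmark` and `fake_search` are hypothetical names; `fake_search` is a stand-in that only sleeps and should be replaced with a real top_k = 10 search call against the collection.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(search_fn, queries, num_threads=10):
    """Run search_fn over all queries on a thread pool.

    Returns (qps, latencies), where latencies holds per-query seconds.
    """
    def timed(query):
        start = time.perf_counter()
        search_fn(query)
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map keeps at most num_threads queries in flight at once
        latencies = list(pool.map(timed, queries))
    elapsed = time.perf_counter() - wall_start
    return len(queries) / elapsed, latencies


def fake_search(query):
    # Hypothetical stand-in for a real search request (assumption:
    # replace this with the actual client call used in the benchmark).
    time.sleep(0.001)


if __name__ == "__main__":
    qps, latencies = run_benchmark(fake_search, range(1000), num_threads=10)
    avg_ms = 1000 * sum(latencies) / len(latencies)
    print(f"QPS: {qps:.1f}, average latency: {avg_ms:.2f} ms")
```

Recording per-query latency alongside QPS makes it easier to tell whether a change in segment count affects the latency of a single query or only the CPU usage.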

Observations:
With 16 segments, the system achieves ~200 QPS, and all 16 CPU cores are utilized during the benchmark.
With 2 segments, the QPS remains roughly the same (~200), but only 2–4 CPU cores are utilized during the benchmark.
I also tested 8 segments, but the QPS remained the same.
I also tested with QDRANT__STORAGE__PERFORMANCE__ASYNC_SCORER=true, but observed no improvement.

Concern:
Although the Qdrant documentation and community guidance suggest that fewer segments should improve performance, in my case reducing the segment count produced no gain in QPS.
In addition, the CPU is not fully utilized with fewer segments, which suggests a bottleneck somewhere.
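
As a rough sanity check rather than a diagnosis, the reported numbers can be run through Little's law (throughput = concurrency / latency). This assumes a closed-loop client in which each of the 10 threads sends its next query only after the previous one returns. The function names below are hypothetical helpers for the arithmetic.

```python
def implied_latency_s(concurrency, qps):
    """Average per-query latency implied by Little's law (closed-loop client)."""
    return concurrency / qps


def qps_ceiling(concurrency, latency_s):
    """Highest QPS a closed-loop client can reach at a given per-query latency."""
    return concurrency / latency_s


if __name__ == "__main__":
    latency = implied_latency_s(concurrency=10, qps=200)
    print(f"Implied average latency: {latency * 1000:.0f} ms")
    # If per-query latency stayed the same, more client threads would be
    # needed to push QPS higher (assumption: latency does not grow with load).
    print(f"QPS ceiling with 32 threads: {qps_ceiling(32, latency):.0f}")
```

Under that assumption, 10 threads at ~200 QPS imply roughly 50 ms per query. If a single query does not get faster with fewer segments, QPS from 10 client threads would stay near 200 regardless of how many cores sit idle.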

Questions:
Is this behavior expected under the current architecture?
Can you recommend a way to use all available CPU cores during search and achieve a higher QPS?

Thanks for your great work on Qdrant!

No QPS Improvement When Reducing Number of Segments from 16 to 2 Segments #1659
