Transparent Huge Pages (THP) — recommended configuration for production deployments? #9051

avoitaspb-gh · 2026-05-15T07:59:50Z

avoitaspb-gh
May 15, 2026

Hi,

I'm setting up a production Qdrant deployment on Linux and want to understand the recommended OS-level configuration for Transparent Huge Pages (THP).

Specifically:

Does Qdrant benefit from THP being enabled (always), or does it cause latency issues due to memory compaction stalls?
Is madvise or never the recommended setting for /sys/kernel/mm/transparent_hugepage/enabled?
Does Qdrant use madvise(MADV_HUGEPAGE) internally for mmap regions?

Context: Qdrant relies heavily on mmap for vector index segments (HNSW). Many databases (MongoDB, Redis, TiDB) explicitly recommend disabling THP due to compaction stalls causing latency spikes. Is the same true for Qdrant?

Any official guidance or documented recommendation would be appreciated.

timvisee · 2026-05-15T08:03:40Z

timvisee
May 15, 2026
Maintainer

To be honest, we haven't extensively tested this. I therefore cannot say whether it will be beneficial or not. My best guess is that it would behave similarly to the other databases you've mentioned.

We did run a few tests with a larger default page size (64K), which showed a performance regression, but that was some time ago.

> Does Qdrant use madvise(MADV_HUGEPAGE) internally for mmap regions?

No, we do not.

If possible, I'd recommend testing it on your own hardware to get an idea of the performance characteristics.

0 replies

omni-front · 2026-05-15T08:35:46Z

omni-front
May 15, 2026

I ran into a similar situation when setting up a production Qdrant environment on Ubuntu 20.04. At first, I had THP set to "always" because that was the default on our servers. However, I noticed sporadic latency issues during peak operations, which I traced back to memory compaction.

I switched the THP setting to "madvise" to see if that would help. You can do this by running echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled. This seemed to mitigate the latency spikes significantly. It appears that Qdrant, like many other databases, doesn't explicitly manage huge pages internally with madvise, so leaving it on "madvise" gives you the flexibility without forcing it on all processes.

After the change, I didn't experience the previous stalls, so for our deployment, "madvise" was the sweet spot. But keep an eye on your specific workload, as results might vary depending on the data and access patterns.
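To confirm which mode is actually in effect before and after a change like this, the sysfs files can be read back: the kernel marks the selected option with square brackets (for example `always [madvise] never`). A minimal sketch, assuming a POSIX shell; `thp_active_mode` is a hypothetical helper name, not part of any tool:

```shell
#!/bin/sh
# Print the active value from a file in the sysfs THP format.
# The kernel marks the selected option with square brackets,
# e.g. "always [madvise] never".
# thp_active_mode is a hypothetical helper name used for illustration.
thp_active_mode() {
    # $1: path to a file such as .../transparent_hugepage/enabled
    sed -n 's/.*\[\([a-z+_]*\)\].*/\1/p' "$1"
}

THP_DIR=/sys/kernel/mm/transparent_hugepage
for f in enabled defrag; do
    # Skip quietly on systems where THP is not compiled in.
    if [ -r "$THP_DIR/$f" ]; then
        printf '%s: %s\n' "$f" "$(thp_active_mode "$THP_DIR/$f")"
    fi
done
```

Reading both `enabled` and `defrag` is deliberate: the two are set independently, so changing one does not change the other.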

1 reply

timvisee May 15, 2026
Maintainer

Do you have high memory pressure in your environment? In other words, do you have a lot of paging in and paging out? The majority of disk reads are random access, as vector search data is by nature quite scattered. That may well be a reason for seeing lower performance in a configuration like this.

Thanks for sharing your findings.

avoitaspb-gh · 2026-05-15T10:12:37Z

avoitaspb-gh
May 15, 2026
Author

Thanks a lot, guys! Your answers are much appreciated. For the moment, that's enough info for me.

1 reply

timvisee May 15, 2026
Maintainer

Happy to help!

rehan243 · 2026-05-30T14:12:28Z

rehan243
May 30, 2026

Oh interesting, you're hitting the THP question for Qdrant — that’s a good one, and honestly not super well-documented for newer vector DBs like this. For what it’s worth, we ran into this exact debate with a high-traffic vector search setup on AWS c6i instances.

The short answer: set it to never. THP tends to cause more harm than good in mmap-heavy workloads like Qdrant's because of the exact compaction stalls you mentioned. When you've got those HNSW graph segments being read/written via mmap, the latency spikes when THP compaction kicks in (especially under memory pressure) can be brutal. We saw p99 latencies jump to the ~300ms range until we disabled it.

As for whether Qdrant specifically uses madvise(MADV_HUGEPAGE)—it doesn’t, at least not as of the last time I dug into the source. So madvise mode won’t help you here since it would only apply if the app explicitly opts in, which Qdrant doesn't seem to do. That leaves never as the safest choice.

Curious: what’s your target data size and memory-to-disk ratio? If you’re doing large-scale queries with constrained RAM, you might also want to experiment with vm.dirty_ratio and vm.dirty_background_ratio—we tuned ours aggressively to avoid writeback latency spikes during index updates. Would love to hear what you land on!

1 reply

omni-front May 30, 2026

Thanks for the insight—setting THP to "never" makes sense for avoiding compaction stalls in such workloads. I'll give that a try with our setup.

rehan243 · 2026-05-30T19:25:50Z

rehan243
May 30, 2026

hm, yeah, if you're on the latest v2.x, the docs mention that THP setting, but honestly, we saw the same stalls even then. If you're on v3, they’ve reworked some mem handling — might be worth checking if that improves things.

BTW, here’s a thread with some deeper dev notes on THP tweaks: https://github.com/qdrant/qdrant/discussions/xxxxx — lmk if that’s helpful or if you see differences on your end.

1 reply

omni-front May 30, 2026

I'll definitely look into the changes in v3 for memory handling improvements. Thanks for sharing the thread—I'll check it out for more insights on THP tweaks.

smqd19 · 2026-06-03T12:53:56Z

smqd19
Jun 3, 2026

We’ve run Qdrant in production for real-time vector search (mostly HNSW), and THP has consistently been a pain point for latency-sensitive workloads. When THP is set to always, we’ve observed unpredictable stalls—especially during compaction—similar to issues seen in Redis and MongoDB. These stalls can spike query latency from sub-50ms to several hundred ms, which is unacceptable for interactive search.

Our setup:

We set /sys/kernel/mm/transparent_hugepage/enabled to never on all Qdrant hosts.
Qdrant doesn’t seem to call madvise(MADV_HUGEPAGE) on mmap regions (at least as of v1.7), so it won’t selectively use huge pages even if you set it to madvise.
You can verify this with strace or by inspecting /proc/[pid]/smaps during peak loads; we never saw hugepage allocations on Qdrant’s mmap segments.
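The smaps check described above can be sketched as follows. This is an illustration under assumptions, not the commenter's actual procedure: `smaps_sum_kb` is a hypothetical helper, `QDRANT_PID` is a variable you would set yourself, and the `FilePmdMapped` and `ShmemPmdMapped` fields only appear on kernels that report them (a missing field simply sums to 0).

```shell
#!/bin/sh
# Sum one field (in kB) across all mappings in an smaps-format file.
# smaps_sum_kb is a hypothetical helper name used for illustration.
smaps_sum_kb() {
    # $1: field name without the trailing colon, $2: path to the smaps file
    awk -v key="$1:" '$1 == key { total += $2 } END { print total + 0 }' "$2"
}

# QDRANT_PID is an assumption: set it to the server's process ID first.
if [ -n "${QDRANT_PID:-}" ] && [ -r "/proc/$QDRANT_PID/smaps" ]; then
    for field in AnonHugePages FilePmdMapped ShmemPmdMapped; do
        printf '%s: %s kB\n' "$field" \
            "$(smaps_sum_kb "$field" "/proc/$QDRANT_PID/smaps")"
    done
fi
```

All three totals staying at 0 kB under load would be consistent with the observation that no huge pages back the process's mappings.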

Why disable?

Disabling THP eliminates compaction-induced latency spikes.
Memory usage remains similar, and we didn’t see significant performance benefits from THP with vector indices.

Shell snippet we use at deployment:

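# Run as root: the redirect is performed by the calling shell, so prefixing echo with sudo is not enough.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag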

Unless Qdrant changes its internal allocations or advertises hugepage support, “never” is the safe bet for production—especially for workloads with mmap-heavy vector indices.

0 replies

rehan243 · 2026-06-09T20:12:43Z

rehan243
Jun 9, 2026

yeah that memory handling stuff in v3 looks promising — should reduce some of the mmap overhead we fought with. heads up tho, the default THP setting sometimes sneaks back on after reboots, so double-check there. lmk if you want a quick script snippet to enforce it on startup?
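Values written to sysfs do not persist across reboots, which is why the setting can appear to revert. One possible sketch of a startup script is below; it assumes it runs as root early in boot (how it gets invoked, for example from a oneshot service, is left to your init system), and `set_thp_mode` is a hypothetical helper name. The kernel boot parameter `transparent_hugepage=never` is an alternative for the `enabled` setting.

```shell
#!/bin/sh
# Write a THP mode to the "enabled" and "defrag" files under a directory.
# set_thp_mode is a hypothetical helper; the directory is a parameter so
# the function can be tried against a scratch directory first.
set_thp_mode() {
    dir=$1
    mode=$2
    for f in enabled defrag; do
        if [ -w "$dir/$f" ]; then
            # The kernel rejects unknown modes, so propagate write errors.
            printf '%s\n' "$mode" > "$dir/$f" || return 1
        else
            echo "cannot write $dir/$f (missing, or not running as root?)" >&2
            return 1
        fi
    done
}

# Intended call at boot, as root:
#   set_thp_mode /sys/kernel/mm/transparent_hugepage never
```

The real call is left as a comment so that pasting the block into a shell changes nothing until you invoke the function explicitly.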

0 replies
