[server] Allocated a dedicated thread pool for ingestion DB lookup #1525

Conversation

gaojieliu
Contributor

Based on a prod experiment, the SSD being used does not perform well when parallel processing is enabled for the AA/WC workload: the database lookup concurrency ends up roughly the same as the parallel processing concurrency, and we saw higher lookup latency with high CPU wait.
This PR introduces a separate thread pool for ingestion database lookups (value and RMD) so that lookup concurrency can be controlled independently of the parallel processing concurrency. Based on a canary test, a lower DB lookup concurrency reduces CPU wait while delivering slightly better overall throughput.

New Config:
server.aa.wc.ingestion.storage.lookup.thread.pool.size: default 4
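
As a rough illustration of the change (a minimal sketch only; the class and method names below are hypothetical, not the actual Venice implementation), value/RMD lookups are routed through their own fixed-size pool whose size comes from the new config:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Minimal sketch: a dedicated fixed-size pool for ingestion DB lookups,
// sized by server.aa.wc.ingestion.storage.lookup.thread.pool.size (default 4).
// Names are illustrative; this is not the actual Venice code.
public class IngestionLookupPoolSketch {
  private final ExecutorService lookupPool;

  public IngestionLookupPoolSketch(int lookupThreadPoolSize) {
    // Lookup concurrency is bounded by this pool, independently of the
    // parallel-processing pool that drives AA/WC message handling.
    this.lookupPool = Executors.newFixedThreadPool(lookupThreadPoolSize);
  }

  public Future<byte[]> lookupValueAsync(byte[] key) {
    return lookupPool.submit(() -> readFromRocksDB(key));
  }

  private byte[] readFromRocksDB(byte[] key) {
    return new byte[0]; // placeholder for the actual value/RMD lookup
  }
}
```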

How was this PR tested?

CI

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

gaojieliu force-pushed the make_ingestion_lookup_concurrency_configurable branch from 1e89de6 to a23ae99 on February 12, 2025 17:25
@sushantmane
Collaborator

I haven’t reviewed this PR, but I want to share an idea we might want to consider. IMO, we should have a single common thread pool for RocksDB reads, used for both the read and write paths. This would allow better concurrency control: reads for writes can be made to have lower priority than user reads. This way, we can manage read concurrency more effectively without impacting regular reads.

@gaojieliu
Contributor Author

> I haven’t reviewed this PR, but I want to share an idea we might want to consider. IMO, we should have a single common thread pool for RocksDB reads, used for both the read and write paths. This would allow better concurrency control: reads for writes can be made to have lower priority than user reads. This way, we can manage read concurrency more effectively without impacting regular reads.

The main goal of this PR is not to solve the conflict between reads and writes, but the write path only.
There is another initiative, called dynamic throttling, to address the priority issue, since the read path can be affected by the whole ingestion pipeline, not just ingestion reads.
Even without this enhancement, the read latency with batch processing is comparable to or slightly better than that of other nodes; this PR is mainly to address the high CPU wait issue.

@sushantmane
Collaborator

> The main goal of this PR is not to solve the conflict between reads and writes, but the write path only.

I understand that. However, these concerns are not mutually exclusive. We need to take a comprehensive look at the storage layer's IO impact. If we focus solely on the write path without considering the read path, the overall improvement may still be suboptimal.

These are just my thoughts; this isn't a blocker for this PR.

@gaojieliu
Contributor Author

> > The main goal of this PR is not to solve the conflict between reads and writes, but the write path only.
>
> I understand that. However, these concerns are not mutually exclusive. We need to take a comprehensive look at the storage layer's IO impact. If we focus solely on the write path without considering the read path, the overall improvement may still be suboptimal.
>
> These are just my thoughts; this isn't a blocker for this PR.

Certainly, but another important aspect is separation of concerns: the different layers need to be relatively independent, and there will be a coordination layer that orchestrates reads and writes, which is the dynamic throttling I was referring to in the previous comment.

@sushantmane
Collaborator

> Certainly, but another important aspect is separation of concerns: the different layers need to be relatively independent, and there will be a coordination layer that orchestrates reads and writes, which is the dynamic throttling I was referring to in the previous comment.

We don't need a coordination layer for the idea I was referring to.

@gaojieliu
Contributor Author

> > Certainly, but another important aspect is separation of concerns: the different layers need to be relatively independent, and there will be a coordination layer that orchestrates reads and writes, which is the dynamic throttling I was referring to in the previous comment.
>
> We don't need a coordination layer for the idea I was referring to.

I think we do, since the read/write paths compete not just for IO resources but for all others as well, such as CPU, GC, and network, and that is the holistic view we should take instead of this narrow scope.
One example is that re-bootstrapping can affect read perf even when all the writes are full PUTs.

@sixpluszero
Contributor

> I haven’t reviewed this PR, but I want to share an idea we might want to consider. IMO, we should have a single common thread pool for RocksDB reads, used for both the read and write paths. This would allow better concurrency control: reads for writes can be made to have lower priority than user reads. This way, we can manage read concurrency more effectively without impacting regular reads.

Can you share how you are going to control the priority of write-path reads and regular reads within a single pool? My understanding is that unless you control/throttle the issuer of the read (the consumer thread), you cannot directly control that, right? And if that is the case, then it can also be achieved with a separate pool.

@sushantmane
Collaborator

sushantmane commented Feb 12, 2025

> Can you share how you are going to control the priority of write-path reads and regular reads within a single pool? My understanding is that unless you control/throttle the issuer of the read (the consumer thread), you cannot directly control that, right? And if that is the case, then it can also be achieved with a separate pool.

All read requests to RocksDB will be routed through the storage executor thread pool, which will be updated to use a priority-based queue for scheduling tasks. Requests from the read path will be assigned a higher priority than those triggered by ingestion, with additional safeguards to prevent ingestion reads from being indefinitely starved. This approach implicitly throttles writes, as ingestion will be blocked until its queued reads finish.
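
To make the scheme above concrete, here is a rough sketch of one way it could work (all names, the pool size, and the 100 ms cap are assumptions for illustration, not actual Venice code): each task gets a deadline fixed at enqueue time, user reads are due immediately, and ingestion reads are due after a max delay, so ordering by earliest deadline gives user reads priority while the cap bounds how long an ingestion read can starve.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: one shared pool for all RocksDB reads, with
// user-read priority and a max-delay cap so ingestion reads cannot starve.
final class PrioritizedRead implements Runnable {
  static final long INGESTION_MAX_DELAY_MS = 100; // assumed starvation cap

  final long deadlineMs;
  private final Runnable work;

  PrioritizedRead(boolean userRead, Runnable work) {
    long now = System.currentTimeMillis();
    // User reads are due immediately; ingestion reads may wait, but only
    // up to the cap, after which they outrank newly arrived user reads.
    this.deadlineMs = userRead ? now : now + INGESTION_MAX_DELAY_MS;
    this.work = work;
  }

  @Override
  public void run() {
    work.run();
  }
}

class SharedStorageReadExecutor {
  // Earliest-deadline-first ordering: plain priority for fresh tasks,
  // time-based capping for aged ingestion reads.
  private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
      8, 8, 0L, TimeUnit.MILLISECONDS,
      new PriorityBlockingQueue<>(
          11, Comparator.comparingLong((Runnable r) -> ((PrioritizedRead) r).deadlineMs)));

  void submitUserRead(Runnable work) {
    pool.execute(new PrioritizedRead(true, work));
  }

  void submitIngestionRead(Runnable work) {
    pool.execute(new PrioritizedRead(false, work));
  }
}
```

Because each task's deadline is fixed when it is enqueued, the ordering stays consistent over time, which a PriorityBlockingQueue requires.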

@sixpluszero
Contributor

> All read requests to RocksDB will be routed through the storage executor thread pool, which will be updated to use a priority-based queue for scheduling tasks. Requests from the read path will be assigned a higher priority than those triggered by ingestion, with additional safeguards to prevent ingestion reads from being indefinitely starved. This approach implicitly throttles writes, as ingestion will be blocked until its queued reads finish.

It can be promising, but I have a concern about starving write-path reads. Maybe if we configure the thread pool to be large enough, we can avoid that.

@sushantmane
Collaborator

sushantmane commented Feb 12, 2025

> It can be promising, but I have a concern about starving write-path reads. Maybe if we configure the thread pool to be large enough, we can avoid that.

Could you please elaborate? Do you think time-based capping (or a max delay) won't prevent starvation?

gaojieliu merged commit bb092b9 into linkedin:main on Feb 12, 2025
58 checks passed