Skip to content

Conversation

@coltmcnealy-lh
Copy link
Member

This backwards-compatible change puts all getable's such as TaskRun's and NodeRun's for a single WfRun onto the same prefix. Doing this will put most of the keys for a WfRun onto the same block in RocksDB (or in adjacent blocks), which optimizes I/O both in compaction and in cold reads.

@coltmcnealy-lh
Copy link
Member Author

I need to do some work on the objectIdPrefixScan() logic to support this change.

This backwards-compatible change puts all getable's such as TaskRun's and NodeRun's for a single WfRun onto the same prefix. Doing this will put most of the keys for a WfRun onto the same block in RocksDB (or in adjacent blocks), which optimizes I/O both in compaction and in cold reads.
@coltmcnealy-lh
Copy link
Member Author

Setup:

  • three servers
  • 200 tasks per second
  • 24-hour retention

This is the Disk I/O when retention starts with this change:
image

THis is the Disk I/O when retention starts before this change:
image

@coltmcnealy-lh
Copy link
Member Author

Canary tail latencies also seem to improve a little bit but it's slight. Here's the before:
image

And here's the after:
image

@coltmcnealy-lh
Copy link
Member Author

@l4crito Great work! I tested your changes since I left it off and it all works.

@l4crito l4crito marked this pull request as ready for review November 6, 2025 17:25
Copy link
Member Author

@coltmcnealy-lh coltmcnealy-lh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. But technically I opened this PR so I can't approve it myself.


String endKey = req.boundedObjectIdScan.getEndObjectId() + "~";
String endKey =
StoredGetable.getRocksDBKey(req.boundedObjectIdScan.getStartObjectId(), req.getObjectType()) + "~";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we are using getStartObjectId() instead of getEndObjectId()? It might work but still it seems not so kosher.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there was a reason but i don't remember exactly what it was, let me debug really quick

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that getEndObjectId() will return something like wfrunId~ and when we build the prefix from it, it is gonna be /tenant/storable/wrg/wfrunId~/storable/gettable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you create a follow-up ticket with type Refactor to address this? We should properly refactor the Internal Scan protobuf to understand the concept of grouping WfRun's. I think that would be a cleaner implementation; but it can wait for a while longer I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A necessary sacrifice 🫡

return getFullStoreKey(getType(), getStoreKey());
if (getGroupingWfRunId().isPresent()) {
if (!(this instanceof StoredGetable)) {
throw new NotImplementedException();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused, how does this work with WfRunStoredInventoryModel? Are those objects stored in a different key prefix?

@l4crito l4crito merged commit bfd8f6e into master Nov 13, 2025
28 checks passed
@l4crito l4crito deleted the perf/index-locality branch November 13, 2025 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants