Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Version vector: compute locations only once during commits. #11924

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

dlambrig
Copy link
Contributor

@dlambrig dlambrig commented Feb 4, 2025

During commits with version vector enabled, compute location list only once on the resolver. Recalculating a second time could generate a different random number, hence a different set of locations. The location list is generted with a call to getPushLocations, which chooses random locations if the total number of tLogs mod the replica count results in collisions.

Tested by loading commits from branch version-vector-disk-queue and observing getPushLocations is called in both the resolver and on the CP in assignMutationsToStorageServers. Observed different locations chosen without the PR, resulting in data not sent to tLogs when it should have been. Added an assert if data was generated for a location but was not sent to a tLog to catch any such future cases.

Joshua 20250204-185355-dlambrig-83a293ba8c4a1d87

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

…y once, as recalcuating could

generate a different random number, hence a different set of locations.
@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: c54c727
  • Duration 0:21:25
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: c54c727
  • Duration 0:39:38
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: c54c727
  • Duration 0:47:22
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: c54c727
  • Duration 0:54:57
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: c54c727
  • Duration 1:02:15
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: c54c727
  • Duration 1:19:12
  • Result: ❌ FAILED
  • Error: Error while executing command: if python3 -m joshua.joshua list --stopped | grep ${ENSEMBLE_ID} | grep -q 'pass=10[0-9][0-9][0-9]'; then echo PASS; else echo FAIL && exit 1; fi. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@dlambrig dlambrig requested review from jzhou77 and sbodagala February 4, 2025 18:15
@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: c54c727
  • Duration 1:33:00
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@dlambrig dlambrig marked this pull request as draft February 4, 2025 18:52
@dlambrig dlambrig marked this pull request as ready for review February 4, 2025 18:55
@@ -842,6 +845,15 @@ struct LogPushData : NonCopyable {
Tag chooseRouterTag() {
return savedRandomRouterTag.present() ? savedRandomRouterTag.get() : logSystem->getRandomRouterTag();
}

void getPushLocations(std::vector<Tag> const& tags, std::vector<int>& locations, bool allLocations = false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not using tags, which can be a much smaller set than the set storedLocations. So this doesn't seem right.

The idea of logSystem->getPushLocations() is to find the tlogs that should receive the tags. We don't want to send the message to a tlog that doesn't have to receive it.

Copy link
Contributor Author

@dlambrig dlambrig Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we may send the message to more tLogs than we need to, which is suboptimal. I will think of a way to optimize that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I thought more about it, this is called by writeTypedMessage(), which is called almost on every mutation, e.g., in writeMutation(). So I think we need to optimize this to avoid sending every mutation to all tlogs.

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux CentOS 7

  • Commit ID: 3c7b5cd
  • Duration 0:20:21
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 3c7b5cd
  • Duration 0:38:18
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 3c7b5cd
  • Duration 0:46:22
  • Result: ❌ FAILED
  • Error: Error while executing command: ctest -j ${NPROC} --no-compress-output -T test --output-on-failure. Reason: exit status 8
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux CentOS 7

  • Commit ID: 3c7b5cd
  • Duration 0:57:35
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 3c7b5cd
  • Duration 1:03:41
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux CentOS 7

  • Commit ID: 3c7b5cd
  • Duration 1:20:40
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux CentOS 7

  • Commit ID: 3c7b5cd
  • Duration 1:24:53
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@@ -853,8 +865,7 @@ void LogPushData::writeTypedMessage(T const& item, bool metadataMessage, bool al
for (auto& tag : next_message_tags) {
prev_tags.push_back(tag);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between "prev_tags" and "tags"? And, why is "tags" being used in place of "prev_tags" in the other/later commit?

Copy link
Contributor

@sbodagala sbodagala Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, are we sure that the "prev_tags"/"tags" that is being used here is exactly the same set of tags that the resolver uses too? Or, is it possible that meta mutations (like, move_keys) can cause this set to be different on the resolver and the proxy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between "prev_tags" and "tags"? And, why is "tags" being used in place of "prev_tags" in the other/later commit?

They are the same. I can remove tags and just use prev_tags as it is a class member.

Copy link
Contributor Author

@dlambrig dlambrig Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, are we sure that the "prev_tags"/"tags" that is being used here is exactly the same set of tags that the resolver uses too? Or, is it possible that meta mutations (like, move_keys) can cause this set to be different on the resolver and the proxy?

Recall that the reason we moved metadata mutation processing to the Resolver was to handle move_keys and similar cases. We did this in order to calculate the tpcv locations out of the complete set of tags for the batch. Version vector will not work unless that set is complete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants