Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Transitive Paths using binary search instead of a hash map #1313

Merged
merged 98 commits into from
Apr 17, 2024

Conversation

JoBuRo
Copy link
Contributor

@JoBuRo JoBuRo commented Mar 25, 2024

The previous implementation of transitive paths first converted the subtree (the graph which is traversed) into a HashMap<HashSet<Id>> that represented the adjacency list of the graph. This took a lot of time for large graphs even if the actual transitive path computation was cheap because only a few starting nodes were considered.
We now use the sorted subtree as a graph directly using binary search. This is much faster especially since the subtree is often already presorted as it is a single IndexScan.

Currently both implementations are present in the Code. The active implementation can be switched by using a runtime parameter. The old implementation is kept around mostly for benchmarking purposes, as the new implementation should always be cheaper.

JoBuRo and others added 30 commits January 30, 2024 13:32
- Renamed nrows, ncols and nvals
- Moved code to cpp file
- Nullptr is now valid for GrbMatrix::matrix_
- Renamed copy() -> clone()
- Functions that create a GrbMatrix now use the unique_ptr directly
- Remove nextIndex_
- Made internal datastructures private
- isContained -> contains
- const Id& -> Id
- Remove C style arrays
- Add checkCancellation
Removed unnecessary unique_ptr and make_optionals
- Throw exception on GrB_NO_VALUE
- unique_ptr in GrbMatrix uses custom delter
- Simplified GrbMatrixTest
Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much cleaner now:)

In addition to the concrete suggestions:
Some of the coverage gaps can easily be fixed (corner cases in the implementation, the error about empty paths), especially since they concern code that you have written/ and or touched.

Comment on lines +50 to +54
auto range = std::ranges::equal_range(startIds_, node);

auto startIndex = std::distance(startIds_.begin(), range.begin());

return targetIds_.subspan(startIndex, range.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto range = std::ranges::equal_range(startIds_, node);
auto startIndex = std::distance(startIds_.begin(), range.begin());
return targetIds_.subspan(startIndex, range.size());
return std::ranges::equal_range(startIds_, node);

Should work, as everything is templated, you don't need to know the type, and it just needs to be iterable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this does not work. The result of the code you provided are the start ids, which match the node and not the target ids.

Example:
startIds_: 1, 1, 2
targetIds_: 2, 3, 4

equal_range(startIds_, 1) returns {1, 1}, but the correct result would be {2, 3}.

Another option would be to write a wrapper for std::pair<Id, Id> with a custom comparison function. And it would be necessary to convert the two std::span<Id> into a single std::span<std::pair<Id, Id>>. But I would argue that this solution is unnecessarily complex, considering the same effect can be achieved with three lines of code.

Copy link
Member

@joka921 joka921 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost done:

  • please format the code
  • 4 of the sonarcloud complaints are easy to fix (use =default, use empty() and 2* Use braces.

Otherwise I hope that we can merge this tomorrow:)

Comment on lines 18 to 26
auto successors(const Id node) const {
auto iterator = map_.find(node);
if (iterator == map_.end()) {
return std::vector<Id>();
}
std::vector<Id> result(iterator->second.begin(), iterator->second.end());
return result;
}
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't copy to a vector, but return a const Set&.
You can make an empty set another member of the HashMapWrapper, so that you have something to return in the case of not found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I solved this, but I was not sure where to get the ad_utility::AllocatorWithLimit<Id>. There may be a smarter solution.

Copy link

Quality Gate Passed Quality Gate passed

Issues
10 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@joka921 joka921 changed the title Improve transitive path with binary search Implement Transitive Paths using binary search instead of a hash map Apr 17, 2024
@joka921 joka921 merged commit 08ca01b into ad-freiburg:master Apr 17, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants