-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve transitive path computation with GraphBLAS #1252
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made a first pass over the Matrix class, haven't looked at the Transitive Path and the tests yet.
@@ -41,6 +41,7 @@ jobs: | |||
run: | | |||
brew install llvm@16 | |||
brew install conan@2 | |||
brew install suite-sparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This currently doesn't seem to work,
I will try to figure out how to do this on our Mac.
What we also could do is to compile GraphBLAS alongside QLever, but only if
it is explicitly requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(of course this would increase build times, I'd have to check if this is feasible, and probably it would be problematic for CI builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass over the rest!
src/engine/TransitivePath.h
Outdated
bool isContained(Id id) { return idMap_.contains(id); } | ||
|
||
size_t addId(Id id) { | ||
if (!idMap_.contains(id)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can make this nicer and more efficient by using
- idMap_.try_emplace().
- The fact the
nextIndex_
is always equal toidMap_.size()
(the explicit data member is not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and equal to indexMap
.size).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I'm using the try_emplace as you intended, but I think still need to check if idMap_
contains id
. If the id
is not contained I need to add to the vector indexMap_
.
src/engine/TransitivePath.h
Outdated
ad_utility::HashMap<Id, size_t> idMap_{}; | ||
|
||
std::vector<Id> indexMap_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make them private and include memory limits via the AllocatorWithMemoryLimit
(can be applied everywhere).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried using ad_utility::HashMapWithMemoryLimit<Id, size_t>
, but on compile I get:
error: call to deleted constructor of 'allocator_type' (aka 'ad_utility::AllocatorWi thLimit<std::pair<const ValueId, unsigned long>>')
src/engine/TransitivePath.cpp
Outdated
rowIndex++; | ||
table(resultRowIndex, startSideCol) = startNode; | ||
table(resultRowIndex, targetSideCol) = targetNode; | ||
resultRowIndex++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can have a look at the possibilities of GraphBlas to check what is the most efficient way to get a set of rows out.
src/engine/TransitivePath.cpp
Outdated
size_t targetIndex) const { | ||
GrbMatrix transformer = GrbMatrix(hull.ncols(), hull.ncols()); | ||
transformer.setElement(targetIndex, targetIndex, true); | ||
return hull.multiply(transformer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't actually this the solution to get multiple rows at once if you have multiple start indices? Just with a transformer
that has more than one entry?
That way you don't have that many getRow()
calls, but a single "getAllRows" followed by an extraction.
src/engine/TransitivePath.cpp
Outdated
for (size_t i = 0; i < numRows; i++) { | ||
auto startId = startCol[i]; | ||
auto targetId = targetCol[i]; | ||
auto startIndex = mapping.addId(startId); | ||
auto targetIndex = mapping.addId(targetId); | ||
|
||
rowIndices.push_back(startIndex); | ||
colIndices.push_back(targetIndex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since both your input and output format are column based, write a lambda that handles one of the columns, and call it twice. (Maybe this can even be a member function addAll
of the mapping.
And in general: If you know the needed size of a vector in advance, you should reserve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An idea to increase the visibility of the fallback.
src/engine/TransitivePath.h
Outdated
// A small helper function: Insert the `value` to the set at `map[key]`. | ||
// As the sets all have an allocator with memory limit, this construction is a | ||
// little bit more involved, so this can be a separate helper function. | ||
void insertIntoMap(Map& map, Id key, Id value) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much.
I think we should definitely split the two implementations.
I think we can just use basic inheritance:
-
A class
TransitivePathBase
that has all the stuff that the class currently has EXCEPT for the override ofcomputeResult()
+ all the helper functions needed for it. -
One class TransitivePathGraphblas and one class TransitivePathFallback that both inherit from the base and implement their repsecive
computeResult
s. -
: The base class has a function
static shared_pr<TransitivePathBase> makeTransitivePath(..... all the args, bool use fallback). (Trick: only declare it in the header, and implement it in the
.cpp` file, in which you can include both implementations.
- Renamed nrows, ncols and nvals - Moved code to cpp file - Nullptr is now valid for GrbMatrix::matrix_
- Renamed copy() -> clone() - Functions that create a GrbMatrix now use the unique_ptr directly
- Remove nextIndex_ - Made internal datastructures private - isContained -> contains - const Id& -> Id
- Remove C style arrays - Add checkCancellation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1252 +/- ##
==========================================
- Coverage 88.10% 86.94% -1.16%
==========================================
Files 310 310
Lines 28436 28410 -26
Branches 3148 3168 +20
==========================================
- Hits 25053 24701 -352
- Misses 2231 2552 +321
- Partials 1152 1157 +5 ☔ View full report in Codecov by Sentry. |
Removed unnecessary unique_ptr and make_optionals
# Conflicts: # src/global/Constants.h # test/TransitivePathTest.cpp
|
We have now merged the better implementation that uses a binary search, |
This PR changes the implementation of the TransitivePath class.
Previously it used DFS to compute the transitive hull. The new implementation uses sparse, boolean matrices instead. The matrix object and functionalities come from the SuiteSparse:GraphBLAS library [website | github] .
This library is added as new dependency. The new class GrbMatrix wraps the library for this application.