Implement Transitive Paths using binary search instead of a hash map #1313

JoBuRo · 2024-03-25T13:20:37Z

The previous implementation of transitive paths first converted the subtree (the graph that is traversed) into a HashMap<HashSet<Id>> that represented the adjacency list of the graph. This took a lot of time for large graphs, even when the actual transitive path computation was cheap because only a few starting nodes were considered.
We now use the sorted subtree directly as the graph and look up the successors of a node via binary search. This is much faster, especially since the subtree is often already sorted because it is a single IndexScan.

Currently both implementations are present in the code. The active implementation can be switched via a runtime parameter. The old implementation is kept mostly for benchmarking purposes, as the new implementation should always be cheaper.


joka921

This is much cleaner now:)

In addition to the concrete suggestions:
Some of the coverage gaps can easily be fixed (corner cases in the implementation, the error about empty paths), especially since they concern code that you have written and/or touched.

src/engine/TransitivePathBinSearch.cpp

joka921 · 2024-04-15T07:46:29Z

src/engine/TransitivePathBinSearch.h

+    auto range = std::ranges::equal_range(startIds_, node);
+
+    auto startIndex = std::distance(startIds_.begin(), range.begin());
+
+    return targetIds_.subspan(startIndex, range.size());


Suggested change

-    auto range = std::ranges::equal_range(startIds_, node);
-
-    auto startIndex = std::distance(startIds_.begin(), range.begin());
-
-    return targetIds_.subspan(startIndex, range.size());
+    return std::ranges::equal_range(startIds_, node);

This should work: since everything is templated, you don't need to know the type; it just needs to be iterable.

No, this does not work. The code you provided returns the start ids that match the node, not the corresponding target ids.

Example:
startIds_: 1, 1, 2
targetIds_: 2, 3, 4

equal_range(startIds_, 1) returns {1, 1}, but the correct result would be {2, 3}.

Another option would be to write a wrapper for std::pair<Id, Id> with a custom comparison function. This would also require converting the two std::span<Id> into a single std::span<std::pair<Id, Id>>. I would argue that this solution is unnecessarily complex, considering that the same effect can be achieved with three lines of code.

src/engine/TransitivePathBinSearch.h

src/global/RuntimeParameters.h

test/TransitivePathTest.cpp


joka921

Almost done:

Please format the code.
Four of the SonarCloud complaints are easy to fix (use =default, use empty(), and two instances of "use braces").

Otherwise I hope that we can merge this tomorrow:)

joka921 · 2024-04-15T19:02:19Z

src/engine/TransitivePathHashMap.h

+  auto successors(const Id node) const {
+    auto iterator = map_.find(node);
+    if (iterator == map_.end()) {
+      return std::vector<Id>();
+    }
+    std::vector<Id> result(iterator->second.begin(), iterator->second.end());
+    return result;
+  }
+};


Don't copy to a vector, but return a const Set&.
You can make an empty set another member of the HashMapWrapper, so that you have something to return in the case of not found.

I solved this, but I was not sure where to get the ad_utility::AllocatorWithLimit<Id>. There may be a smarter solution.

test/QueryPlannerTest.cpp

test/QueryPlannerTestHelpers.h

test/TransitivePathTest.cpp

sonarqubecloud · 2024-04-16T17:59:25Z

Quality Gate passed

Issues
10 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

JoBuRo and others added 30 commits January 30, 2024 13:32

Added graphblas dependencies

fd1bdf8

Added wrapper for graphblas matrix

285528b

Replaced transitiveHull computation

935bc0d

Added extern keyword around include

069819f

Replaced std map with abseil map

4b257fe

Added graphblas dependency for GH action

d4ba6f6

Added library for mac build

1b20a1e

Removed finalize()

92621d0

Added fallback for GraphBLAS

b3d71bd

Reworks GrBMatrix

d1de484

- Renamed nrows, ncols and nvals
- Moved code to cpp file
- Nullptr is now valid for GrbMatrix::matrix_

More reworks on GrbMatrix

303ff2f

- Renamed copy() -> clone()
- Functions that create a GrbMatrix now use the unique_ptr directly

Renamed getMatrix -> matrix

958b4f7

Added documentation to GrbMatrix

87bc650

Reworked build function

370ad43

Reworked extractRow and extractCol

0018c84

Reworked use of C arrays in GrbMatrix

622dddd

Additional reworks for GrbMatrix

7c7d5b6

Reworked extractTuples

21a5994

Added a quick fix for GrB_init issue

6f0f8cd

Reworked IdMapping

7269e4c

- Remove nextIndex_
- Made internal datastructures private
- isContained -> contains
- const Id& -> Id

Reworks on TransitivePath

97d766a

- Remove C style arrays
- Add checkCancellation

Merge branch 'master' into use-graphblas

c8c6526

A tiny bugfix and make the stuff configurable.

a06ed71

Fix build error, add move assignment to GrbMatrix

6e84e8a

Simplifications

005dc9c

Removed unnecessary unique_ptr and make_optionals

Reworks

e7dbf4d

- Throw exception on GrB_NO_VALUE
- unique_ptr in GrbMatrix uses custom deleter
- Simplified GrbMatrixTest

Added timer to transitive path computation

714744a

**WIP** Refactor of TransitivePath into Fallback and Graphblas

0fd9f92

Fixed timing conversion

60a37b6

Added singleton class for GraphBLAS global context

d07401e

JoBuRo added 4 commits April 12, 2024 17:44

Added docs

e9b63ec

Declared TransitivePathBase dtor as pure virtual

ee12697

Simplified setupEdges in TransitivePathBinSearch

47c19ee

Fixed an issue with move semantics and ctors

da97798

joka921 requested changes Apr 15, 2024


JoBuRo added 8 commits April 15, 2024 13:42

Implemented HashMapWrapper

e31afd7

Moved transitiveHull function to TransitiveHullImpl

0a33997

const auto& in transitiveHull

6398363

Replaced assertSameUnorderedContent with gmock function

8391366

Changed RuntimeParameter for transitivePath to true

9821e28

Format fix

0d6977f

Added some docs

6cf3af1

Revert "Format fix"

d2ea890

Accidentally fixed too many formats. This reverts commit 0d6977f.

joka921 requested changes Apr 15, 2024


JoBuRo and others added 12 commits April 16, 2024 09:45

Format fix

7f70dd3

Sonar Fixes

199d390

Added unit test for exception

c139244

Added tests for 'zero or more' transitive path

5a48b5f

HashMapWrapper successors function returns const Set&

79f5420

Use shorthand for QueryPlanner sort matcher

555aa6f

Simplified tests a bit V(x) -> x

0faa6d8

Added a todo

4eb14ed

Added doc for HashMapWrapper

1d496a3

Format fix

9d6eaf3

Try to fix the MacOS build

a768eb0

Format fix

8a0bee4

joka921 approved these changes Apr 17, 2024


joka921 changed the title ~~Improve transitive path with binary search~~ Implement Transitive Paths using binary search instead of a hash map Apr 17, 2024

joka921 merged commit 08ca01b into ad-freiburg:master Apr 17, 2024
19 checks passed
