Use the patterns from the permutations and no longer load ql:has-pattern into RAM #1223

Merged
merged 132 commits into from
Jan 18, 2024
095bdd3
Not yet working.
joka921 Sep 6, 2023
2470c0c
The normal pattern trick is working, next do the pattern trick for al…
joka921 Sep 7, 2023
ffe16aa
Full pattern trick also works.
joka921 Sep 7, 2023
29cc94b
Throwing out the has-predicate-scan, because all the E2E-tests seem t…
joka921 Sep 7, 2023
256f17d
Completely threw out the unneded code from the has-predicate-scan.
joka921 Sep 7, 2023
1805ee5
Down with the RAM usage!
joka921 Sep 7, 2023
98ab8a5
Cleaner handling of the special IDs.
joka921 Sep 7, 2023
c401367
Bump the index format version.
joka921 Sep 7, 2023
d02acee
Fix the OpenMP bugs.
joka921 Sep 7, 2023
e115e19
Several improvements from a self-review.
joka921 Sep 7, 2023
e4e7bd3
Merge branch 'master' into patterns-on-disk
joka921 Sep 7, 2023
5ab2a53
A small fix etc.
joka921 Sep 7, 2023
fcb20fc
Commented out the failing tests to make codecov active.
joka921 Sep 7, 2023
5cebbe2
Show the memory usage of the failing codecov runner.
joka921 Sep 7, 2023
b45678b
Try to fix the Codecov OOM problems.
joka921 Sep 8, 2023
1d4f536
stupidity
joka921 Sep 8, 2023
d849b75
Merge branch 'master' into patterns-on-disk
joka921 Oct 4, 2023
62aed1e
Merge in the current master
joka921 Oct 4, 2023
065e2c3
Prepare a lot of code for theactual storing of the patterns.
joka921 Oct 4, 2023
db08fae
Added functionality (untested yet) to export additional columns.
joka921 Oct 5, 2023
960e32b
The subject based patterns already seem to work like a charm.
joka921 Oct 5, 2023
ec1d230
Stopping for today.
joka921 Oct 5, 2023
96e46fe
This might work, but now we first let a DBLP build run.
joka921 Oct 6, 2023
ac1407b
This seems to work and answer simple queries....
joka921 Oct 6, 2023
bea4c59
Fix a subtle bug.
joka921 Oct 6, 2023
9617343
Trying to do the
joka921 Oct 9, 2023
09fa62f
Trying to do the
joka921 Oct 9, 2023
5aa272f
Add the ability to store additional columns in the relations.
joka921 Oct 9, 2023
e98b7cf
Before a review.
joka921 Oct 9, 2023
2a7b1d2
Add tests and clean up some code.
joka921 Oct 10, 2023
f090845
compress the columns in parallel.
joka921 Oct 10, 2023
d67d791
Fix some code smells but don't overexaggerate it.
joka921 Oct 10, 2023
b035f00
Merge branch 'master' into Add-support-for-additional-columns
joka921 Nov 27, 2023
91a13d5
First tests compile, but fail, todo: continue the merging.
joka921 Nov 27, 2023
65e3916
closer to compilation.
joka921 Nov 27, 2023
1af228f
Fix the tests etc.
joka921 Nov 27, 2023
f73dbef
Merge branch 'master' into Add-support-for-additional-columns
joka921 Nov 27, 2023
a616038
A round of self-reviews.
joka921 Nov 28, 2023
d750017
Get rid of quite some code duplication.
joka921 Nov 28, 2023
e932489
Merge branch 'master' into Add-support-for-additional-columns
joka921 Nov 28, 2023
219281e
Merge branch 'Add-support-for-additional-columns' into patterns-on-disk
joka921 Nov 28, 2023
5c7526f
In the middle of fixing the merge...
joka921 Nov 28, 2023
d46fb82
Most of the stuff that fails is because of the missing has-predicate …
joka921 Nov 28, 2023
84da7ed
Something also isn't quite right here concerning the number of blocks…
joka921 Nov 28, 2023
d2fd604
some changes from a review with Hannah.
joka921 Nov 28, 2023
08c570c
Factored out several functions...
joka921 Nov 29, 2023
a5c6e01
Next step : ttry a different order.
joka921 Nov 29, 2023
e7c0e65
Fix a bug in the block exporter.
joka921 Nov 29, 2023
a75264e
This is ready for a first round of reviews.
joka921 Nov 29, 2023
6411082
Fix the build.
joka921 Nov 29, 2023
c211f69
Add a comment and reforma.t
joka921 Nov 29, 2023
6cc2e99
Merge branch 'master' into change-permutation-building-order
joka921 Nov 29, 2023
411498d
Fix the test failure that originated in the merge.
joka921 Nov 29, 2023
5f596a3
Add a random payload (but it is not yet stored in the columns...)
joka921 Nov 29, 2023
17325a3
Trying to get the reight start...
joka921 Nov 29, 2023
517a49b
Merge branch 'Add-support-for-additional-columns' into add-payloads-t…
joka921 Nov 29, 2023
01cdbc2
A first try of checking the performance...
joka921 Nov 29, 2023
4ae4144
Use a type-erased sorter for the first permutation.
joka921 Nov 29, 2023
289c035
Small changes from a review.
joka921 Nov 29, 2023
289a67d
Remove an unused function and an unused file.
joka921 Nov 30, 2023
71b3ec6
Some additional cleanups that will make life easier for us.
joka921 Nov 30, 2023
79f0055
Merge branch 'change-permutation-building-order' into add-payloads-to…
joka921 Nov 30, 2023
61aebe3
Already add the additional columns for the pattern trick at least for…
joka921 Nov 30, 2023
1edfd47
Allow optional joins with blocks as soon as there are no preexisting …
joka921 Nov 30, 2023
90e08ce
Finish the merge of master
joka921 Dec 1, 2023
a9e7820
The times for index building look pretty acceptable, but we have to c…
joka921 Dec 1, 2023
c9bb322
Merge branch 'master' into write-patterns-to-all-permutations
joka921 Dec 7, 2023
1c6c2bf
A first draft of this PR, yet to be cleaned up.
joka921 Dec 7, 2023
04bb6c7
Some initial refactoring.
joka921 Dec 7, 2023
8f4d433
Before continuing to some other stuff.
joka921 Dec 8, 2023
476d9c2
Clean up this and that.
joka921 Dec 8, 2023
b85fe7a
Merge branch 'master' into write-patterns-to-all-permutations
joka921 Dec 21, 2023
087f82f
Try this out with Hannah
joka921 Dec 21, 2023
1a718d4
The IDE is currently doing shenanigans...
joka921 Dec 22, 2023
091eb78
Started some heavy refactoring before trying to track the bug.
joka921 Dec 22, 2023
268fe39
Even more refactoring.
joka921 Jan 10, 2024
fd9b59d
Heavy refactorings, let our tools tell what they think about it.
joka921 Jan 10, 2024
66ea332
Fix a bug...
joka921 Jan 10, 2024
152ecec
Merge branch 'master' into write-patterns-to-all-permutations
joka921 Jan 10, 2024
91446c6
Some review stuff and the bugfix.
joka921 Jan 10, 2024
1f52072
Added several unit tests.
joka921 Jan 11, 2024
a4cf4ac
Increase the test coverage further and while doing so understand the …
joka921 Jan 11, 2024
4bcbb6c
Further test coverage stuff.
joka921 Jan 11, 2024
53d954a
Changes from a review.
joka921 Jan 11, 2024
1d5fd7c
Changes from a review.
joka921 Jan 11, 2024
53108f3
Some more improvements.
joka921 Jan 11, 2024
c4c2658
Comments improvements tests.
joka921 Jan 12, 2024
e19a219
Small bugfix.
joka921 Jan 12, 2024
ff5a92b
Another one
joka921 Jan 12, 2024
e1eab73
Another one
joka921 Jan 12, 2024
de4e41b
Changes from a review.
joka921 Jan 12, 2024
c76d93d
Another round.
joka921 Jan 12, 2024
c99b0d5
Merge branch 'write-patterns-to-all-permutations' into use-new-patterns
joka921 Jan 12, 2024
990db38
A first draft after merging everything, asses what has been changed.
joka921 Jan 12, 2024
5e17cb5
Merge branch 'master' into use-new-patterns
joka921 Jan 12, 2024
a2e0265
This could work now much better...
joka921 Jan 12, 2024
c79d25f
I think I've got it.
joka921 Jan 12, 2024
a6ec4f1
Make the has-predicate scans work again.
joka921 Jan 15, 2024
adb18ea
Try to figure out where I broke the text indices.
joka921 Jan 15, 2024
9096633
Further cleaning this up.
joka921 Jan 15, 2024
a7e246c
Further cleanups.
joka921 Jan 15, 2024
cc1c0b0
Better stuff.
joka921 Jan 15, 2024
33ffb80
Merge branch 'master' into use-new-patterns
joka921 Jan 15, 2024
b2d7e4f
Next
joka921 Jan 15, 2024
f173ec4
Yet another bugfix.
joka921 Jan 15, 2024
7699532
Add several tests and improve on this and that.
joka921 Jan 16, 2024
f2703c8
Initial commit here.
joka921 Jan 16, 2024
9f8bcf3
Some cleanups and some tests.
joka921 Jan 16, 2024
f603a50
Add more consistent tests.
joka921 Jan 16, 2024
ca4fa19
clang format.
joka921 Jan 16, 2024
b05d5ce
Greatly simplified all this.
joka921 Jan 17, 2024
0b0a797
Several cleanups and preparations for the larger PR.
joka921 Jan 17, 2024
1ea357f
Merge branch 'master' into additional-permutations
joka921 Jan 17, 2024
917a43d
Current master etc.
joka921 Jan 17, 2024
0f707fe
Add unit tests etc.
joka921 Jan 17, 2024
174ea90
A round of reviews with Hannah.
joka921 Jan 17, 2024
4a9bf23
Some additional small reviews.
joka921 Jan 17, 2024
5dc7515
Merge branch 'additional-permutations' into use-new-patterns
joka921 Jan 17, 2024
038cd0b
The merge is still broken...
joka921 Jan 17, 2024
bd0f86a
We still have to manually figure out the merge afterwards, there are …
joka921 Jan 17, 2024
130b01e
Merge branch 'master' into use-new-patterns
joka921 Jan 17, 2024
4de17ad
Revert to the old version.
joka921 Jan 17, 2024
70aecf3
Some refactorings of the CheckUsePatternTrick module.
joka921 Jan 17, 2024
c9e3477
Several additional things.
joka921 Jan 18, 2024
fa24da5
A round of self-reviews.
joka921 Jan 18, 2024
394f379
Several additional improvements and self-reviews.
joka921 Jan 18, 2024
e59ee45
Merge branch 'master' into use-new-patterns
joka921 Jan 18, 2024
b3f7e38
A round of reviews.
joka921 Jan 18, 2024
3a9f8a5
Moved underscores from the front to the back
joka921 Jan 18, 2024
8dfd2ef
Fix the date in the index version.
joka921 Jan 18, 2024
52857c7
Change the date again.
joka921 Jan 18, 2024
6313a71
Rename the patternCreatorNew to PatternCreator again.
joka921 Jan 18, 2024
93 changes: 80 additions & 13 deletions src/engine/CheckUsePatternTrick.cpp
@@ -77,6 +77,82 @@ bool isVariableContainedInGraphPatternOperation(
});
}

// Internal helper function.
// Modify the `triples` s.t. the patterns for `subAndPred.subject_` will appear
// in a column with the variable `subAndPred.predicate_` when evaluating and
// joining all the triples. This can be either done by retrieving one of the
// additional columns where the patterns are stored in the PSO and POS
// permutation or, if no triple suitable for adding this column exists, by
// adding a triple `?subject ql:has-pattern ?predicate`.
static void rewriteTriplesForPatternTrick(const PatternTrickTuple& subAndPred,
std::vector<SparqlTriple>& triples) {
// The following lambda tries to find a triple in the `triples` that has the
// subject variable of the pattern trick in its `triplePosition` (which is
// either the subject or the object) and a fixed predicate (no variable). If
// such a triple is found, it is modified s.t. it also scans the
// `additionalScanColumn` which has to be the index of the column where the
// patterns of the `triplePosition` are stored in the POS and PSO permutation.
// Return true iff such a triple was found and replaced.
auto findAndRewriteMatchingTriple = [&subAndPred, &triples](
auto triplePosition,
size_t additionalScanColumn) {
auto matchingTriple = std::ranges::find_if(
triples, [&subAndPred, triplePosition](const SparqlTriple& t) {
return std::invoke(triplePosition, t) == subAndPred.subject_ &&
t._p.isIri() && !isVariable(t._p);
});
if (matchingTriple == triples.end()) {
return false;
}
matchingTriple->_additionalScanColumns.emplace_back(additionalScanColumn,
subAndPred.predicate_);
return true;
};

if (findAndRewriteMatchingTriple(&SparqlTriple::_s,
ADDITIONAL_COLUMN_INDEX_SUBJECT_PATTERN)) {
return;
} else if (findAndRewriteMatchingTriple(
&SparqlTriple::_o, ADDITIONAL_COLUMN_INDEX_OBJECT_PATTERN)) {
return;
} else {
// We could not find a suitable triple to append the additional column, so we
// add an explicit triple `?s ql:has-pattern ?p`.
triples.emplace_back(subAndPred.subject_, HAS_PATTERN_PREDICATE,
subAndPred.predicate_);
}
}
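
To illustrate the fallback order described in the comments above, here is a minimal, self-contained sketch. It uses simplified stand-in types (`Triple`, plain strings for terms) and placeholder column indices rather than QLever's actual `SparqlTriple` and `ADDITIONAL_COLUMN_INDEX_*` constants, so every name in it is hypothetical and it should not be read as the real implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a SPARQL triple; not QLever's type.
struct Triple {
  std::string s, p, o;
  // Pairs of (additional column index, variable bound to that column).
  std::vector<std::pair<std::size_t, std::string>> additionalScanColumns;
};

// Placeholder values (assumption); the real constants live elsewhere.
constexpr std::size_t kSubjectPatternColumn = 2;
constexpr std::size_t kObjectPatternColumn = 3;

inline bool isVariable(const std::string& term) {
  return !term.empty() && term[0] == '?';
}

// Attach the pattern column to an existing triple with a fixed predicate if
// one exists (subject position first, then object position). Otherwise fall
// back to an explicit `?subject ql:has-pattern ?predicateVar` triple.
inline void rewriteForPatternTrick(const std::string& subject,
                                   const std::string& predicateVar,
                                   std::vector<Triple>& triples) {
  assert(isVariable(subject) && isVariable(predicateVar));
  auto tryRewrite = [&](auto position, std::size_t column) {
    auto it = std::find_if(
        triples.begin(), triples.end(), [&](const Triple& t) {
          return std::invoke(position, t) == subject && !isVariable(t.p);
        });
    if (it == triples.end()) {
      return false;
    }
    it->additionalScanColumns.emplace_back(column, predicateVar);
    return true;
  };
  if (tryRewrite(&Triple::s, kSubjectPatternColumn)) return;
  if (tryRewrite(&Triple::o, kObjectPatternColumn)) return;
  triples.push_back(Triple{subject, "ql:has-pattern", predicateVar, {}});
}
```

In this sketch, a query containing `?x <is-a> <City>` would get the subject pattern column attached to that triple, while a query whose only other triple is `?x ?q ?y` would receive the extra `ql:has-pattern` triple instead.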

// Helper function for `checkUsePatternTrick`.
// Check if any of the triples in the `graphPattern` has the form `?s
// ql:has-predicate ?p` or `?s ?p ?o` and that the other conditions for the
// pattern trick are fulfilled (namely that the variables `?p` and, if present,
// `?o` don't appear elsewhere in the `parsedQuery`). If such a triple is found,
// the query is modified such that it behaves as if the triple was replaced by
// `?s ql:has-pattern ?p`. See the documentation of
// `rewriteTriplesForPatternTrick` above.
static std::optional<PatternTrickTuple> findPatternTrickTuple(
p::BasicGraphPattern* graphPattern, const ParsedQuery* parsedQuery,
const std::optional<
sparqlExpression::SparqlExpressionPimpl::VariableAndDistinctness>&
countedVariable) {
// Try to find a triple that either has `ql:has-predicate` as the predicate,
// or consists of three variables, and fulfills all the other preconditions
// for the pattern trick.
auto& triples = graphPattern->_triples;
for (auto it = triples.begin(); it != triples.end(); ++it) {
auto patternTrickTuple =
isTripleSuitableForPatternTrick(*it, parsedQuery, countedVariable);
if (!patternTrickTuple.has_value()) {
continue;
}
triples.erase(it);
rewriteTriplesForPatternTrick(patternTrickTuple.value(), triples);
return patternTrickTuple;
}
return std::nullopt;
}
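
The find, erase, return sequence in the loop above can be sketched in isolation as follows. This is a simplified illustration with hypothetical stand-in types and a deliberately reduced suitability check (it ignores the condition that `?p` and `?o` must not appear elsewhere in the query). Two details seem worth noting: the result is copied out before `erase` is called, and the function returns immediately afterwards, so the invalidated iterator is never touched again. Erasing before any rewrite presumably also matters because a `?s ql:has-predicate ?p` triple itself has a fixed predicate and could otherwise be picked as the host for the additional column.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; not QLever's real types.
struct Triple {
  std::string s, p, o;
};
struct SubjectAndPredicate {
  std::string subject, predicate;
};

inline bool isVar(const std::string& term) {
  return !term.empty() && term[0] == '?';
}

// Reduced suitability check (assumption): accept `?s ql:has-predicate ?p`
// or a triple of three variables `?s ?p ?o`.
inline std::optional<SubjectAndPredicate> suitableForPatternTrick(
    const Triple& t) {
  if (t.p == "ql:has-predicate" && isVar(t.s) && isVar(t.o)) {
    return SubjectAndPredicate{t.s, t.o};
  }
  if (isVar(t.s) && isVar(t.p) && isVar(t.o)) {
    return SubjectAndPredicate{t.s, t.p};
  }
  return std::nullopt;
}

// Remove the first suitable triple and return its subject and predicate.
inline std::optional<SubjectAndPredicate> findAndRemoveTriple(
    std::vector<Triple>& triples) {
  for (auto it = triples.begin(); it != triples.end(); ++it) {
    auto result = suitableForPatternTrick(*it);  // Holds copies, no references.
    if (!result.has_value()) {
      continue;
    }
    triples.erase(it);  // `it` is invalid from here on.
    return result;      // Return right away, the loop must not continue.
  }
  return std::nullopt;
}

// Small usage example.
inline void usageExample() {
  std::vector<Triple> triples{Triple{"?x", "<is-a>", "<City>"},
                              Triple{"?x", "ql:has-predicate", "?p"}};
  auto result = findAndRemoveTriple(triples);
  assert(result.has_value() && result->predicate == "?p");
  assert(triples.size() == 1);
}
```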

// ____________________________________________________________________________
std::optional<PatternTrickTuple> checkUsePatternTrick(
ParsedQuery* parsedQuery) {
@@ -109,19 +185,10 @@ std::optional<PatternTrickTuple> checkUsePatternTrick(
continue;
}

- // Try to find a triple that either has `ql:has-predicate` as the predicate,
- // or consists of three variables, and fulfills all the other preconditions
- // for the pattern trick.
- auto& triples = curPattern->_triples;
- for (auto it = triples.begin(); it != triples.end(); ++it) {
- auto patternTrickTuple =
- isTripleSuitableForPatternTrick(*it, parsedQuery, countedVariable);
- if (patternTrickTuple.has_value()) {
- // Remove the triple from the graph. Note that this invalidates the
- // reference `triple`, so we perform this step at the very end.
- triples.erase(it);
- return patternTrickTuple;
- }
+ auto patternTrickTuple =
+ findPatternTrickTuple(curPattern, parsedQuery, countedVariable);
+ if (patternTrickTuple.has_value()) {
+ return patternTrickTuple;
}
}
// No suitable triple for the pattern trick was found.
9 changes: 7 additions & 2 deletions src/engine/CheckUsePatternTrick.h
@@ -19,8 +19,13 @@ struct PatternTrickTuple {
* @brief Determines if the pattern trick (and in turn the
* CountAvailablePredicates operation) is applicable to the given
* parsed query. If a ql:has-predicate triple is found and
- * CountAvailablePredicates can be used for it, the triple will be removed from
- * the parsed query.
+ * CountAvailablePredicates can be used for it, the triple's predicate will be
+ * replaced by `ql:has-pattern`. If possible, then this rewrite is performed by
+ * completely removing the triple and adding the pattern as an
+ * additional scan column to one of the other triples (note that we have folded
+ * the patterns for the subject and object into the PSO and POS permutation).
+ * The mapping from the pattern to the predicates contained in that pattern will
+ * later be done by the `CountAvailablePredicates` operation.
*/
std::optional<PatternTrickTuple> checkUsePatternTrick(ParsedQuery* parsedQuery);
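
As a rough illustration of the last sentence of the comment above (mapping patterns to the predicates they contain), the following sketch counts predicates from a column of pattern IDs. It rests on assumptions that are not stated in this diff: that a pattern is a list of predicate IDs, and that each matching subject contributes exactly one pattern ID. The function name, the type aliases, and the container layout are all hypothetical; this is not the actual `CountAvailablePredicates` code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical ID types (assumption).
using PredicateId = std::uint64_t;
using PatternId = std::size_t;

// `patterns[i]` lists the predicates contained in pattern `i`.
// `patternColumn` holds one pattern ID per matching subject.
// Returns, for each predicate, the number of subjects that have it.
inline std::map<PredicateId, std::size_t> countPredicates(
    const std::vector<std::vector<PredicateId>>& patterns,
    const std::vector<PatternId>& patternColumn) {
  // First count how often each pattern occurs, so that every pattern is
  // expanded into its predicates only once.
  std::vector<std::size_t> patternCounts(patterns.size(), 0);
  for (PatternId id : patternColumn) {
    assert(id < patterns.size());
    ++patternCounts[id];
  }
  std::map<PredicateId, std::size_t> result;
  for (std::size_t i = 0; i < patterns.size(); ++i) {
    if (patternCounts[i] == 0) {
      continue;
    }
    for (PredicateId predicate : patterns[i]) {
      result[predicate] += patternCounts[i];
    }
  }
  return result;
}
```

Counting pattern occurrences first keeps the work proportional to the number of rows plus the total size of the patterns that actually occur, instead of expanding a pattern once per subject.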
