Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement
Union
that can keep sort order #1834base: master
Are you sure you want to change the base?
Implement
Union
that can keep sort order #1834Changes from all commits
5e4a888
7990fe4
0b69bc0
c6665e9
474f936
df1f21e
399fb39
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Check warning on line 355 in src/engine/Union.cpp
src/engine/Union.cpp#L353-L355
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto [index1, index2] = _columnOrigins.at(col)
And measure, whether the check in the at() function costs something, and
whether we get a speedup by reordering the columns, such that they are alwasy 0, 1, ... in both inputs and the lookup in
_columnOrigins
in each loop disappears(if you want to, first create a separate benchmark, if you don't want to do all the permutation reordering first, although it is rather simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could remove all the sort columns after the first NO_COLUM, and precompute an
efficient return value
(If all the columns are equal, do i return true (because there was NO_COLUMN in the left input) or false (because the NO_COLUMN was right, or because all columns were equal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite sure what exactly you want. I can't initially trim
targetOrder_
, because we still need the original values elsewhere. Otherwise this information is lost. If you're proposing adding a copy that is tailored towards the comparator, to do better loop unrolling I could certainly do that, but I don't think sort orders that just apply to a single child are common, so I'm not sure if it's worth the added complexity.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone should Evalute how costly this " I am doing rowwise pushes on a columnwise data structure" is
(Is this where you spend most of the time?)
We should extend the
AddCombinedRowAdder
to not only handle joins, but also UNIONs.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are moving away from coroutines for our C++17 backport,
please consider which types of ranges or state machines (see the end of the
Iterators.h
header)can be used here.
This unfortunately will have some more lines of code, but we will benefit from it.