Add 'indirect_sort' #117
Open: mclow wants to merge 7 commits into develop from indirectSort
Commits:
17c47e8 Add 'indirect_sort' (mclow)
814f8a5 Update docs based on feedback, add 'stable_sort', 'partial_sort' and … (mclow)
3ae9ee2 Add more tests (mclow)
8be54b3 Split test cases (mclow)
a7ae53d Update docs; fix copy-pasta (mclow)
25ab833 Replace call to 'iota' with hand-rolled loop (mclow)
62922bd Replace hand-rolled loop with 'iota_n' and back_insert_iterator (mclow)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,111 @@ | ||
[/ File indirect_sort.qbk] | ||
|
||
[section:indirect_sort indirect_sort ] | ||
|
||
[/license | ||
Copyright (c) 2023 Marshall Clow | ||
|
||
Distributed under the Boost Software License, Version 1.0. | ||
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | ||
] | ||
|
||
There are times that you want a sorted version of a sequence, but for some reason you don't want to modify it. Maybe the elements in the sequence can't be moved/copied, e.g. the sequence is const, or they're just really expensive to move around. An example of this might be a sequence of records from a database. | ||
|
||
That's where indirect sorting comes in. In a "normal" sort, the elements of the sequence to be sorted are shuffled in place. In indirect sorting, the elements are unchanged, but the sort algorithm returns a "permutation" (a sequence of indices into the original sequence) that, when applied, will put the elements of the sequence in sorted order. | ||
|
||
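The idea can be sketched with nothing but the standard library. This is only an illustration, not the code proposed in this PR: `argsort_sketch` and `Pinned` are hypothetical names, and `Pinned` stands in for an element type that can be neither copied nor moved.

```cpp
#include <algorithm>  // std::sort
#include <cassert>
#include <cstddef>    // std::size_t
#include <numeric>    // std::iota
#include <vector>

// Hypothetical element type: copying and moving are both disabled.
struct Pinned {
    int key;
    Pinned(int k) : key(k) {}  // deliberately non-explicit, so {30} can initialize an array element
    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
};

inline bool operator<(const Pinned& a, const Pinned& b) { return a.key < b.key; }

// Sketch: returns indices i0, i1, ... such that first[i0], first[i1], ...
// is in non-descending order. Only the indices are swapped; the elements
// are read through `first` and never modified.
template <typename RAIterator>
std::vector<std::size_t> argsort_sketch(RAIterator first, RAIterator last) {
    assert(last - first >= 0);
    std::vector<std::size_t> idx(static_cast<std::size_t>(last - first));
    std::iota(idx.begin(), idx.end(), std::size_t(0));  // 0, 1, 2, ...
    std::sort(idx.begin(), idx.end(),
              [first](std::size_t a, std::size_t b) { return first[a] < first[b]; });
    return idx;
}
```

Because the comparison indexes through `first`, the sketch works on a `const` array of `Pinned`, which `std::sort` itself could not reorder.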
Assume you have a sequence `[first, last)` of 1000 items that are expensive to swap: | ||
``` | ||
std::sort(first, last); // ['O(N lg N)] comparisons and ['O(N lg N)] swaps (of the element type). | ||
``` | ||
|
||
On the other hand, using indirect sorting: | ||
``` | ||
auto perm = indirect_sort(first, last); // ['O(N lg N)] comparisons and ['O(N lg N)] swaps (of size_t). | ||
apply_permutation(first, last, perm.begin(), perm.end()); // ['O(N)] swaps (of the element type) | ||
``` | ||
|
||
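To see why applying a permutation needs only ['O(N)] element swaps, here is a minimal cycle-following sketch. `apply_permutation_sketch` is a hypothetical helper written for this illustration, not the Boost routine; it takes the permutation by value because the loop overwrites it, and it returns the number of element swaps so the bound can be checked.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t
#include <utility>  // std::swap
#include <vector>

// Sketch: rearranges `items` so that the new items[i] is the old items[perm[i]].
// Each cycle of length L in the permutation costs L - 1 swaps, so the total
// is at most N - 1 element swaps.
template <typename T>
std::size_t apply_permutation_sketch(std::vector<T>& items, std::vector<std::size_t> perm) {
    assert(items.size() == perm.size());
    std::size_t swaps = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        std::size_t current = i;
        while (perm[current] != i) {       // walk the cycle that starts at i
            const std::size_t next = perm[current];
            std::swap(items[current], items[next]);
            ++swaps;
            perm[current] = current;       // mark this slot as finished
            current = next;
        }
        perm[current] = current;           // close the cycle
    }
    return swaps;
}
```

For three items permuted by `{1, 2, 0}` the sketch performs two swaps, one fewer than the number of elements.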
If the element type is sufficiently expensive to swap, then roughly 10,000 swaps of `size_t` (N lg N for N = 1000) plus at most 1000 swaps of the element type could be cheaper than roughly 10,000 swaps of the element type. | ||
|
||
Or maybe you don't need the elements to actually be sorted - you just want to traverse them in a sorted order: | ||
``` | ||
auto permutation = indirect_sort(first, last); | ||
for (size_t idx: permutation) | ||
std::cout << first[idx] << std::endl; | ||
``` | ||
|
||
|
||
Assume that instead of an "array of structures", you have a "struct of arrays". | ||
``` | ||
struct AType { | ||
Type0 key; | ||
Type1 value1; | ||
Type2 value2; | ||
}; | ||
|
||
std::array<AType, 1000> arrayOfStruct; | ||
``` | ||
|
||
versus: | ||
|
||
``` | ||
template <size_t N> | ||
struct AType { | ||
std::array<Type0, N> key; | ||
std::array<Type1, N> value1; | ||
std::array<Type2, N> value2; | ||
}; | ||
|
||
AType<1000> structOfArrays; | ||
``` | ||
|
||
Sorting the first one is easy, because each set of fields (`key`, `value1`, `value2`) is part of the same struct. But with indirect sorting, the second one is easy to sort as well: just sort the keys, then apply the permutation to the keys and the values: | ||
``` | ||
auto perm = indirect_sort(std::begin(structOfArrays.key), std::end(structOfArrays.key)); | ||
// apply_permutation may overwrite the index sequence it is given, so hand each call its own copy. | ||
auto permKey = perm, permValue1 = perm, permValue2 = perm; | ||
apply_permutation(structOfArrays.key.begin(), structOfArrays.key.end(), permKey.begin(), permKey.end()); | ||
apply_permutation(structOfArrays.value1.begin(), structOfArrays.value1.end(), permValue1.begin(), permValue1.end()); | ||
apply_permutation(structOfArrays.value2.begin(), structOfArrays.value2.end(), permValue2.begin(), permValue2.end()); | ||
``` | ||
|
||
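A self-contained sketch of the struct-of-arrays case, using only the standard library. `Columns`, `gather` and `sort_by_key` are hypothetical names chosen for this illustration, and the concrete field types are assumptions. Instead of permuting in place, `gather` copies each column into a new array, which leaves the permutation intact so it can be reused for every column.

```cpp
#include <algorithm>  // std::sort
#include <array>
#include <cassert>
#include <cstddef>    // std::size_t
#include <numeric>    // std::iota
#include <string>
#include <vector>

// Hypothetical struct-of-arrays with concrete field types.
template <std::size_t N>
struct Columns {
    std::array<int, N> key;
    std::array<std::string, N> value1;
    std::array<double, N> value2;
};

// Returns a new array whose i-th element is column[perm[i]]; perm is not modified.
template <typename T, std::size_t N>
std::array<T, N> gather(const std::array<T, N>& column, const std::vector<std::size_t>& perm) {
    assert(perm.size() == N);
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = column[perm[i]];
    return out;
}

// Sorts the keys indirectly, then reorders every column with the same permutation.
template <std::size_t N>
void sort_by_key(Columns<N>& c) {
    std::vector<std::size_t> perm(N);
    std::iota(perm.begin(), perm.end(), std::size_t(0));
    std::sort(perm.begin(), perm.end(),
              [&c](std::size_t a, std::size_t b) { return c.key[a] < c.key[b]; });
    c.key = gather(c.key, perm);
    c.value1 = gather(c.value1, perm);
    c.value2 = gather(c.value2, perm);
}
```

The copy-based `gather` trades extra memory for simplicity; an in-place cycle-following routine avoids the copies but consumes the permutation as it goes.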
[heading Interface] | ||
|
||
The function `indirect_sort` returns a `vector<size_t>` containing the permutation necessary to put the input sequence into sorted order. One version uses `std::less` to do the comparisons; the other lets the caller pass a predicate to do the comparisons. | ||
|
||
There is also a variant called `indirect_stable_sort`; it bears the same relation to `indirect_sort` that `std::stable_sort` does to `std::sort`. | ||
|
||
``` | ||
template <typename RAIterator> | ||
std::vector<size_t> indirect_sort (RAIterator first, RAIterator last); | ||
|
||
template <typename RAIterator, typename BinaryPredicate> | ||
std::vector<size_t> indirect_sort (RAIterator first, RAIterator last, BinaryPredicate pred); | ||
|
||
template <typename RAIterator> | ||
std::vector<size_t> indirect_stable_sort (RAIterator first, RAIterator last); | ||
|
||
template <typename RAIterator, typename BinaryPredicate> | ||
std::vector<size_t> indirect_stable_sort (RAIterator first, RAIterator last, BinaryPredicate pred); | ||
``` | ||
|
||
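The difference between the plain and the stable variant only shows up when elements compare equal. The following sketch mirrors the predicate overload using `std::stable_sort` on the indices; `stable_argsort_sketch` is a hypothetical stand-in written for this illustration, not the function declared above.

```cpp
#include <algorithm>   // std::stable_sort
#include <cassert>
#include <cstddef>     // std::size_t
#include <functional>  // std::greater (used by callers)
#include <numeric>     // std::iota
#include <vector>

// Sketch: stable indirect sort with a caller-supplied predicate.
// Indices of elements that compare equivalent keep their original relative order.
template <typename RAIterator, typename BinaryPredicate>
std::vector<std::size_t> stable_argsort_sketch(RAIterator first, RAIterator last,
                                               BinaryPredicate pred) {
    assert(last - first >= 0);
    std::vector<std::size_t> idx(static_cast<std::size_t>(last - first));
    std::iota(idx.begin(), idx.end(), std::size_t(0));
    std::stable_sort(idx.begin(), idx.end(),
                     [first, pred](std::size_t a, std::size_t b) {
                         return pred(first[a], first[b]);
                     });
    return idx;
}
```

Called on the values `2, 1, 2, 1` with `std::greater<int>()`, the sketch yields the indices `0, 2, 1, 3`: both 2s come first, and within each group of equal values the earlier index stays ahead.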
[heading Examples] | ||
|
||
[heading Iterator Requirements] | ||
|
||
`indirect_sort` requires random-access iterators. | ||
|
||
[heading Complexity] | ||
|
||
Both variants of `indirect_sort` run in ['O(N lg N)] time; they are not more (or less) efficient than `std::sort`. There is an extra layer of indirection on each comparison, but all of the swaps are done on values of type `size_t`. | ||
|
||
[heading Exception Safety] | ||
|
||
[heading Notes] | ||
|
||
In NumPy, this algorithm is known as `argsort`. | ||
|
||
[endsect] | ||
|
||
[/ File indirect_sort.qbk | ||
Copyright 2023 Marshall Clow | ||
Distributed under the Boost Software License, Version 1.0. | ||
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt). | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,192 @@ | ||
/* | ||
Copyright (c) Marshall Clow 2023. | ||
|
||
Distributed under the Boost Software License, Version 1.0. (See accompanying | ||
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) | ||
|
||
*/ | ||
|
||
/// \file indirect_sort.hpp | ||
/// \brief indirect sorting algorithms | ||
/// \author Marshall Clow | ||
/// | ||
|
||
#ifndef BOOST_ALGORITHM_INDIRECT_SORT | ||
#define BOOST_ALGORITHM_INDIRECT_SORT | ||
|
||
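#include <algorithm> // for std::sort (and others) | ||
#include <cstddef> // for size_t | ||
#include <functional> // for std::less | ||
#include <iterator> // for std::distance, std::iterator_traits | ||
#include <vector> // for std::vector | ||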
|
||
#include <boost/algorithm/cxx11/iota.hpp> | ||
|
||
namespace boost { namespace algorithm { | ||
|
||
namespace detail { | ||
|
||
template <class Predicate, class Iter> | ||
struct indirect_predicate { | ||
indirect_predicate (Predicate pred, Iter iter) | ||
: pred_(pred), iter_(iter) {} | ||
|
||
bool operator ()(size_t a, size_t b) const { | ||
return pred_(iter_[a], iter_[b]); | ||
} | ||
|
||
Predicate pred_; | ||
Iter iter_; | ||
}; | ||
|
||
} | ||
|
||
typedef std::vector<size_t> Permutation; | ||
|
||
// ===== sort ===== | ||
|
||
/// \fn indirect_sort (RAIterator first, RAIterator last, Pred pred) | ||
/// \returns a permutation of the elements in the range [first, last) | ||
/// such that when the permutation is applied to the sequence, | ||
/// the result is sorted according to the predicate pred. | ||
/// | ||
/// \param first The start of the input sequence | ||
/// \param last The end of the input sequence | ||
/// \param pred The predicate to compare elements with | ||
/// | ||
template <typename RAIterator, typename Pred> | ||
Permutation indirect_sort (RAIterator first, RAIterator last, Pred pred) { | ||
Permutation ret(std::distance(first, last)); | ||
boost::algorithm::iota(ret.begin(), ret.end(), size_t(0)); | ||
std::sort(ret.begin(), ret.end(), | ||
detail::indirect_predicate<Pred, RAIterator>(pred, first)); | ||
return ret; | ||
} | ||
|
||
/// \fn indirect_sort (RAIterator first, RAIterator last) | ||
/// \returns a permutation of the elements in the range [first, last) | ||
/// such that when the permutation is applied to the sequence, | ||
/// the result is sorted in non-descending order. | ||
/// | ||
/// \param first The start of the input sequence | ||
/// \param last The end of the input sequence | ||
/// | ||
template <typename RAIterator> | ||
Permutation indirect_sort (RAIterator first, RAIterator last) { | ||
return indirect_sort(first, last, | ||
std::less<typename std::iterator_traits<RAIterator>::value_type>()); | ||
} | ||
|
||
// ===== stable_sort ===== | ||
|
||
/// \fn indirect_stable_sort (RAIterator first, RAIterator last, Pred pred) | ||
/// \returns a permutation of the elements in the range [first, last) | ||
/// such that when the permutation is applied to the sequence, | ||
/// the result is sorted according to the predicate pred. | ||
/// | ||
/// \param first The start of the input sequence | ||
/// \param last The end of the input sequence | ||
/// \param pred The predicate to compare elements with | ||
/// | ||
template <typename RAIterator, typename Pred> | ||
Permutation indirect_stable_sort (RAIterator first, RAIterator last, Pred pred) { | ||
Permutation ret(std::distance(first, last)); | ||
boost::algorithm::iota(ret.begin(), ret.end(), size_t(0)); | ||
std::stable_sort(ret.begin(), ret.end(), | ||
detail::indirect_predicate<Pred, RAIterator>(pred, first)); | ||
return ret; | ||
} | ||
|
||
/// \fn indirect_stable_sort (RAIterator first, RAIterator last) | ||
/// \returns a permutation of the elements in the range [first, last) | ||
/// such that when the permutation is applied to the sequence, | ||
/// the result is sorted in non-descending order. | ||
/// | ||
/// \param first The start of the input sequence | ||
/// \param last The end of the input sequence | ||
/// | ||
template <typename RAIterator> | ||
Permutation indirect_stable_sort (RAIterator first, RAIterator last) { | ||
return indirect_stable_sort(first, last, | ||
std::less<typename std::iterator_traits<RAIterator>::value_type>()); | ||
} | ||
|
||
// ===== partial_sort ===== | ||
|
||
/// \fn indirect_partial_sort (RAIterator first, RAIterator middle, RAIterator last, Pred pred) | ||
/// \returns a permutation of the elements in the range [first, last) | ||
/// such that when the permutation is applied to the sequence, | ||
/// the resulting range [first, middle) is sorted and the range [middle, last) | ||
/// consists of elements that are no "smaller" than the ones in [first, middle), | ||
/// according to the predicate pred. | ||
/// | ||
/// \param first The start of the input sequence | ||
/// \param middle The end of the range to be sorted | ||
/// \param last The end of the input sequence | ||
/// \param pred The predicate to compare elements with | ||
/// | ||
template <typename RAIterator, typename Pred> | ||
Permutation indirect_partial_sort (RAIterator first, RAIterator middle, | ||
RAIterator last, Pred pred) { | ||
Permutation ret(std::distance(first, last)); | ||
|
||
boost::algorithm::iota(ret.begin(), ret.end(), size_t(0)); | ||
std::partial_sort(ret.begin(), ret.begin() + std::distance(first, middle), ret.end(), | ||
detail::indirect_predicate<Pred, RAIterator>(pred, first)); | ||
return ret; | ||
} | ||
|
||
/// \fn indirect_partial_sort (RAIterator first, RAIterator middle, RAIterator last) | ||
/// \returns a permutation of the elements in the range [first, last) | ||
/// such that when the permutation is applied to the sequence, | ||
/// the resulting range [first, middle) is sorted in non-descending order, | ||
/// and the range [middle, last) consists of elements that are no smaller than | ||
/// the ones in [first, middle). | ||
/// | ||
/// \param first The start of the input sequence | ||
/// \param middle The end of the range to be sorted | ||
/// \param last The end of the input sequence | ||
/// | ||
template <typename RAIterator> | ||
Permutation indirect_partial_sort (RAIterator first, RAIterator middle, RAIterator last) { | ||
return indirect_partial_sort(first, middle, last, | ||
std::less<typename std::iterator_traits<RAIterator>::value_type>()); | ||
} | ||
|
||
// ===== nth_element ===== | ||
|
||
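/// \fn indirect_nth_element (RAIterator first, RAIterator nth, RAIterator last, Pred pred) | ||
/// \returns a permutation of the elements in the range [first, last) | ||
/// such that when the permutation is applied to the sequence, | ||
/// the element at position nth is the one that would be there if the whole | ||
/// range were sorted according to the predicate pred; no element before nth | ||
/// compares greater than it, and no element after nth compares less. | ||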
Review comment: Similar C&P mistake in signature and description.
Reply: Nice catch - thanks!
||
/// | ||
/// \param first The start of the input sequence | ||
/// \param nth The sort partition point in the input sequence | ||
/// \param last The end of the input sequence | ||
/// \param pred The predicate to compare elements with | ||
/// | ||
template <typename RAIterator, typename Pred> | ||
Permutation indirect_nth_element (RAIterator first, RAIterator nth, | ||
RAIterator last, Pred pred) { | ||
Permutation ret(std::distance(first, last)); | ||
boost::algorithm::iota(ret.begin(), ret.end(), size_t(0)); | ||
std::nth_element(ret.begin(), ret.begin() + std::distance(first, nth), ret.end(), | ||
detail::indirect_predicate<Pred, RAIterator>(pred, first)); | ||
return ret; | ||
} | ||
|
||
/// \fn indirect_nth_element (RAIterator first, RAIterator nth, RAIterator last) | ||
/// \returns a permutation of the elements in the range [first, last) | ||
/// such that when the permutation is applied to the sequence, | ||
/// the element at position nth is the one that would be there if the whole | ||
/// range were sorted in non-descending order. | ||
/// | ||
/// \param first The start of the input sequence | ||
/// \param nth The sort partition point in the input sequence | ||
/// \param last The end of the input sequence | ||
/// | ||
template <typename RAIterator> | ||
Permutation indirect_nth_element (RAIterator first, RAIterator nth, RAIterator last) { | ||
return indirect_nth_element(first, nth, last, | ||
std::less<typename std::iterator_traits<RAIterator>::value_type>()); | ||
} | ||
|
||
}} | ||
|
||
#endif // BOOST_ALGORITHM_INDIRECT_SORT |
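The partial-sort and nth-element variants in the header follow the same pattern as the full sort: fill an index vector, then run the standard algorithm on the indices with an indirecting comparison. A minimal sketch of how such index results might be used, assuming plain `int` values; `smallest_k_indices` and `nth_index` are hypothetical helpers and are not part of this PR.

```cpp
#include <algorithm>  // std::partial_sort, std::nth_element
#include <cassert>
#include <cstddef>    // std::size_t, std::ptrdiff_t
#include <numeric>    // std::iota
#include <vector>

// Indices of the k smallest values, in ascending order of value.
inline std::vector<std::size_t> smallest_k_indices(const std::vector<int>& values, std::size_t k) {
    assert(k <= values.size());
    std::vector<std::size_t> idx(values.size());
    std::iota(idx.begin(), idx.end(), std::size_t(0));
    std::partial_sort(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
                      [&values](std::size_t a, std::size_t b) { return values[a] < values[b]; });
    idx.resize(k);  // the tail is in unspecified order, so drop it
    return idx;
}

// Index of the element that would sit at position n if `values` were sorted.
inline std::size_t nth_index(const std::vector<int>& values, std::size_t n) {
    assert(n < values.size());
    std::vector<std::size_t> idx(values.size());
    std::iota(idx.begin(), idx.end(), std::size_t(0));
    std::nth_element(idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(n), idx.end(),
                     [&values](std::size_t a, std::size_t b) { return values[a] < values[b]; });
    return idx[n];
}
```

With the values `50, 10, 40, 20, 30`, `smallest_k_indices(values, 2)` gives the indices `1, 3` and `nth_index(values, 2)` gives `4`, the position of the median value 30. Neither helper touches `values`.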
C&P mistake in the signature inside this (and below) docstrings