Commit 823e059

First draft of Boost.Algorithm documentation; more to come
[SVN r77326]
1 parent e3263d4 commit 823e059

15 files changed

+1068
-4
lines changed

doc/Jamfile.v2

+46
@@ -0,0 +1,46 @@
1+
# Boost.Algorithm
2+
#
3+
# Copyright (c) 2010-2012 Marshall Clow
4+
#
5+
# Distributed under the Boost Software License, Version 1.0.
6+
# (See accompanying file LICENSE_1_0.txt or copy at
7+
# http://www.boost.org/LICENSE_1_0.txt)
8+
9+
10+
# Quickbook
11+
# -----------------------------------------------------------------------------
12+
13+
import os ;
14+
15+
using quickbook ;
16+
using doxygen ;
17+
using boostbook ;
18+
19+
local BOOST_ROOT = [ os.environ BOOST_ROOT ] ;
20+
21+
doxygen autodoc
22+
:
23+
[ glob ../../../boost/algorithm/*.hpp ../../../boost/algorithm/searching/*.hpp ]
24+
:
25+
<doxygen:param>"PREDEFINED=\"BOOST_ALGORITHM_DOXYGEN=1\""
26+
<doxygen:param>WARNINGS=YES # Default NO, but useful to see warnings, especially in a logfile.
27+
;
28+
29+
30+
xml algorithm : algorithm.qbk ;
31+
32+
# path-constant boost-images : $(BOOST_ROOT)/doc/src/images ;
33+
34+
boostbook standalone
35+
:
36+
algorithm
37+
:
38+
<dependency>autodoc
39+
<xsl:param>boost.root=$(BOOST_ROOT)
40+
<xsl:param>"boost.doxygen.reftitle=Boost.Algorithms C++ Reference"
41+
<xsl:param>chapter.autolabel=0
42+
<xsl:param>chunk.section.depth=8
43+
<xsl:param>toc.section.depth=2
44+
<xsl:param>toc.max.depth=2
45+
<xsl:param>generate.section.toc.level=1
46+
;

doc/algorithm.qbk

+68
@@ -0,0 +1,68 @@
1+
[library The Boost Algorithm Library
2+
[quickbook 1.5]
3+
[id algorithm]
4+
[dirname algorithm]
5+
[purpose Library of useful algorithms]
6+
[category algorithms]
7+
[authors [Clow, Marshall]]
8+
[copyright 2010-2012 Marshall Clow]
9+
[source-mode c++]
10+
[license
11+
Distributed under the Boost Software License, Version 1.0.
12+
(See accompanying file LICENSE_1_0.txt or copy at
13+
[@http://www.boost.org/LICENSE_1_0.txt])
14+
]
15+
]
16+
17+
[section Description and Rationale]
18+
19+
Boost.Algorithm is a collection of general purpose algorithms. While Boost contains many libraries of data structures, there is no single library for general purpose algorithms. Even though the algorithms are generally useful, many tend to be thought of as "too small" for Boost.
20+
21+
An implementation of Boyer-Moore searching, for example, might take a developer a week or so to implement, including test cases and documentation. However, scheduling a review to include that code into Boost might take several months, and run into resistance because "it is too small". Nevertheless, a library of tested, reviewed, documented algorithms can make the developer's life much easier, and that is the purpose of this library.
22+
23+
[heading Future plans]
24+
25+
I will be soliciting submissions from other developers, as well as looking through the literature for existing algorithms to include. The Adobe Source Library, for example, contains many useful algorithms that already have documentation and test cases. Knuth's _The Art of Computer Programming_ is chock-full of algorithm descriptions, too.
26+
27+
My goal is to run regular algorithm reviews, similar to the Boost library review process, but with smaller chunks of code.
28+
29+
[heading Dependencies]
30+
31+
Boost.Algorithm uses Boost.Range, Boost.Assert, Boost.Array, Boost.TypeTraits, and Boost.StaticAssert.
32+
33+
34+
[heading Acknowledgements]
35+
36+
Thanks to all the people who have reviewed this library and made suggestions for improvements. Steven Watanabe and Sean Parent, in particular, have provided a great deal of help.
37+
38+
[endsect]
39+
40+
[/ include toc.qbk]
41+
42+
43+
[section:Searching Searching Algorithms]
44+
[include boyer_moore.qbk]
45+
[include boyer_moore_horspool.qbk]
46+
[include knuth_morris_pratt.qbk]
47+
[endsect]
48+
49+
[section:CXX11 C++11 Algorithms]
50+
[include all_of.qbk]
51+
[include any_of.qbk]
52+
[include none_of.qbk]
53+
[include one_of.qbk]
54+
[include ordered-hpp.qbk]
55+
[include is_partitioned.qbk]
56+
[include partition_point.qbk]
57+
[endsect]
58+
59+
[section:Misc Other Algorithms]
60+
[include clamp-hpp.qbk]
61+
[include hex.qbk]
62+
[endsect]
63+
64+
65+
66+
[xinclude autodoc.xml]
67+
68+

doc/all_of.qbk

+89
@@ -0,0 +1,89 @@
1+
[/ File all_of.qbk]
2+
3+
[section:all_of all_of]
4+
5+
[/license
6+
Copyright (c) 2010-2012 Marshall Clow
7+
8+
Distributed under the Boost Software License, Version 1.0.
9+
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
10+
]
11+
12+
The header file 'boost/algorithm/cxx11/all_of.hpp' contains four variants of a single algorithm, `all_of`. The algorithm tests all the elements of a sequence and returns true if they all share a property.
13+
14+
The routine `all_of` takes a sequence and a predicate. It will return true if the predicate returns true when applied to every element in the sequence.
15+
16+
The routine `all_of_equal` takes a sequence and a value. It will return true if every element in the sequence compares equal to the passed in value.
17+
18+
Both routines come in two forms; the first one takes two iterators to define the range. The second form takes a single range parameter, and uses Boost.Range to traverse it.
19+
20+
21+
[heading Interface]
22+
23+
The function `all_of` returns true if the predicate returns true for every item in the sequence. There are two versions; one takes two iterators, and the other takes a range.
24+
25+
``
26+
namespace boost { namespace algorithm {
27+
template<typename InputIterator, typename Predicate>
28+
bool all_of ( InputIterator first, InputIterator last, Predicate p );
29+
template<typename Range, typename Predicate>
30+
bool all_of ( const Range &r, Predicate p );
31+
}}
32+
``
33+
34+
The function `all_of_equal` is similar to `all_of`, but instead of taking a predicate to test the elements of the sequence, it takes a value to compare against.
35+
36+
``
37+
namespace boost { namespace algorithm {
38+
template<typename InputIterator, typename V>
39+
bool all_of_equal ( InputIterator first, InputIterator last, V const &val );
40+
template<typename Range, typename V>
41+
bool all_of_equal ( const Range &r, V const &val );
42+
}}
43+
``
44+
45+
[heading Examples]
46+
47+
Given the container `c` containing `{ 0, 1, 2, 3, 14, 15 }`, then
48+
``
49+
bool isOdd ( int i ) { return i % 2 == 1; }
50+
bool lessThan10 ( int i ) { return i < 10; }
51+
52+
using namespace boost::algorithm;
53+
all_of ( c, isOdd ) --> false
54+
all_of ( c.begin (), c.end (), lessThan10 ) --> false
55+
all_of ( c.begin (), c.begin () + 3, lessThan10 ) --> true
56+
all_of ( c.end (), c.end (), isOdd ) --> true // empty range
57+
all_of_equal ( c, 3 ) --> false
58+
all_of_equal ( c.begin () + 3, c.begin () + 4, 3 ) --> true
59+
all_of_equal ( c.begin (), c.begin (), 99 ) --> true // empty range
60+
``
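The results listed above can be checked with a small compilable sketch. It assumes a C++11 standard library and calls `std::all_of` directly (the Notes section says the standard implementation is used under C++11); `all_of_equal_sketch` and `all_of_examples` are hypothetical names introduced here for illustration, not part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

bool isOdd ( int i ) { return i % 2 == 1; }
bool lessThan10 ( int i ) { return i < 10; }

// Hypothetical stand-in for all_of_equal: true when *first == val
// holds for every element in [first, last).
template<typename InputIterator, typename V>
bool all_of_equal_sketch ( InputIterator first, InputIterator last, V const &val ) {
    for ( ; first != last; ++first )
        if ( !( *first == val ))
            return false;
    return true;
}

// Builds the container from the text and checks each listed result.
void all_of_examples () {
    const int init [] = { 0, 1, 2, 3, 14, 15 };
    std::vector<int> c ( init, init + 6 );

    assert ( !std::all_of ( c.begin (), c.end (), isOdd ));
    assert ( !std::all_of ( c.begin (), c.end (), lessThan10 ));
    assert (  std::all_of ( c.begin (), c.begin () + 3, lessThan10 ));
    assert (  std::all_of ( c.end (), c.end (), isOdd ));            // empty range
    assert ( !all_of_equal_sketch ( c.begin (), c.end (), 3 ));
    assert (  all_of_equal_sketch ( c.begin () + 3, c.begin () + 4, 3 ));
    assert (  all_of_equal_sketch ( c.begin (), c.begin (), 99 ));   // empty range
}
```

Calling `all_of_examples ()` from a test program runs every check; a failed assertion would indicate a mismatch with the table above.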
61+
62+
[heading Iterator Requirements]
63+
64+
`all_of` and `all_of_equal` work on all iterators except output iterators.
65+
66+
[heading Complexity]
67+
68+
All of the variants of `all_of` and `all_of_equal` run in ['O(N)] (linear) time; that is, they compare against each element in the sequence at most once. If any of the comparisons fails, the algorithm will terminate immediately, without examining the remaining members of the sequence.
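The early exit can be made visible by counting predicate calls. The following is a minimal sketch, not the library's code: `all_of_sketch`, `CountingLessThan10`, and `calls_made` are hypothetical names, and the loop is hand-written so that the stopping behaviour is explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hand-written loop with the same contract as all_of:
// stops at the first element for which the predicate is false.
template<typename InputIterator, typename Predicate>
bool all_of_sketch ( InputIterator first, InputIterator last, Predicate p ) {
    for ( ; first != last; ++first )
        if ( !p ( *first ))
            return false;
    return true;
}

// Predicate that records how many times it has been called.
struct CountingLessThan10 {
    std::size_t *calls;
    explicit CountingLessThan10 ( std::size_t *c ) : calls ( c ) {}
    bool operator () ( int i ) const { ++*calls; return i < 10; }
};

// Returns the number of predicate calls needed for the given values.
std::size_t calls_made ( const std::vector<int> &v ) {
    std::size_t calls = 0;
    all_of_sketch ( v.begin (), v.end (), CountingLessThan10 ( &calls ));
    return calls;
}

void short_circuit_demo () {
    const int init [] = { 0, 1, 2, 3, 14, 15 };
    std::vector<int> c ( init, init + 6 );
    assert ( calls_made ( c ) == 5 );                    // stops at 14; 15 is never tested
    assert ( calls_made ( std::vector<int> ()) == 0 );   // empty range: no calls
}
```

For the six-element container the predicate runs five times: it fails on `14`, so `15` is never examined.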
69+
70+
[heading Exception Safety]
71+
72+
All of the variants of `all_of` and `all_of_equal` take their parameters by value or const reference, and do not depend upon any global state. Therefore, all the routines in this file provide the strong exception guarantee.
73+
74+
[heading Notes]
75+
76+
* The routine `all_of` is part of the C++11 standard. When compiled using a C++11 implementation, the implementation from the standard library will be used.
77+
78+
* `all_of` and `all_of_equal` both return true for empty ranges, no matter what is passed to test against. When there are no items in the sequence to test, they all satisfy the condition to be tested against.
79+
80+
* The second parameter to `all_of_equal` is a template parameter, rather than deduced from the first parameter (`std::iterator_traits<InputIterator>::value_type`), because that allows more flexibility for callers, and takes advantage of built-in comparisons for the type that is pointed to by the iterator. The function is defined to return true if, for all elements in the sequence, the expression `*iter == val` evaluates to true (where `iter` is an iterator to each element in the sequence).
81+
82+
[endsect]
83+
84+
[/ File all_of.qbk
85+
Copyright 2011 Marshall Clow
86+
Distributed under the Boost Software License, Version 1.0.
87+
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt).
88+
]
89+

doc/any_of.qbk

+89
@@ -0,0 +1,89 @@
1+
[/ File any_of.qbk]
2+
3+
[section:any_of any_of]
4+
5+
[/license
6+
Copyright (c) 2010-2012 Marshall Clow
7+
8+
Distributed under the Boost Software License, Version 1.0.
9+
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
10+
]
11+
12+
The header file 'boost/algorithm/cxx11/any_of.hpp' contains four variants of a single algorithm, `any_of`. The algorithm tests the elements of a sequence and returns true if any of the elements has a particular property.
13+
14+
The routine `any_of` takes a sequence and a predicate. It will return true if the predicate returns true for any element in the sequence.
15+
16+
The routine `any_of_equal` takes a sequence and a value. It will return true if any element in the sequence compares equal to the passed in value.
17+
18+
Both routines come in two forms; the first one takes two iterators to define the range. The second form takes a single range parameter, and uses Boost.Range to traverse it.
19+
20+
21+
[heading Interface]
22+
23+
The function `any_of` returns true if the predicate returns true for any item in the sequence. There are two versions; one takes two iterators, and the other takes a range.
24+
25+
``
26+
namespace boost { namespace algorithm {
27+
template<typename InputIterator, typename Predicate>
28+
bool any_of ( InputIterator first, InputIterator last, Predicate p );
29+
template<typename Range, typename Predicate>
30+
bool any_of ( const Range &r, Predicate p );
31+
}}
32+
``
33+
34+
The function `any_of_equal` is similar to `any_of`, but instead of taking a predicate to test the elements of the sequence, it takes a value to compare against.
35+
36+
``
37+
namespace boost { namespace algorithm {
38+
template<typename InputIterator, typename V>
39+
bool any_of_equal ( InputIterator first, InputIterator last, V const &val );
40+
template<typename Range, typename V>
41+
bool any_of_equal ( const Range &r, V const &val );
42+
}}
43+
``
44+
45+
[heading Examples]
46+
47+
Given the container `c` containing `{ 0, 1, 2, 3, 14, 15 }`, then
48+
``
49+
bool isOdd ( int i ) { return i % 2 == 1; }
50+
bool lessThan10 ( int i ) { return i < 10; }
51+
52+
using namespace boost::algorithm;
53+
any_of ( c, isOdd ) --> true
54+
any_of ( c.begin (), c.end (), lessThan10 ) --> true
55+
any_of ( c.begin () + 4, c.end (), lessThan10 ) --> false
56+
any_of ( c.end (), c.end (), isOdd ) --> false // empty range
57+
any_of_equal ( c, 3 ) --> true
58+
any_of_equal ( c.begin (), c.begin () + 3, 3 ) --> false
59+
any_of_equal ( c.begin (), c.begin (), 99 ) --> false // empty range
60+
``
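As a cross-check of the results above, here is a compilable sketch that assumes a C++11 standard library. It uses `std::any_of` for the predicate form and expresses the value form through `std::find`, which also tests `*iter == val`; `any_of_equal_sketch` and `any_of_examples` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

bool isOdd ( int i ) { return i % 2 == 1; }
bool lessThan10 ( int i ) { return i < 10; }

// Hypothetical stand-in for any_of_equal: true when some element
// in [first, last) compares equal to val.
template<typename InputIterator, typename V>
bool any_of_equal_sketch ( InputIterator first, InputIterator last, V const &val ) {
    return std::find ( first, last, val ) != last;
}

// Builds the container from the text and checks each listed result.
void any_of_examples () {
    const int init [] = { 0, 1, 2, 3, 14, 15 };
    std::vector<int> c ( init, init + 6 );

    assert (  std::any_of ( c.begin (), c.end (), isOdd ));
    assert (  std::any_of ( c.begin (), c.end (), lessThan10 ));
    assert ( !std::any_of ( c.begin () + 4, c.end (), lessThan10 ));
    assert ( !std::any_of ( c.end (), c.end (), isOdd ));            // empty range
    assert (  any_of_equal_sketch ( c.begin (), c.end (), 3 ));
    assert ( !any_of_equal_sketch ( c.begin (), c.begin () + 3, 3 ));
    assert ( !any_of_equal_sketch ( c.begin (), c.begin (), 99 ));   // empty range
}
```

Note the contrast with `all_of`: on an empty range the "any" form is false, because no element exists that could satisfy the test.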
61+
62+
[heading Iterator Requirements]
63+
64+
`any_of` and `any_of_equal` work on all iterators except output iterators.
65+
66+
[heading Complexity]
67+
68+
All of the variants of `any_of` and `any_of_equal` run in ['O(N)] (linear) time; that is, they compare against each element in the sequence at most once. If any of the comparisons succeeds, the algorithm will terminate immediately, without examining the remaining members of the sequence.
69+
70+
[heading Exception Safety]
71+
72+
All of the variants of `any_of` and `any_of_equal` take their parameters by value or const reference, and do not depend upon any global state. Therefore, all the routines in this file provide the strong exception guarantee.
73+
74+
[heading Notes]
75+
76+
* The routine `any_of` is part of the C++11 standard. When compiled using a C++11 implementation, the implementation from the standard library will be used.
77+
78+
* `any_of` and `any_of_equal` both return false for empty ranges, no matter what is passed to test against.
79+
80+
* The second parameter to `any_of_equal` is a template parameter, rather than deduced from the first parameter (`std::iterator_traits<InputIterator>::value_type`), because that allows more flexibility for callers, and takes advantage of built-in comparisons for the type that is pointed to by the iterator. The function is defined to return true if, for any element in the sequence, the expression `*iter == val` evaluates to true (where `iter` is an iterator to each element in the sequence).
81+
82+
[endsect]
83+
84+
[/ File any_of.qbk
85+
Copyright 2011 Marshall Clow
86+
Distributed under the Boost Software License, Version 1.0.
87+
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt).
88+
]
89+

doc/boyer_moore.qbk

+93
@@ -0,0 +1,93 @@
1+
[/ QuickBook Document version 1.5 ]
2+
3+
[section:BoyerMoore Boyer-Moore Search]
4+
5+
[/license
6+
7+
Copyright (c) 2010-2012 Marshall Clow
8+
9+
Distributed under the Boost Software License, Version 1.0.
10+
(See accompanying file LICENSE_1_0.txt or copy at
11+
http://www.boost.org/LICENSE_1_0.txt)
12+
]
13+
14+
15+
[heading Overview]
16+
17+
The header file 'boyer_moore.hpp' contains an implementation of the Boyer-Moore algorithm for searching sequences of values.
18+
19+
The Boyer-Moore string search algorithm is a particularly efficient string searching algorithm, and it has been the standard benchmark for the practical string search literature. The Boyer-Moore algorithm was invented by Bob Boyer and J. Strother Moore, and published in the October 1977 issue of the Communications of the ACM; a copy of that article is available at [@http://www.cs.utexas.edu/~moore/publications/fstrpos.pdf].
20+
21+
The Boyer-Moore algorithm uses two precomputed tables to give better performance than a naive search. These tables depend on the pattern being searched for, and give the Boyer-Moore algorithm a larger memory footprint and higher startup costs than a simpler algorithm, but these costs are recovered quickly during the searching process, especially if the pattern is longer than a few elements.
22+
23+
However, unlike `std::search`, the Boyer-Moore algorithm cannot be used with a comparison predicate.
24+
25+
Nomenclature: I refer to the sequence being searched for as the "pattern", and the sequence being searched in as the "corpus".
26+
27+
For flexibility, the Boyer-Moore algorithm has two interfaces: an object-based interface and a procedural one. The object-based interface builds the tables in the constructor, and uses operator () to perform the search. The procedural interface builds the tables and does the search all in one step. If you are going to be searching for the same pattern in multiple corpora, then you should use the object interface, and only build the tables once.
28+
29+
Here is the object interface:
30+
``
31+
template <typename patIter>
32+
class boyer_moore {
33+
public:
34+
boyer_moore ( patIter first, patIter last );
35+
~boyer_moore ();
36+
37+
template <typename corpusIter>
38+
corpusIter operator () ( corpusIter corpus_first, corpusIter corpus_last );
39+
};
40+
``
41+
42+
and here is the corresponding procedural interface:
43+
44+
``
45+
template <typename patIter, typename corpusIter>
46+
corpusIter boyer_moore_search (
47+
corpusIter corpus_first, corpusIter corpus_last,
48+
patIter pat_first, patIter pat_last );
49+
``
50+
51+
Each of the functions is passed two pairs of iterators. The first two define the corpus and the second two define the pattern. Note that the two pairs need not be of the same type, but they do need to "point" at the same type. In other words, `patIter::value_type` and `corpusIter::value_type` need to be the same type.
52+
53+
The return value of the function is an iterator pointing to the start of the pattern in the corpus. If the pattern is not found, it returns the end of the corpus (`corpus_last`).
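To make the shape of the two interfaces concrete, here is a deliberately simplified sketch. It is NOT the library's implementation: it uses the Horspool simplification (one 256-entry skip table, no second table), it is restricted to `char` data, and the names `horspool_sketch` and `horspool_search_sketch` are hypothetical. It only mirrors the structure described above: tables built once in the constructor, search performed by operator (), and `corpus_last` returned when nothing is found.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified Horspool-style searcher for 8-bit data (illustration only).
class horspool_sketch {
public:
    // Build the 256-entry skip table once, from the pattern.
    horspool_sketch ( std::string::const_iterator first, std::string::const_iterator last )
        : pat_ ( first, last ),
          skip_ ( 256, static_cast<std::ptrdiff_t> ( pat_.size ())) {
        const std::ptrdiff_t m = static_cast<std::ptrdiff_t> ( pat_.size ());
        for ( std::ptrdiff_t i = 0; i + 1 < m; ++i )
            skip_ [ static_cast<unsigned char> ( pat_ [i] ) ] = m - 1 - i;
    }

    // Returns an iterator to the first match, or corpus_last if there is none.
    // Requires random-access iterators over char.
    template <typename corpusIter>
    corpusIter operator () ( corpusIter corpus_first, corpusIter corpus_last ) const {
        const std::ptrdiff_t m = static_cast<std::ptrdiff_t> ( pat_.size ());
        const std::ptrdiff_t n = corpus_last - corpus_first;
        if ( m == 0 ) return corpus_first;      // empty pattern matches at the start
        std::ptrdiff_t pos = 0;
        while ( pos + m <= n ) {
            std::ptrdiff_t j = m;               // compare right to left
            while ( j > 0 && pat_ [j - 1] == corpus_first [pos + j - 1] )
                --j;
            if ( j == 0 ) return corpus_first + pos;
            // Shift by the skip value of the last character in the window.
            pos += skip_ [ static_cast<unsigned char> ( corpus_first [pos + m - 1] ) ];
        }
        return corpus_last;
    }

private:
    std::string pat_;                    // declared first: skip_ is sized from it
    std::vector<std::ptrdiff_t> skip_;
};

// Procedural form: build the table and search in one step.
template <typename corpusIter>
corpusIter horspool_search_sketch ( corpusIter corpus_first, corpusIter corpus_last,
        std::string::const_iterator pat_first, std::string::const_iterator pat_last ) {
    horspool_sketch searcher ( pat_first, pat_last );
    return searcher ( corpus_first, corpus_last );
}

void search_demo () {
    const std::string pat ( "example" );
    const std::string a ( "here is a simple example" );
    const std::string b ( "nothing to see here" );
    horspool_sketch searcher ( pat.begin (), pat.end ());   // table built once
    assert ( searcher ( a.begin (), a.end ()) - a.begin () == 17 );
    assert ( searcher ( b.begin (), b.end ()) == b.end ()); // not found
}
```

In `search_demo` the same searcher object is reused for two corpora, which is the case where the object interface avoids rebuilding the table.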
54+
55+
[heading Performance]
56+
57+
The execution time of the Boyer-Moore algorithm, while still linear in the size of the string being searched, can have a significantly lower constant factor than many other search algorithms: it doesn't need to check every character of the string to be searched, but rather skips over some of them. Generally the algorithm gets faster as the pattern being searched for becomes longer. Its efficiency derives from the fact that with each unsuccessful attempt to find a match between the search string and the text it is searching, it uses the information gained from that attempt to rule out as many positions of the text as possible where the string cannot match.
58+
59+
[heading Memory Use]
60+
61+
The algorithm allocates two internal tables. The first one is proportional to the length of the pattern; the second one has one entry for each member of the "alphabet" in the pattern. For (8-bit) character types, this table contains 256 entries.
62+
63+
[heading Complexity]
64+
65+
The worst-case performance to find a pattern in the corpus is ['O(N)] (linear) time; that is, proportional to the length of the corpus being searched. In general, the search is sub-linear; not every entry in the corpus need be checked.
66+
67+
[heading Exception Safety]
68+
69+
Both the object-oriented and procedural versions of the Boyer-Moore algorithm take their parameters by value and do not use any information other than what is passed in. Therefore, both interfaces provide the strong exception guarantee.
70+
71+
[heading Notes]
72+
73+
* When using the object-based interface, the pattern must remain unchanged during the searches; i.e., from the time the object is constructed until the final call to operator () returns.
74+
75+
* The Boyer-Moore algorithm requires random-access iterators for both the pattern and the corpus.
76+
77+
[heading Customization points]
78+
79+
The Boyer-Moore object takes a traits template parameter which enables the caller to customize how one of the precomputed tables is stored. This table, called the skip table, contains (logically) one entry for every possible value that the pattern can contain. When searching 8-bit character data, this table contains 256 elements. The traits class defines the table to be used.
80+
81+
The default traits class uses a `boost::array` for small "alphabets" and a `tr1::unordered_map` for larger ones. The array-based skip table gives excellent performance, but could be prohibitively large when the "alphabet" of elements to be searched grows. The `unordered_map`-based version only grows with the number of unique elements in the pattern, but makes many more heap allocations, and gives slower lookup performance.
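The trade-off can be sketched with a map-backed table that stores entries only for values that occur in the pattern and falls back to a default for everything else. This is an assumption-laden illustration, not the library's skip table interface (which is not specified here): `map_skip_table_sketch` and `build_skip_table_sketch` are hypothetical names, `std::unordered_map` (C++11) stands in for `tr1::unordered_map`, and the skip values follow the simplified Horspool rule.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <unordered_map>
#include <vector>

// Hypothetical map-backed skip table: unknown keys get the default skip.
template <typename Key>
class map_skip_table_sketch {
public:
    explicit map_skip_table_sketch ( std::ptrdiff_t default_skip )
        : default_ ( default_skip ) {}

    void insert ( const Key &key, std::ptrdiff_t skip ) { table_ [ key ] = skip; }

    std::ptrdiff_t operator [] ( const Key &key ) const {
        typename std::unordered_map<Key, std::ptrdiff_t>::const_iterator it = table_.find ( key );
        return it == table_.end () ? default_ : it->second;
    }

    std::size_t stored_entries () const { return table_.size (); }

private:
    std::unordered_map<Key, std::ptrdiff_t> table_;
    std::ptrdiff_t default_;
};

// Fill the table from a pattern: every element except the last gets
// the distance from its last occurrence to the end of the pattern.
template <typename Iter>
map_skip_table_sketch<typename std::iterator_traits<Iter>::value_type>
build_skip_table_sketch ( Iter first, Iter last ) {
    typedef typename std::iterator_traits<Iter>::value_type value_type;
    const std::ptrdiff_t m = std::distance ( first, last );
    map_skip_table_sketch<value_type> table ( m );
    for ( std::ptrdiff_t i = 0; i + 1 < m; ++i, ++first )
        table.insert ( *first, m - 1 - i );
    return table;
}

void skip_table_demo () {
    // A pattern over a large "alphabet" (int), where a full array is impractical.
    const int init [] = { 100000, 7, 100000, 42 };
    std::vector<int> pat ( init, init + 4 );
    map_skip_table_sketch<int> t = build_skip_table_sketch ( pat.begin (), pat.end ());
    assert ( t.stored_entries () == 2 );   // only 100000 and 7 are stored
    assert ( t [ 100000 ] == 1 );
    assert ( t [ 7 ] == 2 );
    assert ( t [ 42 ] == 4 );              // not stored: default skip (pattern length)
}
```

An array-backed table for `int` would need one slot per possible value; the map version above holds two entries for this four-element pattern, at the cost of a hash lookup on every access.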
82+
83+
To use a different skip table, you should define your own skip table object and your own traits class, and use them to instantiate the Boyer-Moore object. The interface to these objects is described TBD.
84+
85+
86+
[endsect]
87+
88+
[/ File boyer_moore.qbk
89+
Copyright 2011 Marshall Clow
90+
Distributed under the Boost Software License, Version 1.0.
91+
(See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt).
92+
]
93+
